### Azure Blob Storage object names

The Azure Blob Storage data model is a flat structure: each bucket stores objects, and the name of each Azure Blob Storage object serves as the unique key. However, a logical hierarchy can be inferred when the Azure Blob Storage object names use directory delimiters, such as `/`. The Azure Blob Storage connector allows you to customize the names of the Azure Blob Storage objects it uploads to the Azure Blob Storage bucket. In general, the names of the Azure Blob Storage objects uploaded by the Azure Blob Storage connector follow this format:

```bash
<prefix>/<topic>/<encodedPartition>/<topic>+<kafkaPartition>+<startOffset>.<format>
```
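
For example, assuming the connector's default `topics.dir` prefix (`topics`) and the default partitioner, a hypothetical topic named `orders` written in Avro format would typically produce object names like the following (all values shown are illustrative):

```bash
# <prefix>/<topic>/<encodedPartition>/<topic>+<kafkaPartition>+<startOffset>.<format>
topics/orders/partition=0/orders+0+0000000000.avro
```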

### Glossary

Admin API : The Admin API is the Kafka REST API that enables administrators to manage and monitor Kafka clusters, topics, brokers, and other Kafka components.

Ansible Playbooks for Confluent Platform : Ansible Playbooks for Confluent Platform is a set of Ansible playbooks and roles that are designed to automate the deployment and management of Confluent Platform.

Apache Flink : Apache Flink is an open source stream processing framework for stateful computations over unbounded and bounded data streams. Flink provides a unified API for batch and stream processing that supports event-time and out-of-order processing, and supports exactly-once semantics. Flink applications include real-time analytics, data pipelines, and event-driven applications. **Related terms**: *bounded stream*, *data stream*, *stream processing*, *unbounded stream* **Related content** - [Apache Flink: Stream Processing and SQL on Confluent Cloud](/cloud/current/flink/index.html) - [What is Apache Flink?](https://www.confluent.io/learn/apache-flink/) - [Apache Flink 101 (Confluent Developer course)](https://developer.confluent.io/courses/apache-flink/intro/)

Apache Kafka : Apache Kafka is an open source event streaming platform that provides a unified, high-throughput, low-latency, fault-tolerant, scalable, distributed, and secure data streaming platform. Kafka is a publish-and-subscribe messaging system that enables distributed applications to ingest, process, and share data in real-time. **Related content** - [Introduction to Kafka](/kafka/introduction.html)

audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occur. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance, by tracking administrative activity, data access and modification, monitoring sign-in attempts, and reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html)

auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance. When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record. **Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html)

authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role*

authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role*

Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), compact binary data format, a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers. You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. (A registration example follows this group of terms.) **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html)

Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases.
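
To make the *Avro* entry above concrete, the following sketch registers a small Avro schema with Schema Registry over its REST API. It assumes a Schema Registry instance listening locally on port 8081; the subject name `orders-value` and the schema fields are hypothetical:

```bash
# Register an Avro schema under the subject "orders-value".
curl -s -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\":\"record\",\"name\":\"Order\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"},{\"name\":\"amount\",\"type\":\"double\"}]}"}' \
  http://localhost:8081/subjects/orders-value/versions
```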

batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream*

CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html)

Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html)

commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message*

Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. **Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/)

Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks)

Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through a declarative API.

Confluent Platform : Confluent Platform is a specialized distribution of Kafka at its core, with additional components for data integration, streaming data pipelines, and stream processing.
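
The *Admin API* entry above refers to the REST interface for administering Kafka; the next entry, *Confluent REST Proxy*, describes the component that serves it. A minimal sketch, assuming a REST Proxy (or Confluent Server REST endpoint) listening locally on port 8082; the cluster ID placeholder must be replaced with the value returned by the first call:

```bash
# List the Kafka clusters known to the REST endpoint; the response contains the cluster ID.
curl -s http://localhost:8082/v3/clusters

# List the topics in that cluster.
curl -s http://localhost:8082/v3/clusters/<cluster-id>/topics
```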

Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html)

Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)*

Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provide preallocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster)

Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system.

Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. A worker is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect*

connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused.

connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data.

connector offset : Connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay ids, lists of files, timestamps and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as it processes data.
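
A minimal sketch tying together the *Connect API*, *Connect worker*, and *connector* entries above: it registers a FileStream source connector through the Connect REST API and then checks its status. It assumes a Connect worker listening locally on port 8083; the connector name, file path, and topic are hypothetical:

```bash
# Register a source connector that tails a local file into a Kafka topic.
curl -s -X POST -H "Content-Type: application/json" http://localhost:8083/connectors \
  -d '{
        "name": "local-file-source",
        "config": {
          "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
          "tasks.max": "1",
          "file": "/tmp/input.txt",
          "topic": "file-events"
        }
      }'

# Check the connector and its tasks on the worker.
curl -s http://localhost:8083/connectors/local-file-source/status
```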

consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API*

Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API.

consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. Because the partitions of a topic are divided among the consumers in the group, the consumers can process messages in parallel, increasing message throughput and enabling load balancing. **Related terms**: *consumer*, *partition*, *producer*, *topic*

consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. (A command-line example follows this group of terms.)

consumer offset : Consumer offset is the monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, which does not track which records have been read and which have not. It is up to the consumer connection to track this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__consumer_offsets`.

cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization. **Related terms**: *identity pool*, *principal*, *role*, *role binding*

CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete.

custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins.

data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system.
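
A minimal sketch of the *consumer group*, *consumer lag*, and *consumer offset* entries above, assuming a broker reachable locally on port 9092 and a hypothetical topic `orders` and group `orders-app`:

```bash
# Consume as part of a consumer group; committed offsets are tracked under the group ID.
kafka-console-consumer --bootstrap-server localhost:9092 \
  --topic orders --group orders-app --from-beginning

# Show the committed offset, log-end offset, and lag for each partition assigned to the group.
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group orders-app
```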

data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). **Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html)

data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *envelope encryption*, *key encryption key (KEK)*

data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight.

data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers).

data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage.

data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined.

data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically composed of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation.

Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. Data Portal provides a data-centric view of Confluent optimized for self-service access to data, where users can search, discover, and understand available data, request access to data, and use data. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html)

data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication.
Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html)

data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security.

data stream : A data stream is a continuous flow of data records that are produced and consumed by applications.

dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed. Instead of stopping, the sink connector sends messages that could not be written successfully as event records to the DLQ topic while the sink connector continues processing messages.

Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements.

deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html)

egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster.

Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these cluster types. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it.

ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation.

Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities.

envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data.
The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)*

ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system.

event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time*

event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time*

event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream). **Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time*

event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time*

event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time*

event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers.
**Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time*

event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time*

event streaming platform : An event streaming platform is a platform to which events can be written once, allowing distributed functions within an organization to react in real time. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time*

event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*

exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent, even if a producer retries sending a message or a consumer retries processing a message. In Kafka, this guarantee is achieved by combining idempotent producers, where the broker uses a producer ID and per-partition sequence numbers to discard duplicate writes, with transactions, which commit a consumer's offsets atomically with the messages it produces so that reprocessing after a failure does not create duplicate results.

Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed latency workloads that are less expensive than self-managed open source Kafka.

granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access.

group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html)

identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource.
**Related terms**: *identity provider*, *identity pool*, *principal*, *role*

identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud.

identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials.

Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted. If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources, you only pay for what you produce to Confluent Cloud and for storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/)

ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster.

internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`.

JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents. It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [JSON Schema - a declarative language that allows you to annotate and validate JSON documents.](https://json-schema.org/) - Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-json.html) - Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html)

Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client connects to initially to reach a Kafka cluster; the broker returns metadata, which includes the addresses for all of the brokers in the Kafka cluster.
Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint.

Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker.

Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages. The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/)

Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance.

Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the business of writing code away from the user and instead requires only JSON configuration to run. **Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html)

Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html).
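
A small sketch of the *Kafka bootstrap server* and *Kafka broker* entries above, assuming a broker reachable locally on port 9092:

```bash
# Connect through a single bootstrap server; the metadata response lists every broker
# in the cluster along with the API versions each one supports.
kafka-broker-api-versions --bootstrap-server localhost:9092

# The same bootstrap connection is enough to inspect partition leaders for all topics.
kafka-topics --bootstrap-server localhost:9092 --describe
```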

Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to and use to communicate with clients. For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/)

Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka.

Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real time and write the results to Kafka topics stored in a Kafka cluster. The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Kafka Streams](/platform/current/streams/overview.html)

Kafka topic : See *topic*.

key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *data encryption key (DEK)*, *envelope encryption*.

Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications. Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources. **Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf)

KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol introduced in early access in Kafka 2.8 to provide metadata management for Kafka with the goal of replacing ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html).

ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka.

logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html)

multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones.

multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants. Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** * [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) * [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html)

offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time. **Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability*

offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume. For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which continues consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset*

OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL).
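
The *offset* and *offset commit* entries above note that committed offsets let an application resume, or replay, from any point. A minimal sketch, assuming a local broker and a hypothetical (currently inactive) group `orders-app` reading topic `orders`:

```bash
# Rewind the group's committed offsets to the earliest available offset of each partition,
# so the next run of the group replays the topic from the beginning.
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --group orders-app --topic orders \
  --reset-offsets --to-earliest --execute
```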

parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster*

partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, each of which is stored on separate Kafka brokers and can be consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log.

partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that defines the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKU. To determine eCKU limits for partitions, Confluent Cloud counts only pre-replication (leader) partitions across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit. Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low.

physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients.

principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account*

private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources.

processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations.

producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer.

Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster.
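
A minimal sketch of the *partition*, *producer*, and *Producer API* entries above (the replication factor term is defined in the next group), assuming a local broker and a hypothetical topic `orders`:

```bash
# Create a topic with 6 partitions, each replicated to 3 brokers.
kafka-topics --bootstrap-server localhost:9092 --create \
  --topic orders --partitions 6 --replication-factor 3

# Produce records interactively; each line typed on stdin becomes one event message.
kafka-console-producer --bootstrap-server localhost:9092 --topic orders
```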

Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html)

public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other.

rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur if a consumer has failed to send a heartbeat and has been excluded from the group, it voluntarily left the group, metadata has been updated for a consumer, or a consumer has joined the group.

replayability : Replayability is the ability to replay messages from any point in time. **Related terms**: *consumer offset*, *offset*, *offset commit*

replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility.

replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster.

requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster.

role : A role is a Confluent-defined job function that is assigned a set of permissions required to perform specific actions or operations on Confluent resources. A role binding binds a role to a principal and a set of Confluent resources. A role can be assigned to a user account, group mapping, service account, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html)

rolling restart : A rolling restart restarts the brokers in a Kafka cluster with zero downtime by incrementally restarting a Kafka broker after verifying that there are no under-replicated partitions on the broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart)
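
The *rolling restart* entry above hinges on verifying that no partitions are under-replicated before moving on to the next broker. A minimal sketch, assuming a local broker:

```bash
# List partitions whose in-sync replicas have fallen below the replication factor;
# an empty result means it is safe to restart the next broker.
kafka-topics --bootstrap-server localhost:9092 --describe --under-replicated-partitions
```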
Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in the Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating schemas for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. Schema Registry is a RESTful service that is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers of Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry. This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. **Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html)
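As a rough illustration of the Schema Registry, Serdes, and serializer entries above, the sketch below configures a producer with Confluent's Avro serializer and a Schema Registry endpoint. The bootstrap address, Schema Registry URL, topic, and schema are placeholders, and the example assumes the `kafka-avro-serializer` dependency is on the classpath.

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");                     // placeholder
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");            // placeholder

        // A tiny illustrative Avro schema for an "Order" record.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Order\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}");
        GenericRecord order = new GenericData.Record(schema);
        order.put("id", "order-123");

        try (KafkaProducer<String, Object> producer = new KafkaProducer<>(props)) {
            // With default settings the serializer registers the schema under the
            // subject "orders-value" and embeds the schema ID in each message.
            producer.send(new ProducerRecord<>("orders", "order-123", order));
            producer.flush();
        }
    }
}
```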
service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in real time on an individual message that changes the values, keys, or headers of the message before it is sent to a sink connector or after it is read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments. single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system. source connector : A source connector is a Kafka Connect connector that subscribes (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality. static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet. **Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster.
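The `retention.bytes` and `retention.ms` settings mentioned in the storage entry above are ordinary topic configurations; one way to change them programmatically is with the Java `AdminClient`, as in the hedged sketch below (the bootstrap address, topic name, and retention values are illustrative assumptions).

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class TopicRetentionExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "orders");
            // Keep roughly 1 GB per partition and delete data older than 7 days.
            AlterConfigOp setBytes = new AlterConfigOp(
                new ConfigEntry("retention.bytes", "1073741824"), AlterConfigOp.OpType.SET);
            AlterConfigOp setMs = new AlterConfigOp(
                new ConfigEntry("retention.ms", "604800000"), AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(setBytes, setMs))).all().get();
        }
    }
}
```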
Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. **Related terms**: *Data Portal*, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time and write the results to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from becoming over-utilized. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%.
At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records that are continually appended to a topic. **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client. For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, because cluster load is the ultimate measure of how a given workload or CKU dimension affects the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom. Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing* under replication : Under replication is a situation in which the number of in-sync replicas is below the total number of replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed.
A watermark means that all records until the current moment in time have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html) ### Azure Data Lake Storage Gen1 object names The Azure Data Lake Storage Gen1 data model is a flat structure: each bucket stores objects, and the name of each Azure Data Lake Storage Gen1 object serves as the unique key. However, a logical hierarchy can be inferred when the Azure Data Lake Storage Gen1 object names use directory delimiters, such as `/`. The Azure Data Lake Storage Gen1 Sink connector allows you to customize the names of the Azure Data Lake Storage Gen1 objects it uploads to the Azure Data Lake Storage Gen1 bucket. In general, the names of the Azure Data Lake Storage Gen1 objects uploaded by the Azure Data Lake Storage Gen1 Sink connector follow this format:

```bash
<prefix>/<topic>/<encodedPartition>/<topic>+<kafkaPartition>+<startOffset>.<format>
```

Admin API : The Admin API is the Kafka REST API that enables administrators to manage and monitor Kafka clusters, topics, brokers, and other Kafka components. Ansible Playbooks for Confluent Platform : Ansible Playbooks for Confluent Platform is a set of Ansible playbooks and roles that are designed to automate the deployment and management of Confluent Platform. Apache Flink : Apache Flink is an open source stream processing framework for stateful computations over unbounded and bounded data streams. Flink provides a unified API for batch and stream processing that supports event-time and out-of-order processing, and supports exactly-once semantics. Flink applications include real-time analytics, data pipelines, and event-driven applications. **Related terms**: *bounded stream*, *data stream*, *stream processing*, *unbounded stream* **Related content** - [Apache Flink: Stream Processing and SQL on Confluent Cloud](/cloud/current/flink/index.html) - [What is Apache Flink?](https://www.confluent.io/learn/apache-flink/) - [Apache Flink 101 (Confluent Developer course)](https://developer.confluent.io/courses/apache-flink/intro/) Apache Kafka : Apache Kafka is an open source event streaming platform that provides unified, high-throughput, low-latency, fault-tolerant, scalable, distributed, and secure data streaming. Kafka is a publish-and-subscribe messaging system that enables distributed applications to ingest, process, and share data in real-time. **Related content** - [Introduction to Kafka](/kafka/introduction.html) audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occur. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance by tracking administrative activity, data access and modification, monitoring sign-in attempts, and reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html) auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance.
When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record. **Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html) authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), a compact binary data format, and a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers. You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases. batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html)
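For Kafka clients, the authentication entry above typically translates into SASL configuration properties. The sketch below shows one plausible SASL/PLAIN client configuration and uses it to describe the cluster; the endpoint and credentials are placeholders (with Confluent Cloud, the username and password would typically be an API key and secret).

```java
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;

public class SaslClientConfigSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Endpoint and credentials are placeholders for this sketch.
        props.put("bootstrap.servers", "broker.example.com:9092");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"my-api-key\" password=\"my-api-secret\";");

        // The same properties can be passed to KafkaProducer, KafkaConsumer, or AdminClient.
        try (AdminClient admin = AdminClient.create(props)) {
            System.out.println("Connected to cluster: " + admin.describeCluster().clusterId().get());
        }
    }
}
```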
Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. **Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through a declarative API. Confluent Platform : Confluent Platform is a specialized distribution with Kafka at its core and additional components for data integration, streaming data pipelines, and stream processing. Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)*
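As a small illustration of the Confluent REST Proxy entry above, the following Java sketch produces a JSON record over HTTP using the REST Proxy v2 produce endpoint; the proxy address (the default port 8082 is assumed), topic name, and payload are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RestProxyProduceExample {
    public static void main(String[] args) throws Exception {
        // REST Proxy address and topic name are placeholders for this sketch.
        String body = "{\"records\":[{\"value\":{\"id\":1,\"status\":\"new\"}}]}";

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8082/topics/orders"))
            .header("Content-Type", "application/vnd.kafka.json.v2+json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```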
Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provides preallocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. A worker is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused. connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : Connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay ids, lists of files, timestamps and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as they process data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. By dividing a topic's partitions among the consumers in the group, consumers in the group can process messages in parallel, increasing message throughput and enabling load balancing.
**Related terms**: *consumer*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is, the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : Consumer offset is the monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, which does not track which records have been read and which have not. It is up to the consumer to track this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__consumer_offsets`. cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization. **Related terms**: *identity pool*, *principal*, *role*, *role binding* CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). **Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *envelope encryption*, *key encryption key (KEK)*
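The consumer lag and consumer offset entries above can be made concrete with the Java `AdminClient`: lag per partition is the difference between the partition's latest (end) offset and the group's committed offset. The sketch below assumes a hypothetical `orders-processor` consumer group and a local bootstrap address.

```java
import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ConsumerLagExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            // Committed offsets for a hypothetical consumer group.
            Map<TopicPartition, OffsetAndMetadata> committed =
                admin.listConsumerGroupOffsets("orders-processor")
                     .partitionsToOffsetAndMetadata().get();

            // Latest (end) offsets for the same partitions.
            Map<TopicPartition, OffsetSpec> request = committed.keySet().stream()
                .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest()));
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> latest =
                admin.listOffsets(request).all().get();

            // Lag per partition = end offset minus committed offset.
            committed.forEach((tp, offset) -> {
                long lag = latest.get(tp).offset() - offset.offset();
                System.out.printf("%s lag=%d%n", tp, lag);
            });
        }
    }
}
```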
data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight. data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers). data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage. data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined. data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically composed of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation. Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. Data Portal leverages Stream Catalog and Stream Lineage to provide a data-centric view of Confluent that is optimized for self-service access, where users can search, discover, and understand available data, request access to it, and use it. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html) data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication. Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security. data stream : A data stream is a continuous flow of data records that are produced and consumed by applications. dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed.
Instead of stopping, the sink connector sends messages that could not be written successfully to the DLQ topic as event records and continues processing messages. Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements. deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these types of clusters. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it. ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation. Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities. envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data. The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)* ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system.
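Pulling together the consumer, Consumer API, consumer group, consumer offset, and deserializer entries above, the following minimal Java consumer sketch subscribes to a hypothetical `orders` topic as part of a consumer group and commits offsets after processing; the bootstrap address, group ID, and topic are illustrative placeholders.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class MinimalConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "orders-processor");        // consumers sharing this ID form one consumer group
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("enable.auto.commit", "false");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                        record.partition(), record.offset(), record.value());
                }
                // Committing stores the consumer offsets in the __consumer_offsets internal topic.
                consumer.commitSync();
            }
        }
    }
}
```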
event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream). **Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time* event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time* event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time* event streaming platform : An event streaming platform is a platform to which events can be written once, allowing distributed functions within an organization to react in real time. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time* event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*
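Event time and windowing, as described in the event time entry above (and the stream processing entries earlier), can be sketched with the Kafka Streams DSL: the example below counts records per key in five-minute windows based on each record's timestamp. The application ID, bootstrap address, and `clicks` topic are illustrative placeholders.

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.TimeWindows;

public class ClicksPerWindow {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "clicks-per-window"); // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        StreamsBuilder builder = new StreamsBuilder();
        // Count events per key in five-minute windows based on each record's timestamp
        // (the event time when producers set CreateTime timestamps).
        builder.stream("clicks", Consumed.with(Serdes.String(), Serdes.String()))
               .groupByKey()
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
               .count()
               .toStream()
               .foreach((windowedKey, count) ->
                   System.out.printf("%s -> %d%n", windowedKey, count));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```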
exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent. Even if a producer retries sending a message, or a consumer retries processing a message, the message is processed exactly once. In Kafka, this guarantee is achieved by idempotent producers, which attach a producer ID and sequence number to each message so that the broker can discard duplicate writes, and by transactions, which commit consumer offsets atomically with the messages produced while processing them. The consumer offset is committed only after the message is processed, so if a consumer fails before committing, the message is reprocessed without producing duplicate results. Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed latency workloads that are less expensive than self-managed open source Kafka. granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access. group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html) identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud. identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials. Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted.
If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources: you only pay for what you produce to Confluent Cloud and for the storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/) ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster. internal topic : An internal topic is a topic, prefixed with double underscores (“__”), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`. JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents. It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [JSON Schema - a declarative language that allows you to annotate and validate JSON documents.](https://json-schema.org/) - Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-json.html) - Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html) Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client first connects to when initiating a connection to a Kafka cluster, and that returns metadata, which includes the addresses for all of the brokers in the Kafka cluster. Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint. Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker. Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages.
The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/) Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance. Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem consisting of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the business of writing code away from the user and instead requires only JSON configuration to run. **Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html) Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to and use to communicate with clients. For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/) Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka. Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time and write the results to Kafka topics stored in a Kafka cluster.
The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Kafka Streams](/platform/current/streams/overview.html) Kafka topic : See *topic*. key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *data encryption key (DEK)*, *envelope encryption*. Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications. Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources. **Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf) KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol introduced in Kafka 2.8 to provide metadata management for Kafka with the goal of replacing ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka. logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, multiple LKCs map to one PKC, which is the case for the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones. multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants.
Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** - [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time. **Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability* offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume. For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset* OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL). parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster*
physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients. principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account* private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources. processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations. producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer. Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster. Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html) public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other. rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur when a consumer misses its heartbeat and is excluded from the group, voluntarily leaves the group, has its metadata updated, or when a new consumer joins the group. replayability : Replayability is the ability to replay messages from any point in time. **Related terms**: *consumer offset*, *offset*, *offset commit*
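To make the *producer* and *Producer API* entries above concrete, here is a minimal Java sketch (broker address, topic name, key, and payload are placeholder assumptions). The returned record metadata shows the partition the record was appended to and its offset:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class SimpleProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The record key influences which partition the record is appended to.
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("orders", "order-42", "{\"amount\": 10}"); // placeholder topic and payload
            RecordMetadata metadata = producer.send(record).get();
            System.out.printf("Appended to partition %d at offset %d%n",
                    metadata.partition(), metadata.offset());
        }
    }
}
```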
replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility. replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster. requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster. role : A role is a Confluent-defined job function that is assigned the set of permissions required to perform specific actions or operations on Confluent resources. A role is granted to a principal on specific Confluent resources through a role binding. A role can be assigned to a user account, group mapping, service account, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html) rolling restart : A rolling restart restarts the brokers in a Kafka cluster one at a time, verifying that there are no under-replicated partitions on a broker before proceeding to the next broker, so the cluster stays available with zero downtime. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart) schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform. Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in the Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating the schemas used for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. Schema Registry is a RESTful service that stores and manages schemas for Kafka topics, and it is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers of Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry.
This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. **Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html) service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation; the limit can vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in real time to an individual message to change its values, keys, or headers before the message is passed to a sink connector or after it is read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments. single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system.
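The *serializer* and *Serdes* entries above describe converting objects into bytes for the wire; when Schema Registry is used, the Confluent serializers register and look up schemas automatically. A minimal sketch, assuming the Confluent `kafka-avro-serializer` dependency is on the classpath and using placeholder broker, Schema Registry URL, schema, and topic names (with the default strategy, the subject is derived from the topic name, for example `users-value`):

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerExample {
    private static final String USER_SCHEMA =
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
            + "{\"name\":\"name\",\"type\":\"string\"}]}";

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");            // placeholder
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // The Confluent Avro serializer registers/looks up the value schema in Schema Registry.
        props.put("value.serializer",
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");   // placeholder

        Schema schema = new Schema.Parser().parse(USER_SCHEMA);
        GenericRecord user = new GenericData.Record(schema);
        user.put("name", "jane");

        try (KafkaProducer<String, Object> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("users", "user-1", user)).get(); // placeholder topic
        }
    }
}
```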
source connector : A source connector is a Kafka Connect connector that subscribes (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality. static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet. **Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. 
Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. **Related terms**: *Data Portal*, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from reaching an over-utilized state. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%. At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records that are continually appended to a topic. **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client.
For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom. Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing* under replication : Under replication is a situation in which the number of in-sync replicas is below the total number of replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed. A watermark means that all records until the current moment in time have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html) Admin API : The Admin API is the Kafka REST API that enables administrators to manage and monitor Kafka clusters, topics, brokers, and other Kafka components.
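The Admin API entry above refers to the REST interface; the same administrative operations are also available programmatically through the Java `AdminClient`, which additionally illustrates the *replication factor* entry earlier in this glossary. A minimal sketch, where the broker address, topic name, and partition and replica counts are placeholder assumptions:

```java
import java.util.Collections;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class AdminExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            // Create a topic with 6 partitions and a replication factor of 3.
            admin.createTopics(Collections.singleton(
                    new NewTopic("orders", 6, (short) 3))).all().get();

            // Inspect the cluster and list existing topics.
            String clusterId = admin.describeCluster().clusterId().get();
            Set<String> topics = admin.listTopics().names().get();
            System.out.printf("cluster=%s topics=%s%n", clusterId, topics);
        }
    }
}
```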
Ansible Playbooks for Confluent Platform : Ansible Playbooks for Confluent Platform is a set of Ansible playbooks and roles that are designed to automate the deployment and management of Confluent Platform. Apache Flink : Apache Flink is an open source stream processing framework for stateful computations over unbounded and bounded data streams. Flink provides a unified API for batch and stream processing that supports event-time and out-of-order processing and exactly-once semantics. Flink applications include real-time analytics, data pipelines, and event-driven applications. **Related terms**: *bounded stream*, *data stream*, *stream processing*, *unbounded stream* **Related content** - [Apache Flink: Stream Processing and SQL on Confluent Cloud](/cloud/current/flink/index.html) - [What is Apache Flink?](https://www.confluent.io/learn/apache-flink/) - [Apache Flink 101 (Confluent Developer course)](https://developer.confluent.io/courses/apache-flink/intro/) Apache Kafka : Apache Kafka is an open source event streaming platform that provides unified, high-throughput, low-latency, fault-tolerant, scalable, distributed, and secure data streaming. Kafka is a publish-and-subscribe messaging system that enables distributed applications to ingest, process, and share data in real-time. **Related content** - [Introduction to Kafka](/kafka/introduction.html) audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occur. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance, by tracking administrative activity, data access and modification, monitoring sign-in attempts, and reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html) auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance. When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record. **Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html) authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication.
**Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), compact binary data format, a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers. You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases. batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html) Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. 
**Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. **Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single-tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single-tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through a declarative API. Confluent Platform : Confluent Platform is a specialized distribution with Kafka at its core, with additional components for data integration, streaming data pipelines, and stream processing. Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)* Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provides pre-allocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. A worker runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts.
Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused. connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : A connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay ids, lists of files, timestamps, and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as it processes data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. By dividing a topic's partitions among the consumers in the group, the consumers can process messages in parallel, increasing message throughput and enabling load balancing. **Related terms**: *consumer*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is, the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : Consumer offset is the monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, which does not track which records have been read and which have not. It is up to the consumer to track this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__consumer_offsets`. cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization. **Related terms**: *identity pool*, *principal*, *role*, *role binding*
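The *consumer lag* entry above defines lag as the gap between the latest produced offset and the last committed offset in a partition. A hedged sketch of computing that gap with the Java `AdminClient`, where the broker address and consumer group ID are placeholder assumptions:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ConsumerLagExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            // Committed offsets for the group, keyed by partition.
            Map<TopicPartition, OffsetAndMetadata> committed =
                    admin.listConsumerGroupOffsets("orders-app")   // placeholder group ID
                         .partitionsToOffsetAndMetadata().get();

            // Log-end offsets (latest produced position) for the same partitions.
            Map<TopicPartition, OffsetSpec> request = new HashMap<>();
            committed.keySet().forEach(tp -> request.put(tp, OffsetSpec.latest()));
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> latest =
                    admin.listOffsets(request).all().get();

            // Lag = latest produced offset minus last committed offset, per partition.
            committed.forEach((tp, offsetAndMetadata) -> {
                long lag = latest.get(tp).offset() - offsetAndMetadata.offset();
                System.out.printf("%s lag=%d%n", tp, lag);
            });
        }
    }
}
```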
CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). **Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *envelope encryption*, *key encryption key (KEK)* data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight. data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers). data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage. data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined. data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically composed of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation.
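To illustrate how the DEK and KEK described above fit together, here is a minimal, self-contained sketch of the envelope-encryption pattern using the JDK's `javax.crypto` classes. It is an illustration of the concept only, not Confluent's CSFLE implementation: it uses a symmetric AES KEK for brevity, and in practice the KEK lives in an external KMS rather than in application code.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class EnvelopeEncryptionSketch {
    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        SecretKey kek = keyGen.generateKey(); // master key; in practice held in an external KMS
        SecretKey dek = keyGen.generateKey(); // data encryption key

        SecureRandom random = new SecureRandom();
        byte[] dataIv = new byte[12];
        byte[] kekIv = new byte[12];
        random.nextBytes(dataIv);
        random.nextBytes(kekIv);

        // 1. Encrypt the sensitive payload with the DEK.
        Cipher dataCipher = Cipher.getInstance("AES/GCM/NoPadding");
        dataCipher.init(Cipher.ENCRYPT_MODE, dek, new GCMParameterSpec(128, dataIv));
        byte[] ciphertext = dataCipher.doFinal("card=4111".getBytes(StandardCharsets.UTF_8));

        // 2. Encrypt (wrap) the DEK with the KEK; the encrypted DEK is stored next to the data.
        Cipher kekCipher = Cipher.getInstance("AES/GCM/NoPadding");
        kekCipher.init(Cipher.ENCRYPT_MODE, kek, new GCMParameterSpec(128, kekIv));
        byte[] encryptedDek = kekCipher.doFinal(dek.getEncoded());

        // 3. Only a caller with access to the KEK can recover the DEK and decrypt the data.
        Cipher kekDecrypt = Cipher.getInstance("AES/GCM/NoPadding");
        kekDecrypt.init(Cipher.DECRYPT_MODE, kek, new GCMParameterSpec(128, kekIv));
        SecretKey recoveredDek = new SecretKeySpec(kekDecrypt.doFinal(encryptedDek), "AES");

        Cipher dataDecrypt = Cipher.getInstance("AES/GCM/NoPadding");
        dataDecrypt.init(Cipher.DECRYPT_MODE, recoveredDek, new GCMParameterSpec(128, dataIv));
        System.out.println(new String(dataDecrypt.doFinal(ciphertext), StandardCharsets.UTF_8));
    }
}
```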
Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. Data Portal provides a data-centric view of Confluent that is optimized for self-service access to data, where users can search, discover, and understand available data, request access to it, and use it. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html) data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication. Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security. data stream : A data stream is a continuous flow of data records that are produced and consumed by applications. dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed. Instead of stopping, the sink connector writes messages that could not be processed successfully as event records to the DLQ topic and continues processing the remaining messages. Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements. deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster.
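The *ingress*, *egress*, and *requests* entries recommend lz4 compression and producer batching to reduce bytes and request rate. A hedged Java sketch of those producer settings (broker address, topic, and the specific linger and batch values are placeholder assumptions to be tuned per workload):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TunedProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");

        // Compress batches with lz4 to reduce ingress and egress bytes.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
        // Batch more records per request to reduce the request rate:
        // wait up to 20 ms and up to 64 KB per partition before sending (example values).
        props.put(ProducerConfig.LINGER_MS_CONFIG, "20");
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, Integer.toString(64 * 1024));

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "key", "value")); // placeholder topic
        }
    }
}
```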
Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these types of clusters. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it. ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation. Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities. envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data. The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The encrypted DEK and the encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)* ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system. event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event message : An event message is a record of an event sent to a Kafka topic. Each event message consists of a key-value pair, a timestamp, the compression type, optional headers for metadata, and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream).
**Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time* event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time* event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time* event streaming platform : An event streaming platform is a platform to which events are written once, allowing distributed functions within an organization to react in real time. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time* event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform* exactly-once semantics : Exactly-once semantics is a guarantee that a message is processed exactly once and in the order that it was sent, even if a producer retries sending the message or a consumer retries processing it. In Kafka, this guarantee is achieved with idempotent producers, which attach a producer ID and sequence number to each message so the broker can discard duplicate retries, and with transactions, which atomically commit produced messages together with consumer offsets so that reprocessing after a failure does not create duplicate results. Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed-latency workloads at a lower cost than self-managed open source Kafka. granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access.
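The exactly-once semantics entry above mentions idempotent producers and transactions. A minimal sketch of a transactional producer in Java, where the broker address, transactional ID, and topic names are placeholder assumptions (a complete application would also handle fencing errors rather than rethrowing everything):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TransactionalProducerExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        // Idempotence lets the broker discard duplicate retries; the transactional ID enables transactions.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-tx-1"); // placeholder

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            try {
                producer.beginTransaction();
                producer.send(new ProducerRecord<>("orders", "order-42", "created"));   // placeholder topic
                producer.send(new ProducerRecord<>("payments", "order-42", "charged")); // placeholder topic
                // Either both records become visible to read_committed consumers, or neither does.
                producer.commitTransaction();
            } catch (Exception e) {
                producer.abortTransaction();
                throw e;
            }
        }
    }
}
```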
group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html) identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud. identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials. Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted. If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources, so you pay only for what you produce to Confluent Cloud and for the storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/) ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster. internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`. JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents. It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [JSON Schema - a declarative language that allows you to annotate and validate JSON documents.](https://json-schema.org/) - Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-json.html) - Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html)
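The storage-related entries in this glossary mention tuning `retention.ms` and `retention.bytes` at the topic level to control how much data is kept. A hedged sketch of changing those settings with the Java `AdminClient` (broker address, topic name, and the specific retention values are placeholder assumptions):

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class TopicRetentionExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "orders"); // placeholder topic
            Map<ConfigResource, Collection<AlterConfigOp>> updates = new HashMap<>();
            updates.put(topic, Arrays.asList(
                    // Keep data for 7 days ...
                    new AlterConfigOp(new ConfigEntry("retention.ms", "604800000"),
                            AlterConfigOp.OpType.SET),
                    // ... or until a partition reaches roughly 1 GiB, whichever limit is hit first.
                    new AlterConfigOp(new ConfigEntry("retention.bytes", "1073741824"),
                            AlterConfigOp.OpType.SET)));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}
```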
Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client first connects to when initiating a connection to a Kafka cluster, and which returns metadata that includes the addresses of all the brokers in the Kafka cluster. Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint. Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker. Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages. The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/) Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance. Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the business of code away from the user and instead requires only JSON configuration to run.
**Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html) Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to and use to communicate with clients. For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/) Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka. Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time to Kafka topics stored in a Kafka cluster.
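To make the Kafka Streams entry above concrete, here is a minimal topology sketch in Java. The application ID, broker address, and topic names are placeholder assumptions; the topology reads an input topic, filters and transforms each value, and writes the result to an output topic:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseStreamApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-app");      // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");         // placeholder topic
        input.filter((key, value) -> value != null)
             .mapValues(value -> value.toUpperCase())
             .to("output-topic");                                              // placeholder topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        streams.start();
    }
}
```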
**Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf) KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol introduced in Kafka 2.8 to provide metadata management for Kafka with the goal to replace ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka. logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones. multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants. Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** * [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) * [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time. **Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability* offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume.
For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset* OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL). parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster* partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, which are distributed across separate Kafka brokers and can be consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log. partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that define the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKU. To determine eCKU limits for partitions, Confluent Cloud counts only pre-replication (leader) partitions across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit. Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (such as consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low. physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients. principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool.
**Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account* private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources. processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations. producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer. Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster. Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html) public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other. rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur if a consumer fails its heartbeat and is excluded from the group, voluntarily leaves the group, has its metadata updated, or when a new consumer joins the group. replayability : Replayability is the ability to replay messages from any point in time. **Related terms**: *consumer offset*, *offset*, *offset commit* replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility. replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster. requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster. role : A role is a Confluent-defined job function that is assigned the set of permissions required to perform specific actions or operations on Confluent resources. A role binding binds a role to a principal and a set of Confluent resources. A role can be assigned to a user account, group mapping, service account, or identity pool.
**Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html) rolling restart : A rolling restart restarts the brokers in a Kafka cluster with zero downtime by restarting brokers one at a time, verifying that there are no under-replicated partitions on a broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart) schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform. Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in the Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating schemas for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. Schema Registry is a RESTful service that is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers of Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry. This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats.
**Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. **Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html) service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in realtime on an individual message that changes the values, keys, or headers of a message before being sent to a sink connector or after being read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments. single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system. source connector : A source connector is a Kafka Connect connector that subscribes (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality. static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet. 
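For the single message transform (SMT) entry above, a minimal sketch of what such a transform chain looks like in connector configuration, assuming the open-source `InsertField` transform and a hypothetical FileStream source connector; the field names and values are illustrative, not prescribed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SmtConfigExample {
    public static void main(String[] args) {
        // Connector configuration with a single message transform (SMT) chain.
        Map<String, String> config = new LinkedHashMap<>();
        config.put("connector.class", "org.apache.kafka.connect.file.FileStreamSourceConnector");
        config.put("tasks.max", "1");
        config.put("file", "/tmp/demo-input.txt");
        config.put("topic", "demo-topic");

        // Declare one transform, then configure it: InsertField adds a static
        // field to every record value before it is written to Kafka.
        config.put("transforms", "addSource");
        config.put("transforms.addSource.type", "org.apache.kafka.connect.transforms.InsertField$Value");
        config.put("transforms.addSource.static.field", "data_source");
        config.put("transforms.addSource.static.value", "file-demo");

        config.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```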
**Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. 
**Related terms**: *Data Portal*, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time and write the results to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from becoming over-utilized. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%. At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records that are continually appended to a topic. **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client. For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom.
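As a minimal sketch of the Streams API entry above: the topology below reads an input topic, filters out empty values, upper-cases the rest, and writes the results to an output topic. The topic names, application ID, and bootstrap address are illustrative assumptions.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseStreamsApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "glossary-uppercase-demo"); // also used as the consumer group ID
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");       // assumed local broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("demo-input");

        input.filter((key, value) -> value != null && !value.isBlank()) // drop empty events
             .mapValues(value -> value.toUpperCase())                   // transform each record
             .to("demo-output");                                        // write results to a topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```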
Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing* under replication : Under replication is a situation in which the number of in-sync replicas is below the total number of replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed. A watermark indicates that all records with timestamps up to the watermark’s time have been seen. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html) Admin API : The Admin API is the Kafka REST API that enables administrators to manage and monitor Kafka clusters, topics, brokers, and other Kafka components. Ansible Playbooks for Confluent Platform : Ansible Playbooks for Confluent Platform is a set of Ansible playbooks and roles that are designed to automate the deployment and management of Confluent Platform. Apache Flink : Apache Flink is an open source stream processing framework for stateful computations over unbounded and bounded data streams. Flink provides a unified API for batch and stream processing that supports event-time and out-of-order processing, and provides exactly-once semantics. Flink applications include real-time analytics, data pipelines, and event-driven applications. **Related terms**: *bounded stream*, *data stream*, *stream processing*, *unbounded stream* **Related content** - [Apache Flink: Stream Processing and SQL on Confluent Cloud](/cloud/current/flink/index.html#) - [What is Apache Flink?](https://www.confluent.io/learn/apache-flink/) - [Apache Flink 101 (Confluent Developer course)](https://developer.confluent.io/courses/apache-flink/intro/) Apache Kafka : Apache Kafka is an open source event streaming platform that provides a unified, high-throughput, low-latency, fault-tolerant, scalable, distributed, and secure data streaming platform. Kafka is a publish-and-subscribe messaging system that enables distributed applications to ingest, process, and share data in real-time.
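The Admin API entry above describes REST access; the same administrative operations are also available programmatically, for example through the Java Admin client. The following minimal sketch describes a cluster and lists its topics, assuming a local broker at `localhost:9092` and omitting error handling.

```java
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeClusterResult;

public class DescribeClusterExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed local broker

        try (AdminClient admin = AdminClient.create(props)) {
            // Cluster metadata: broker nodes and the current controller.
            DescribeClusterResult cluster = admin.describeCluster();
            System.out.println("Cluster ID: " + cluster.clusterId().get());
            System.out.println("Controller: " + cluster.controller().get());
            System.out.println("Brokers:    " + cluster.nodes().get());

            // Topic names visible to this client (internal topics are excluded by default).
            System.out.println("Topics:     " + admin.listTopics().names().get());
        }
    }
}
```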
**Related content** - [Introduction to Kafka](/kafka/introduction.html) audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occur. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance by tracking administrative activity, data access and modification, and sign-in attempts, and by reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html) auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance. When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record. **Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html) authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), a compact binary data format, and a container file format, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers. You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases.
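To illustrate the Avro entry above, a minimal sketch of a producer that serializes a record with Confluent’s `KafkaAvroSerializer`, which registers the schema in Schema Registry as needed. The broker and Schema Registry addresses, topic name, and schema are illustrative assumptions.

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProduceExample {
    private static final String USER_SCHEMA = """
        {"type":"record","name":"User","fields":[
          {"name":"name","type":"string"},
          {"name":"age","type":"int"}]}""";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");                 // assumed local broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");        // assumed local Schema Registry

        Schema schema = new Schema.Parser().parse(USER_SCHEMA);
        GenericRecord user = new GenericData.Record(schema);
        user.put("name", "Ada");
        user.put("age", 36);

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            // The serializer registers the schema (if needed) and embeds its ID in each message.
            producer.send(new ProducerRecord<>("users-avro", "user-1", user));
        }
    }
}
```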
batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html) Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. **Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through declarative API. Confluent Platform : Confluent Platform is a specialized distribution of Kafka at its core, with additional components for data integration, streaming data pipelines, and stream processing. 
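To make the Confluent Cloud entry above concrete: clients connect to a Confluent Cloud cluster with ordinary Kafka client configuration plus TLS and SASL authentication. The sketch below shows the typical properties; the bootstrap endpoint and the API key and secret placeholders are illustrative assumptions, not real values.

```java
import java.util.Properties;

public class CloudClientConfig {
    public static Properties cloudProperties() {
        Properties props = new Properties();
        // Bootstrap endpoint of the Confluent Cloud cluster (placeholder value).
        props.put("bootstrap.servers", "pkc-xxxxx.us-east-1.aws.confluent.cloud:9092");

        // Confluent Cloud uses TLS plus SASL/PLAIN authentication with an API key and secret.
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username='<API_KEY>' password='<API_SECRET>';");
        return props;
    }
}
```

The same properties object can then be passed to a producer, consumer, or admin client.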
Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)* Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provides preallocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. A worker is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused. connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : A connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay ids, lists of files, timestamps, and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as they process data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic.
The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. By dividing a topic's partitions among the consumers in the group, the consumers can process messages in parallel, increasing message throughput and enabling load balancing. **Related terms**: *consumer*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is, the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : A consumer offset is the monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, which does not track which records have been read and which have not. It is up to the consumer to track this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__consumer_offsets`. cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization. **Related terms**: *identity pool*, *principal*, *role*, *role binding* CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic).
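Tying together the Consumer API, consumer group, and consumer offset entries above, a minimal sketch: a consumer joins a group, polls records, and commits offsets only after processing, so a restart resumes from the last committed position. The topic, group ID, and bootstrap address are illustrative assumptions.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class CommitAfterProcessing {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");    // assumed local broker
        props.put("group.id", "demo-consumer-group");         // consumers sharing this ID split the partitions
        props.put("enable.auto.commit", "false");             // commit manually, only after processing succeeds
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("demo-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                        record.partition(), record.offset(), record.value());
                }
                // Stores the group's next position in __consumer_offsets on the broker.
                consumer.commitSync();
            }
        }
    }
}
```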
**Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *envelope encryption*, *key encryption key (KEK)* data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight. data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers). data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage. data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined. data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically comprised of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation. Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. Data Portal provides a data-centric view of Confluent that is optimized for self-service access, where users can search, discover, and understand available data, request access to data, and use data. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html) data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication.
Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security. data stream : A data stream is a continuous flow of data records that are produced and consumed by applications. dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed. Instead of stopping, the sink connector sends the messages that could not be written successfully to the DLQ topic as event records and continues processing messages. Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements. deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these types of clusters. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it. ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation. Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities. envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data.
The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)* ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system. event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream). **Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time* event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers.
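To illustrate the event message entry above (a key-value pair plus a timestamp and optional headers), a minimal sketch that constructs and sends one such record; the topic, key, value, and header names are illustrative assumptions.

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.header.internals.RecordHeader;

public class EventMessageExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Optional metadata headers travel with the event message.
        List<Header> headers = List.of(new RecordHeader("source", "checkout-service".getBytes()));

        // Key identifies the event subject; value carries the event details;
        // the timestamp records when the event occurred (event time).
        ProducerRecord<String, String> event = new ProducerRecord<>(
            "orders",                       // topic
            null,                           // partition: let the producer choose from the key
            System.currentTimeMillis(),     // timestamp
            "order-1001",                   // key
            "{\"status\":\"created\"}",     // value
            headers);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(event);
        }
    }
}
```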
**Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time* event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time* event streaming platform : An event streaming platform is a platform to which events are written once, allowing distributed functions within an organization to react in real time. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time* event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform* exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent. Even if a producer retries sending a message, or a consumer retries processing a message, the result is as if the message were delivered and processed exactly once. In Kafka, this guarantee is achieved with idempotent producers, which use a producer ID and per-partition sequence numbers so that broker-side retries do not create duplicates, and with transactions, which commit consumer offsets atomically with the messages produced from them. Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed-latency workloads at a lower cost than self-managed open source Kafka. granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access. group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html) identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource.
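As a minimal sketch of the producer side of the exactly-once semantics entry above: enabling idempotence and transactions lets a batch of sends commit atomically, so retries and failures do not surface duplicates to consumers reading with `isolation.level=read_committed`. The transactional ID, topic, and bootstrap address are illustrative assumptions.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TransactionalProduceExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");        // assumed local broker
        props.put("enable.idempotence", "true");                  // broker de-duplicates producer retries
        props.put("transactional.id", "demo-payments-producer");  // stable ID required for transactions
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            try {
                producer.send(new ProducerRecord<>("payments", "pay-1", "captured"));
                producer.send(new ProducerRecord<>("payments", "pay-2", "captured"));
                producer.commitTransaction();  // both records become visible atomically
            } catch (Exception e) {
                producer.abortTransaction();   // neither record is exposed to read_committed consumers
                throw e;
            }
        }
    }
}
```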
**Related terms**: *identity provider*, *identity pool*, *principal*, *role* identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud. identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials. Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted. If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources, so you only pay for what you produce to Confluent Cloud and for the storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/) ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster. internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`.
Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint. Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker. Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages. The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/) Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance. Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the business of code away from the user and instead requires only JSON configuration to run. **Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html) Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to use to communicate with clients. 
For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/) Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka. Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Kafka Streams](/platform/current/streams/overview.html) Kafka topic : See *topic*. key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *data encryption key (DEK)*, *envelope encryption* Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications. Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost-effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources. **Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf) KRaft : KRaft (or Apache Kafka Raft) is the consensus protocol that provides metadata management for Kafka and replaces ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka.
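For illustration of the Kafka Streams entry above, the following is a minimal Java topology sketch that filters and counts records per key. It is not taken from the product documentation; the application ID, broker address, and the `pageviews` and `pageview-counts` topic names are hypothetical assumptions.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class PageViewCountSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "pageview-count-sketch"); // hypothetical application ID
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // assumed local broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Read the hypothetical "pageviews" topic, keep records with a non-empty key,
        // and maintain a running count of views per page key.
        KStream<String, String> views = builder.stream("pageviews");
        KTable<String, Long> counts = views
                .filter((page, user) -> page != null && !page.isEmpty())
                .groupByKey()
                .count();

        // Write the continuously updated counts to another hypothetical topic.
        counts.toStream().to("pageview-counts", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```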
logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones. multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants. Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** * [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) * [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time. **Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability* offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume. For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset* OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL). parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster*
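To make the offset and offset commit entries above concrete, here is a minimal Java consumer sketch that disables auto-commit and commits offsets synchronously after processing each batch. The broker address, consumer group ID, and `orders` topic name are illustrative assumptions.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualCommitSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-processor");        // hypothetical consumer group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");         // commit offsets manually

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));                            // hypothetical topic
            while (true) {  // poll loop runs until the process is stopped
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Process the record before committing its offset.
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                // Synchronously commit the offsets of the records just processed;
                // the broker stores them in the __consumer_offsets internal topic.
                consumer.commitSync();
            }
        }
    }
}
```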
partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, each of which is stored on separate Kafka brokers and can be consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log. partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that defines the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKU. To determine eCKU limits for partitions, Confluent Cloud bills only for pre-replication (leader partitions) across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit. Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low. physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients. principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account* private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources. processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations. producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer. Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster.
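As a concrete illustration of the producer and Producer API entries above, here is a minimal Java producer sketch. The broker address, the `orders` topic, the record key and payload, and the choice of lz4 compression are illustrative assumptions.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");               // compress messages, as suggested above

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The key determines the partition; records with the same key land in the same partition.
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("orders", "order-1001", "{\"amount\": 42.50}"); // hypothetical topic and payload
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace();
                } else {
                    System.out.printf("wrote to %s-%d at offset %d%n",
                            metadata.topic(), metadata.partition(), metadata.offset());
                }
            });
            producer.flush();
        }
    }
}
```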
Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html) public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other. rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur if a consumer fails its heartbeat and is excluded from the group, voluntarily leaves the group, has its metadata updated, or joins the group. replayability : Replayability is the ability to replay messages from any point in time. **Related terms**: *consumer offset*, *offset*, *offset commit* replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility. replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster. requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer and consumer client batching configurations and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster. role : A role is a Confluent-defined job function that carries a set of permissions required to perform specific actions or operations on Confluent resources, and is granted by binding the role to a principal and Confluent resources. A role can be assigned to a user account, group mapping, service account, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html) rolling restart : A rolling restart restarts the brokers in a Kafka cluster with zero downtime by incrementally restarting a Kafka broker after verifying that there are no under-replicated partitions on the broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart) schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform.
Schemas are used to validate the structure of data in event messages and ensures that producers and consumers are sending and receiving data in the same format. Schemas are defined in the Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating schemas for topic message data that stores and manages schemas for Kafka topics. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. The Schema Registry is a RESTful service that stores and manages schemas for Kafka topics. The Schema Registry is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers to Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry. This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. 
**Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html) service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in realtime on an individual message that changes the values, keys, or headers of a message before being sent to a sink connector or after being read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments. single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system. source connector : A source connector is a Kafka Connect connector that subscribes (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality. static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet. **Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster. 
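The storage entry above mentions the `retention.bytes` and `retention.ms` topic settings. For illustration, the following Java AdminClient sketch applies both settings to a topic; the broker address, the `orders` topic name, and the chosen retention values are illustrative assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class RetentionConfigSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "orders"); // hypothetical topic

            // Keep at most ~1 GB per partition and delete segments older than 7 days.
            List<AlterConfigOp> ops = List.of(
                    new AlterConfigOp(new ConfigEntry("retention.bytes", "1073741824"), AlterConfigOp.OpType.SET),
                    new AlterConfigOp(new ConfigEntry("retention.ms", "604800000"), AlterConfigOp.OpType.SET));

            admin.incrementalAlterConfigs(Map.of(topic, ops)).all().get();
            System.out.println("Updated retention settings for topic 'orders'");
        }
    }
}
```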
Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. **Related terms**: **Data Portal**, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Introduction Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from getting to an over-utilized state. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%. 
At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records that are continually appended to a topic. **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client. For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom. Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing* under replication : Under replication is a situation when the number of in-sync replicas is below the number of all replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed. A watermark signals that all records up to the current moment in time have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html)
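To make the watermark entry concrete, here is a small Flink DataStream API sketch in Java. It is not from the Flink or Confluent documentation; the `Reading` event type, sensor IDs, timestamps, the five-second out-of-orderness bound, and the window size are all illustrative assumptions. The sketch assigns event-time timestamps and watermarks, then aggregates over tumbling event-time windows.

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WatermarkSketch {

    /** Simple POJO event that carries its own event-time timestamp. */
    public static class Reading {
        public String sensorId;
        public double value;
        public long eventTimeMillis;

        public Reading() {}
        public Reading(String sensorId, double value, long eventTimeMillis) {
            this.sensorId = sensorId;
            this.value = value;
            this.eventTimeMillis = eventTimeMillis;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Reading> readings = env.fromElements(
                new Reading("s1", 20.1, 1_000L),
                new Reading("s1", 20.7, 9_000L),
                new Reading("s1", 19.9, 4_000L)); // arrives out of order

        // Watermarks track event time and tolerate events arriving up to 5 seconds out of order.
        DataStream<Reading> withEventTime = readings.assignTimestampsAndWatermarks(
                WatermarkStrategy.<Reading>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                        .withTimestampAssigner((reading, recordTs) -> reading.eventTimeMillis));

        // Sum values per sensor over 10-second event-time windows; windows close as watermarks advance.
        withEventTime
                .keyBy(reading -> reading.sensorId)
                .window(TumblingEventTimeWindows.of(Time.seconds(10)))
                .reduce((a, b) -> new Reading(a.sensorId, a.value + b.value,
                        Math.max(a.eventTimeMillis, b.eventTimeMillis)))
                .print();

        env.execute("watermark-sketch");
    }
}
```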
### GCS object names

The GCS data model is a flat structure: each bucket stores objects, and the name of each GCS object serves as the unique key. However, a logical hierarchy can be inferred when GCS object names use directory delimiters, such as `/`. The GCS connector allows you to customize the names of the GCS objects it uploads to the GCS bucket. In general, the names of the GCS objects uploaded by the GCS connector follow this format:

```
<prefix>/<topic>/<encodedPartition>/<topic>+<kafkaPartition>+<startOffset>.<format>
```

Admin API : The Admin API is the Kafka REST API that enables administrators to manage and monitor Kafka clusters, topics, brokers, and other Kafka components. Ansible Playbooks for Confluent Platform : Ansible Playbooks for Confluent Platform is a set of Ansible playbooks and roles that are designed to automate the deployment and management of Confluent Platform. Apache Flink : Apache Flink is an open source stream processing framework for stateful computations over unbounded and bounded data streams. Flink provides a unified API for batch and stream processing that supports event-time and out-of-order processing, and supports exactly-once semantics. Flink applications include real-time analytics, data pipelines, and event-driven applications. **Related terms**: *bounded stream*, *data stream*, *stream processing*, *unbounded stream* **Related content** - [Apache Flink: Stream Processing and SQL on Confluent Cloud](/cloud/current/flink/index.html) - [What is Apache Flink?](https://www.confluent.io/learn/apache-flink/) - [Apache Flink 101 (Confluent Developer course)](https://developer.confluent.io/courses/apache-flink/intro/) Apache Kafka : Apache Kafka is an open source event streaming platform that provides a unified, high-throughput, low-latency, fault-tolerant, scalable, distributed, and secure data streaming platform. Kafka is a publish-and-subscribe messaging system that enables distributed applications to ingest, process, and share data in real-time. **Related content** - [Introduction to Kafka](/kafka/introduction.html) audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occur. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance, by tracking administrative activity, data access and modification, monitoring sign-in attempts, and reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html) auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance. When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record.
**Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html) authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), compact binary data format, a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers. You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases. batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. 
**Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html) Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. **Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through declarative API. Confluent Platform : Confluent Platform is a specialized distribution of Kafka at its core, with additional components for data integration, streaming data pipelines, and stream processing. Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to an Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. 
**Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)* Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provide preallocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. A worker is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused. connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : Connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay ids, lists of files, timestamps and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as it process data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. By dividing topics among consumers in the group into partitions, consumers in the group can process messages in parallel, increasing message throughput and enabling load balancing. 
**Related terms**: *consumer*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is, the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : Consumer offset is the monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, which does not track which records have been read and which have not. It is up to the consumer to track this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__consumer_offsets`. cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization. **Related terms**: *identity pool*, *principal*, *role*, *role binding* CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). **Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *envelope encryption*, *key encryption key (KEK)*
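As an illustration of the DEK and KEK relationship described above (and of the envelope encryption entry later in this glossary), here is a small, self-contained Java sketch using the standard `javax.crypto` APIs. The key sizes, AES-GCM parameters, and sample plaintext are illustrative assumptions; in a real deployment the KEK would live in an external KMS, not alongside the data.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class EnvelopeEncryptionSketch {

    static byte[] encrypt(SecretKey key, byte[] plaintext, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        return cipher.doFinal(plaintext);
    }

    public static void main(String[] args) throws Exception {
        SecureRandom random = new SecureRandom();
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);

        // KEK: the master key. In practice this is held in a KMS and never stored with the data.
        SecretKey kek = keyGen.generateKey();

        // DEK: a fresh symmetric key used to encrypt the sensitive field.
        SecretKey dek = keyGen.generateKey();

        byte[] dataIv = new byte[12];
        byte[] dekIv = new byte[12];
        random.nextBytes(dataIv);
        random.nextBytes(dekIv);

        // 1. Encrypt the sensitive data with the DEK.
        byte[] encryptedData = encrypt(dek, "ssn=123-45-6789".getBytes(StandardCharsets.UTF_8), dataIv);

        // 2. Encrypt (wrap) the DEK with the KEK.
        byte[] encryptedDek = encrypt(kek, dek.getEncoded(), dekIv);

        // The wrapped DEK and the encrypted data are stored together;
        // only a holder of the KEK can unwrap the DEK and read the data.
        System.out.printf("ciphertext bytes=%d, wrapped DEK bytes=%d%n",
                encryptedData.length, encryptedDek.length);
    }
}
```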
data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight. data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers). data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage. data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined. data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically composed of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation. Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. Data Portal provides a data-centric view of Confluent optimized for self-service access to data, where users can search, discover, and understand available data, request access to data, and use data. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html) data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication. Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security. data stream : A data stream is a continuous flow of data records that are produced and consumed by applications. dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed.
Instead of stopping, the sink connector sends messages that could not be written successfully as event records to the DLQ topic while the sink connector continues processing messages. Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements. deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these type clusters. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it. ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation. Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities. envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data. The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)* ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system. 
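To ground the deserializer entry above, here is a minimal, hypothetical Java implementation of Kafka's `Deserializer` interface that decodes a UTF-8 `sensorId,celsius` payload. The `Temperature` type and the text encoding are illustrative assumptions; in practice you would typically use the Avro, Protobuf, or JSON Schema deserializers described elsewhere in this glossary.

```java
import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.serialization.Deserializer;

/** Hypothetical value type: "sensorId,celsius" encoded as UTF-8 text. */
record Temperature(String sensorId, double celsius) {}

public class TemperatureDeserializer implements Deserializer<Temperature> {

    @Override
    public Temperature deserialize(String topic, byte[] data) {
        if (data == null) {
            return null; // null values (for example, tombstones) pass through unchanged
        }
        // Split the UTF-8 payload into the sensor ID and the temperature reading.
        String[] parts = new String(data, StandardCharsets.UTF_8).split(",", 2);
        return new Temperature(parts[0], Double.parseDouble(parts[1]));
    }
}
```

A consumer would reference such a class through its `value.deserializer` configuration property.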
event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream). **Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message, \*event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time* event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time* event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time* event streaming platform : An event streaming platform is a platform that events can be written to once, allowing distributed functions within an organization to react in realtime. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time* event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. 
**Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform* exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent. Even if a producer retries sending a message, or a consumer retries processing a message, the message is delivered exactly once. On the produce side, this guarantee is achieved with idempotent producers: each message carries a producer ID and sequence number, so the broker can discard duplicates caused by retries. On the consume side, transactions allow consumer offsets to be committed atomically with the processing results, so a message that is reprocessed after a failure is not reflected twice in the output. Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed latency workloads that are less expensive than self-managed open source Kafka. granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access. group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html) identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud. identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials. Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted.
If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources, so you pay only for what you produce to Confluent Cloud and for the storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/) ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster. internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`. JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents. It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [JSON Schema - a declarative language that allows you to annotate and validate JSON documents.](https://json-schema.org/) - Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html) Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client initially connects to when establishing a connection to a Kafka cluster; the broker returns metadata, which includes the addresses for all of the brokers in the Kafka cluster. Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint. Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker. Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages.
The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/) Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance. Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the business of writing code away from the user and instead requires only JSON configuration to run. **Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html) Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to and use to communicate with clients. For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/) Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka. Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time to Kafka topics stored in a Kafka cluster.
The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Kafka Streams](/platform/current/streams/overview.html) Kafka topic : See *topic*. key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *data encryption key (DEK)*, *envelope encryption*. Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications. Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources. **Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf) KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol introduced in Kafka 2.8 to provide metadata management for Kafka with the goal to replace ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka. logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones. multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants.
Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** * [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) * [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time. **Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability* offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume. For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset* OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL). parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster* partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, each of which is stored on separate Kafka brokers and can be consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log. partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that define the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKU. To determine eCKU limits for partitions, Confluent Cloud counts only pre-replication (leader) partitions across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit.
Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low. physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients. principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account* private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources. processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations. producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer. Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster. Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html) public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other. rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur if a consumer has failed the heartbeat and has been excluded from the group, it voluntarily left the group, metadata has been updated for a consumer, or a consumer has joined the group. replayability : Replayability is the ability to replay messages from any point in time.
**Related terms**: *consumer offset*, *offset*, *offset commit* replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility. replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster. requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster. role : A role is a Confluent-defined job function that is assigned a set of permissions required to perform specific actions or operations on Confluent resources, and is bound to a principal and Confluent resources through a role binding. A role can be assigned to a user account, group mapping, service account, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html) rolling restart : A rolling restart restarts the brokers in a Kafka cluster with zero downtime by incrementally restarting a Kafka broker after verifying that there are no under-replicated partitions on the broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart) schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform. Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in the Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating schemas for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. Schema Registry is a RESTful service that stores and manages schemas for Kafka topics and is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers to Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry.
This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. **Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html) service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in real time to an individual message that changes the values, keys, or headers of the message before it is sent to a sink connector or after it is read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments. single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system.
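The Serdes, serializer, and deserializer entries above describe converting objects to and from byte streams using schemas. The following is a minimal sketch, not taken from the glossary, of Avro serialization backed by Schema Registry using the confluent-kafka Python package (with the Avro extras installed); the Schema Registry URL, topic name, and record schema are illustrative assumptions.

```python
# Minimal sketch (not from the glossary) of Avro serialization with Schema Registry,
# using the confluent-kafka Python package (install confluent-kafka[avro]).
# The Schema Registry URL, topic name, and record schema are placeholders.
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer, AvroDeserializer
from confluent_kafka.serialization import SerializationContext, MessageField

schema_str = """
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "amount", "type": "double"}
  ]
}
"""

sr_client = SchemaRegistryClient({"url": "http://localhost:8081"})  # placeholder URL
serializer = AvroSerializer(sr_client, schema_str)        # dict -> Avro bytes (registers/looks up the schema)
deserializer = AvroDeserializer(sr_client, schema_str)    # Avro bytes -> dict

ctx = SerializationContext("orders", MessageField.VALUE)   # topic and field determine the subject name
payload = serializer({"id": "o-1001", "amount": 9.99}, ctx)
print(deserializer(payload, ctx))                          # {'id': 'o-1001', 'amount': 9.99}
```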
source connector : A source connector is a Kafka Connect connector that subscribes (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality. static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet. **Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. 
Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. **Related terms**: *Data Portal*, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from becoming over-utilized. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%. At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records that are continually appended to a topic. **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client.
For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom. Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* under replication : Under replication is a situation in which the number of in-sync replicas is below the number of all replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed. A watermark means that all records until the current moment in time have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html)

### S3 Object Names

The S3 data model is a flat structure: each bucket stores objects, and the name of each S3 object serves as the unique key. However, a logical hierarchy can be inferred when the S3 object names use directory delimiters, such as `/`. The S3 connector allows you to customize the names of the S3 objects it uploads to the S3 bucket. In general, the names of the S3 objects uploaded by the S3 connector follow this format:

```bash
<prefix>/<topic>/<encodedPartition>/<topic>+<kafkaPartition>+<startOffset>.<format>
```

Admin API : The Admin API is the Kafka REST API that enables administrators to manage and monitor Kafka clusters, topics, brokers, and other Kafka components. Ansible Playbooks for Confluent Platform : Ansible Playbooks for Confluent Platform is a set of Ansible playbooks and roles that are designed to automate the deployment and management of Confluent Platform.
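The Admin API entry above covers administrative operations such as creating and inspecting topics. As a rough illustration only, the sketch below performs the same kind of operation with the Kafka AdminClient from the confluent-kafka Python package rather than the REST API; the broker address and topic settings are placeholders.

```python
# Minimal sketch (not from the glossary): creating a topic with the Kafka AdminClient
# from the confluent-kafka Python package. Broker address and topic settings are placeholders.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})  # placeholder address

# Request creation of a topic with 6 partitions and replication factor 3.
futures = admin.create_topics([NewTopic("orders", num_partitions=6, replication_factor=3)])

for topic, future in futures.items():
    try:
        future.result()  # block until the controller confirms; raises on failure
        print(f"Created topic {topic}")
    except Exception as err:
        print(f"Failed to create topic {topic}: {err}")
```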
Apache Flink : Apache Flink is an open source stream processing framework for stateful computations over unbounded and bounded data streams. Flink provides a unified API for batch and stream processing that supports event-time and out-of-order processing, and supports exactly-once semantics. Flink applications include real-time analytics, data pipelines, and event-driven applications. **Related terms**: *bounded stream*, *data stream*, *stream processing*, *unbounded stream* **Related content** - [Apache Flink: Stream Processing and SQL on Confluent Cloud](/cloud/current/flink/index.html#) - [What is Apache Flink?](https://www.confluent.io/learn/apache-flink/) - [Apache Flink 101 (Confluent Developer course)](https://developer.confluent.io/courses/apache-flink/intro/) Apache Kafka : Apache Kafka is an open source event streaming platform that provides a unified, high-throughput, low-latency, fault-tolerant, scalable, distributed, and secure data streaming platform. Kafka is a publish-and-subscribe messaging system that enables distributed applications to ingest, process, and share data in real-time. **Related content** - [Introduction to Kafka](/kafka/introduction.html) audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occur. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance, by tracking administrative activity, data access and modification, monitoring sign-in attempts, and reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html) auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance. When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record. **Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html) authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources.
**Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), compact binary data format, a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers. You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases. batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html) Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. 
Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. **Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through a declarative API. Confluent Platform : Confluent Platform is a specialized distribution of Kafka at its core, with additional components for data integration, streaming data pipelines, and stream processing. Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)* Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provides preallocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. A worker runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused.
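The connection attempts entry above recommends longer-lived connections. The sketch below, which is not from the glossary, contrasts creating a short-lived producer per message with reusing a single long-lived producer in the confluent-kafka Python client; the broker address and topic name are placeholders.

```python
# Minimal sketch (not from the glossary): reuse one long-lived producer instead of
# creating a new client per message, which drives up connection attempts.
from confluent_kafka import Producer

conf = {"bootstrap.servers": "localhost:9092"}  # placeholder address

# Anti-pattern: a new Producer per message opens fresh TCP connections every time
# (and the buffered message may be dropped because the short-lived client is never flushed).
def send_once_bad(topic, value):
    Producer(conf).produce(topic, value=value)

# Preferred: create the client once and reuse it for the life of the application.
producer = Producer(conf)

def send(topic, value):
    producer.produce(topic, value=value)
    producer.poll(0)  # serve delivery callbacks without blocking

for i in range(100):
    send("metrics", f"sample-{i}")
producer.flush()  # block until all buffered messages are delivered
```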
connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : Connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay ids, lists of files, timestamps and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as it processes data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. By dividing a topic’s partitions among the consumers in the group, consumers in the group can process messages in parallel, increasing message throughput and enabling load balancing. **Related terms**: *consumer*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is, the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : Consumer offset is the monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, which does not track which records have been read and which have not. It is up to the consumer connection to track this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__consumer_offsets`. cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization.
**Related terms**: *identity pool*, *principal*, *role*, *role binding* CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). **Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current//sr/schemas-manage.html) data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *envelope encryption*, *key encryption key (KEK)* data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight. data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers). data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage. data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined. data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically composed of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation.
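The consumer, consumer group, and consumer offset entries above describe how committed offsets track a consumer's position. The following minimal sketch, not taken from the glossary, shows a consumer that commits offsets only after processing each message, using the confluent-kafka Python client; the broker address, group ID, and topic are placeholders.

```python
# Minimal sketch (not from the glossary): a consumer that commits offsets only after
# a message has been processed. Broker address, group ID, and topic are placeholders.
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder address
    "group.id": "orders-processor",         # consumers sharing this ID form one consumer group
    "auto.offset.reset": "earliest",        # where to start when no committed offset exists
    "enable.auto.commit": False,            # commit explicitly after processing instead
})
consumer.subscribe(["orders"])

try:
    while True:
        msg = consumer.poll(1.0)            # wait up to 1 second for a record
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        print(f"partition={msg.partition()} offset={msg.offset()} value={msg.value()}")
        # Committing msg records offset + 1, the next message this group should consume.
        consumer.commit(message=msg, asynchronous=False)
except KeyboardInterrupt:
    pass
finally:
    consumer.close()
```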
Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. Data Portal provides a data-centric view of Confluent optimized for self-service access to data, where users can search, discover, and understand available data, request access to data, and use data. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html) data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication. Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security. data stream : A data stream is a continuous flow of data records that are produced and consumed by applications.
Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these types of clusters. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it. ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation. Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities. envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data. The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)* ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system. event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream).
**Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time* event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time* event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time* event streaming platform : An event streaming platform is a platform to which events can be written once, allowing distributed functions within an organization to react in real time. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time* event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform* exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent, even if a producer retries sending a message or a consumer retries processing a message. In Kafka, this guarantee is achieved with idempotent producers, which attach a producer ID and sequence number to each message so that brokers can discard duplicates caused by retries, and with transactions, which commit produced messages and consumer offsets atomically so that reprocessing does not produce duplicate results (a configuration sketch follows the granularity entry below). Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed latency workloads that are less expensive than self-managed open source Kafka. granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access.
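To make the exactly-once semantics entry above concrete, here is a minimal sketch of the producer and consumer settings involved. The bootstrap server, topic name, group ID, and transactional ID are placeholders, and a real transactional consume-process-produce loop would also call `sendOffsetsToTransaction` so that consumer offsets commit atomically with the produced records.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class ExactlyOnceSketch {
    public static void main(String[] args) {
        // Producer side: idempotence lets brokers discard duplicates caused by retries,
        // and the transactional ID enables atomic writes across partitions.
        Properties producerProps = new Properties();
        producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        producerProps.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        producerProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-processor-1");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("orders", "key-1", "value-1"));
            producer.commitTransaction();
        }

        // Consumer side: read_committed hides records from aborted transactions.
        Properties consumerProps = new Properties();
        consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-readers");
        consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        consumerProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of("orders"));
            consumer.poll(Duration.ofSeconds(1));
        }
    }
}
```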
group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html) identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud. identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials. Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted. If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources, you only pay for what you produce to Confluent Cloud and for storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/) ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster. internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`. JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents.
It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [JSON Schema - a declarative language that allows you to annotate and validate JSON documents.](https://json-schema.org/) - Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-json.html) - Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html) Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client connects to first when initiating a connection to a Kafka cluster; the broker returns metadata that includes the addresses for all of the brokers in the Kafka cluster. Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint. Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker. Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages. The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/) Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance. Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the need to write code away from the user and instead requires only JSON configuration to run.
**Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html) Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to and use to communicate with clients. For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/) Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka. Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Kafka Streams](/platform/current/streams/overview.html) Kafka topic : See *topic*. key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *data encryption key (DEK)*, *envelope encryption*. Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications. Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources.
**Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf) KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol introduced in Kafka 2.8 to provide metadata management for Kafka with the goal of replacing ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka. logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones. multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants. Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** - [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) offset : An offset is an integer assigned to each message that uniquely represents its position within a partition, guaranteeing the ordering of records within that partition and allowing offset-based connections to replay messages from any point in time. **Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability* offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume.
For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset* OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL). parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster* partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, each of which is stored on separate Kafka brokers and can be consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log. partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that define the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKU. To determine eCKU limits for partitions, Confluent Cloud bills only for pre-replication (leader partitions) across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit. Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low. physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients. principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool.
**Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account* private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources. processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations. producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer. Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster. Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html) public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other. rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur when a consumer fails to send heartbeats and is excluded from the group, voluntarily leaves the group, has its metadata updated, or when a new consumer joins the group. replayability : Replayability is the ability to replay messages from any point in time. **Related terms**: *consumer offset*, *offset*, *offset commit* replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility. replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster. requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster. role : A role is a Confluent-defined job function that carries the set of permissions required to perform specific actions or operations on Confluent resources. A role is granted to a principal on a set of Confluent resources through a role binding, and can be assigned to a user account, group mapping, service account, or identity pool.
**Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html) rolling restart : A rolling restart restarts the brokers in a Kafka cluster one at a time with zero downtime, verifying that there are no under-replicated partitions on a broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart) schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform. Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in the Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating the schemas used for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Stream Governance packages, and offered as part of Confluent Enterprise for self-managed deployments. Schema Registry is a RESTful service that stores and manages schemas for Kafka topics and is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers of Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry. This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats.
**Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. **Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html) service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in real time to an individual message that changes the values, keys, or headers of a message before being sent to a sink connector or after being read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments. single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system. source connector : A source connector is a Kafka Connect connector that subscribes to (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality. static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet.
**Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. 
**Related terms**: *Data Portal*, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines (a minimal topology sketch follows the total client connections entry below). **Related content** - [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from getting to an over-utilized state. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%. At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records that is continually appended to a partition log. **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client. For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom.
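As a concrete illustration of the Streams API entry above, the following is a minimal Kafka Streams topology sketch that filters one topic into another. The application ID, bootstrap server, topic names, and filter condition are placeholders.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class StreamsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder application ID and bootstrap server.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-filter-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Read an incoming event stream, filter it, and write the result to another topic.
        KStream<String, String> orders = builder.stream("orders");
        orders.filter((key, value) -> value != null && value.contains("priority=high"))
              .to("high-priority-orders", Produced.with(Serdes.String(), Serdes.String()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```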
Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing* under replication : Under replication is a situation when the number of in-sync replicas is below the number of all replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed. A watermark asserts that all records with event timestamps up to the watermark’s timestamp have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html)
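To make the watermark entry above concrete, here is a generic Apache Flink DataStream sketch (assuming a Flink 1.x environment) that assigns bounded-out-of-orderness watermarks so event-time windows know when they can close. The event type, sample values, and the five-second bound are illustrative assumptions.

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WatermarkSketch {

    // Illustrative event type: a sensor reading carrying its own event-time timestamp.
    public static class SensorReading {
        public String sensorId;
        public long timestampMillis;
        public double value;

        public SensorReading() {}

        public SensorReading(String sensorId, long timestampMillis, double value) {
            this.sensorId = sensorId;
            this.timestampMillis = timestampMillis;
            this.value = value;
        }

        @Override
        public String toString() {
            return sensorId + "@" + timestampMillis + "=" + value;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<SensorReading> readings = env.fromElements(
                new SensorReading("s1", 1_000L, 20.5),
                new SensorReading("s1", 9_000L, 21.0),
                new SensorReading("s1", 4_000L, 20.7)); // arrives out of order

        // The watermark trails the largest event timestamp seen so far by 5 seconds,
        // signalling that earlier events are assumed to have been seen.
        DataStream<SensorReading> withTimestamps = readings.assignTimestampsAndWatermarks(
                WatermarkStrategy.<SensorReading>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                        .withTimestampAssigner((reading, recordTimestamp) -> reading.timestampMillis));

        // With watermarks assigned, event-time windows can close and emit results.
        withTimestamps
                .keyBy(reading -> reading.sensorId)
                .window(TumblingEventTimeWindows.of(Time.seconds(10)))
                .reduce((a, b) -> a.value >= b.value ? a : b) // keep the max reading per window
                .print();

        env.execute("watermark-sketch");
    }
}
```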
**Related content** - [Introduction to Kafka](/kafka/introduction.html) audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occurs. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance, by tracking administrative activity, data access and modification, monitoring sign-in attempts, and reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html) auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance. When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record. **Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html) authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), compact binary data format, a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers. You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases. 
batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html) Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. **Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through declarative API. Confluent Platform : Confluent Platform is a specialized distribution of Kafka at its core, with additional components for data integration, streaming data pipelines, and stream processing. 
Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to an Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)* Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provide preallocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. A worker is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused. connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : Connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay ids, lists of files, timestamps and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as it process data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. 
The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. By dividing topics among consumers in the group into partitions, consumers in the group can process messages in parallel, increasing message throughput and enabling load balancing. **Related terms**: *consumer*, *partition*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : Consumer offset is the unique and monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, which does not track which records have been read and which have not. It is up to the consumer connection to track this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__commit_offsets`. cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization. **Related terms**: *identity pool*, *principal*, *role*, *role binding* CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). 
**Related content**
- Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html)
- Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html)
- Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html)

data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data.
**Related terms**: *envelope encryption*, *key encryption key (KEK)*

data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight.

data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers).

data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage.

data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined.

data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically comprised of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation.

Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. Data Portal provides a data-centric view of Confluent optimized for self-service access, where users can search, discover, and understand available data, request access to data, and use data.
**Related terms**: *Stream Catalog*, *Stream Lineage*
**Related content**
- [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html)

data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication.
Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf.
**Related content**
- Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html)
- Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html)

data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security.

data stream : A data stream is a continuous flow of data records that are produced and consumed by applications.

dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed. Instead of stopping, the sink connector sends messages that could not be written successfully as event records to the DLQ topic while the sink connector continues processing messages.

Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements.

deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats.
**Related content**
- Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html)
- Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html)

egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster.

Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these types of clusters. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it.

ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation.

Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities.

envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data.
The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data.
**Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)*

ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system.

event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store.
**Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time*

event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened.
**Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time*

event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream).
**Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time*

event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more.
**Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time*

event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more.
**Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time*

event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers.
**Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time*

event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively.
**Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time*

event streaming platform : An event streaming platform is a platform to which events can be written once, allowing distributed functions within an organization to react in real time.
**Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time*

event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations.
**Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*

exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent, even if a producer retries sending a message or a consumer retries processing a message. Kafka achieves this guarantee with idempotent producers, which attach a producer ID and sequence number to each message so that the broker can discard duplicates caused by retries, and with transactions, which commit produced messages and consumer offsets atomically so that reprocessing after a failure does not produce duplicate results for consumers reading committed data.

Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed latency workloads that are less expensive than self-managed open source Kafka.

granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access.

group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources.
**Related terms**: *identity provider*, *identity pool*, *principal*, *role*
**Related content**
- [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html)

identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource.
**Related terms**: *identity provider*, *identity pool*, *principal*, *role*

identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud.

identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials.

Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted. If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources, so you only pay for what you produce to Confluent Cloud and for the storage that you use, and CKUs do not have storage limits.
**Related content**
- [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/)

ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster.

internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`.

JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents. It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers.
**Related terms**: *data serialization*, *deserializer*, *serializer*
**Related content**
- [JSON Schema - a declarative language that allows you to annotate and validate JSON documents.](https://json-schema.org/)
- Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-json.html)
- Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html)

Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client initially connects to in order to reach a Kafka cluster; the broker returns metadata, which includes the addresses for all of the brokers in the Kafka cluster.
Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint.

Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker.

Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages. The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library.
**Related content**
- [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html)
- [Build Client Applications for Confluent Platform](/platform/current/clients/index.html)
- [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/)

Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance.

Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the coding work away from the user and instead requires only JSON configuration to run.
**Related content**
- Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long)
- Confluent Platform: [Kafka Connect](/platform/current/connect/index.html)

Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html).

Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to and use to communicate with clients.
For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster.
**Related content**
- [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/)

Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka.

Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. (A minimal sketch appears after the ksqlDB entry below.)
**Related content**
- [Kafka Streams](/platform/current/streams/overview.html)

Kafka topic : See *topic*.

key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data.
**Related terms**: *data encryption key (DEK)*, *envelope encryption*

Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications. Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost-effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources.
**Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)*
**Related content**
- [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/)
- [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf)

KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol introduced in Kafka 2.8 to provide metadata management for Kafka with the goal of replacing ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html).

ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka.
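To make the Kafka Streams and ksqlDB entries above more concrete, the following is a minimal sketch of a Kafka Streams topology in Java. The topic names (`orders`, `large-orders`), the bootstrap address, and the filter predicate are illustrative assumptions, not values from this documentation; the same filtering logic could also be expressed declaratively in ksqlDB with a `CREATE STREAM ... AS SELECT` statement.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class LargeOrderFilter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "large-order-filter"); // also used as the consumer group id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Read the hypothetical "orders" topic, keep a subset of records, and write them to "large-orders".
        KStream<String, String> orders = builder.stream("orders");
        orders.filter((orderId, value) -> value != null && value.length() > 100) // placeholder predicate
              .to("large-orders");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Running a sketch like this requires the `kafka-streams` library on the classpath and the input topic to exist; a real application would deserialize the order value and filter on its fields rather than on string length.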
logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight).
**Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)*
**Related content**
- [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html)
- [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html)

multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones.

multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants. Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage.
**Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)*
**Related content**
- [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/)
- [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html)

offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time.
**Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability*

offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume. For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic.
**Related terms**: *consumer offset*, *offset*

OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL).

parent cluster : The Kafka cluster that a resource belongs to.
**Related terms**: *Kafka cluster*

partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, each of which is stored on separate Kafka brokers and can be consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log.

partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that define the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKU. To determine eCKU limits for partitions, Confluent Cloud bills only for pre-replication (leader partitions) across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit. Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low.

physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients.

principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool.
**Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account*

private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources.

processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations.

producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer.

Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster.
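As a minimal illustration of the producer and Producer API entries above, the following Java sketch sends a single record with the standard `KafkaProducer` client. The bootstrap address, the `orders` topic, and the record contents are illustrative assumptions.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The record key influences which partition of the topic the record is appended to.
            ProducerRecord<String, String> record =
                new ProducerRecord<>("orders", "order-42", "{\"amount_usd\": 99.95}");
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace();
                } else {
                    System.out.printf("Wrote to %s-%d at offset %d%n",
                        metadata.topic(), metadata.partition(), metadata.offset());
                }
            });
            producer.flush(); // block until outstanding records are acknowledged
        }
    }
}
```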
Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage.
**Related terms**: *data serialization*, *deserializer*, *serializer*
**Related content**
- [Protocol Buffers](https://protobuf.dev/)
- [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/)
- Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html)
- Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html)

public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other.

rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur if a consumer has failed its heartbeats and has been excluded from the group, has voluntarily left the group, has had its metadata updated, or has joined the group.

replayability : Replayability is the ability to replay messages from any point in time.
**Related terms**: *consumer offset*, *offset*, *offset commit*

replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility.

replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster.

requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, adjust consumer client batching configurations, and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster.

role : A role is a Confluent-defined job function that is assigned the set of permissions required to perform specific actions or operations on Confluent resources. A role is granted to a principal on Confluent resources through a role binding, and can be assigned to a user account, group mapping, service account, or identity pool.
**Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account*
**Related content**
- [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html)
- [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html)

rolling restart : A rolling restart restarts the brokers in a Kafka cluster with zero downtime by incrementally restarting a Kafka broker after verifying that there are no under-replicated partitions on the broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime.
**Related content**
- [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart)

schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform.
Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in the Schema Registry.

Schema Registry : Schema Registry is a centralized repository for managing and validating schemas for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. Schema Registry is a RESTful service that stores and manages schemas for Kafka topics and is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers of Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance.
**Related content**
- Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html)
- Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html)

schema subject : A schema subject is the namespace for a schema in Schema Registry. This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution.
**Related content**
- Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html)
- Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html)
- [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/)

Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats.
**Related content**
- Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html)
- Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html)
- [Serde](https://serde.rs/)

serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats.
**Related content**
- Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html)
- Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html)

service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in.
**Related content**
- [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html)

service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to.
**Related content**
- [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html)

single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in real-time on an individual message that changes the values, keys, or headers of a message before being sent to a sink connector or after being read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments.

single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services.
**Related terms**: *authentication*, *group mapping*, *identity provider*
**Related content**
- [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html)

sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system.

source connector : A source connector is a Kafka Connect connector that subscribes to (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics.

standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance.

Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality.

static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet.
**Related content**
- [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html)
- [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html)

storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster.
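As a hedged illustration of setting `retention.bytes` and `retention.ms` at the topic level (mentioned in the storage entry above), the following Java sketch creates a topic with those configs through the Kafka Admin API. The topic name, partition count, replication factor, and retention values are illustrative assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicWithRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

        try (Admin admin = Admin.create(props)) {
            // 6 partitions, replication factor 3; retain at most ~1 GiB or 7 days of data per partition.
            NewTopic topic = new NewTopic("clickstream", 6, (short) 3)
                .configs(Map.of(
                    "retention.bytes", "1073741824",
                    "retention.ms", "604800000"));
            admin.createTopics(List.of(topic)).all().get(); // block until the topic is created
        }
    }
}
```

The same settings can also be changed on an existing topic through the Admin API's alter-config operations, the Confluent CLI, or Cloud Console; the values above are examples only.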
Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets.
**Related content**
- [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html)
- [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/)

Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more.
**Related terms**: *Stream Catalog*, *Stream Lineage*
**Related content**
- [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html)

stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security.
**Related terms**: *Data Portal*, *Stream Governance*
**Related content**
- [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html)

stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB.

Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines.
**Related content**
- [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html)

throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from reaching an over-utilized state. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%.
At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts.

topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records to which new records are continually appended.
**Related content**
- [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html)

total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client. For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit, for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom.

Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security).

unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams.
**Related terms**: *bounded stream*, *stream processing*

under replication : Under replication is a situation in which the number of in-sync replicas is below the number of all replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag).

user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources.
**Related content**
- [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html)

watermark : A watermark in Flink is a marker that keeps track of time as data is processed.
A watermark means that all records until the current moment in time have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows.
**Related content**
- [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html)
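To make the watermark idea concrete, here is a small, self-contained Java sketch of the underlying bookkeeping (a toy model, not the Flink API): the watermark trails the maximum event time seen so far by an allowed lateness, and a time window can be finalized once the watermark passes its end. All names and values are illustrative assumptions.

```java
import java.time.Duration;
import java.time.Instant;

/** Toy watermark tracker; illustrates the concept only, not Flink's API. */
public class WatermarkSketch {
    private Instant maxEventTime = Instant.EPOCH;
    private final Duration allowedLateness = Duration.ofSeconds(5); // assumed bound on out-of-orderness

    /** Call for every record as it arrives, with the record's event time. */
    public void observe(Instant eventTime) {
        if (eventTime.isAfter(maxEventTime)) {
            maxEventTime = eventTime;
        }
    }

    /** The watermark trails the largest event time seen so far. */
    public Instant currentWatermark() {
        return maxEventTime.minus(allowedLateness);
    }

    /** A window ending at windowEnd can be finalized once the watermark has passed it. */
    public boolean canCloseWindow(Instant windowEnd) {
        return !currentWatermark().isBefore(windowEnd);
    }
}
```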
Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), compact binary data format, a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers. You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases. batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html) Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. 
**Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. **Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through declarative API. Confluent Platform : Confluent Platform is a specialized distribution of Kafka at its core, with additional components for data integration, streaming data pipelines, and stream processing. Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to an Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)* Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provide preallocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. 
A worker is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused. connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : A connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay IDs, lists of files, timestamps, and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as they process data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics. It enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. By dividing a topic's partitions among the consumers in the group, the consumers can process messages in parallel, increasing message throughput and enabling load balancing. **Related terms**: *consumer*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is, the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : A consumer offset is the monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in a Kafka topic, allowing consumers to resume processing from where they left off.
Offsets are stored on the Kafka brokers, but the brokers do not track which records a consumer has read; it is up to each consumer to track its own position. When a consumer acknowledges receiving and processing a message, it commits an offset value, which is stored in the special internal topic `__consumer_offsets` (see the consumer sketch after the *data mapping* entry below). cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization. **Related terms**: *identity pool*, *principal*, *role*, *role binding* CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). **Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *envelope encryption*, *key encryption key (KEK)* data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight. data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers). data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage. data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined.
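The *consumer group*, *consumer offset*, and *offset commit* entries above describe how consumers track their position in a topic. The following minimal sketch, which assumes the `confluent_kafka` Python client and uses illustrative broker, topic, and group names, shows a consumer that commits the offset of each record after processing it:

```python
# Minimal sketch, assuming the confluent_kafka Python client; the broker
# address, topic name, and group id below are illustrative placeholders.
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "orders-readers",      # consumers sharing this id form one consumer group
    "auto.offset.reset": "earliest",   # where to start when no committed offset exists
    "enable.auto.commit": False,       # commit offsets explicitly after processing
})
consumer.subscribe(["orders"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        # Process the record, then commit its offset; the broker stores the
        # committed offset in the __consumer_offsets internal topic.
        print(f"partition={msg.partition()} offset={msg.offset()} value={msg.value()}")
        consumer.commit(message=msg)
finally:
    consumer.close()
```

With `enable.auto.commit` turned off, a crash before `commit()` causes the record to be redelivered on restart, which is the at-least-once behavior described under *commit log*.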
data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically composed of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation. Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. Data Portal provides a data-centric view of Confluent optimized for self-service access to data, where users can search, discover, and understand available data, request access to data, and use data. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html) data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication. Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security. data stream : A data stream is a continuous flow of data records that are produced and consumed by applications. dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed. Instead of stopping, the sink connector sends messages that could not be written successfully to the DLQ topic as event records and continues processing messages. Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements. deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats.
**Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these types of clusters. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it. ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation. Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities. envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data. The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)* ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system. event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event.
The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream). **Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time* event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time* event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time* event streaming platform : An event streaming platform is a platform to which events can be written once, allowing distributed functions within an organization to react in real time. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time* event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform* exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent. Even if a producer retries sending a message, or a consumer retries processing a message, the message is delivered exactly once. This guarantee is achieved through idempotent producers, where the broker uses a producer ID and per-partition sequence numbers to discard duplicate writes, and through transactions, which commit a consumer's offsets and its produced results atomically. The consumer offset is committed only after the message is processed; if the consumer fails to process the message, the message is redelivered and processed again. Freight Kafka cluster : A Confluent Cloud cluster type.
Freight Kafka clusters are designed for high-throughput, relaxed-latency workloads that are less expensive than self-managed open source Kafka. granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access. group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html) identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud. identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials. Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted. If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources: you only pay for what you produce to Confluent Cloud and for storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/) ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB).
To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster. internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`. JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents. It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [JSON Schema - a declarative language that allows you to annotate and validate JSON documents.](https://json-schema.org/) - Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-json.html) - Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html) Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client connects to first when initiating a connection to a Kafka cluster; the broker returns metadata that includes the addresses of all of the brokers in the Kafka cluster. Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint. Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker. Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages (a minimal producer sketch appears after the *Kafka cluster* entry below). The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/) Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance.
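To make the *Kafka bootstrap server*, *Kafka client*, and *Kafka cluster* entries above concrete, here is a minimal producer sketch. It assumes the `confluent_kafka` Python client; the broker addresses and topic name are illustrative placeholders:

```python
# Minimal sketch, assuming the confluent_kafka Python client; the broker
# addresses and topic name are illustrative placeholders.
from confluent_kafka import Producer

producer = Producer({
    # One reachable broker is enough to bootstrap; the client then discovers
    # the rest of the cluster from the metadata that broker returns.
    "bootstrap.servers": "broker-1:9092,broker-2:9092",
})

def on_delivery(err, msg):
    # Called once the broker acknowledges (or rejects) the message.
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [{msg.partition()}] at offset {msg.offset()}")

producer.produce("orders", key="order-42", value='{"amount": 9.99}', on_delivery=on_delivery)
producer.flush()  # wait for outstanding deliveries before exiting
```

Listing more than one broker is optional, but it protects the initial bootstrap step if a single broker happens to be unavailable.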
Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the business of writing code away from the user and instead requires only JSON configuration to run. **Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html) Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to in order to communicate with clients. For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/) Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka. Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Kafka Streams](/platform/current/streams/overview.html) Kafka topic : See *topic*. key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *data encryption key (DEK)*, *envelope encryption*. Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications.
Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost-effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources. **Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf) KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol introduced in Kafka 2.8 to provide metadata management for Kafka, with the goal of replacing ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka. logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones. multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants. Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** - [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time.
**Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability* offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume. For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset* OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL). parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster* partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, each of which is stored on a separate Kafka broker and can be consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log. partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that defines the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKU. To determine eCKU limits for partitions, Confluent Cloud counts only pre-replication (leader) partitions across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit. Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low. physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers.
Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients. principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account* private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources. processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations. producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer. Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster. Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html) public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other. rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur if a consumer has failed to send heartbeats and has been excluded from the group, a consumer has voluntarily left the group, metadata has been updated for a consumer, or a new consumer has joined the group. replayability : Replayability is the ability to replay messages from any point in time. **Related terms**: *consumer offset*, *offset*, *offset commit* replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility. replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster. requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients.
For Dedicated clusters, a high number of requests per second results in increased load on the cluster. role : A role is a Confluent-defined job function that is granted a set of permissions required to perform specific actions or operations on Confluent resources, and is bound to a principal and a set of Confluent resources through a role binding. A role can be assigned to a user account, group mapping, service account, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html) rolling restart : A rolling restart restarts the brokers in a Kafka cluster with zero downtime by incrementally restarting each Kafka broker, verifying that there are no under-replicated partitions on the broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart) schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform. Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in the Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating schemas for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. The Schema Registry is a RESTful service that stores and manages schemas for Kafka topics. The Schema Registry is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers of Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry. This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire.
Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats (a serialization sketch appears after the *Standard Kafka cluster* entry below). **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. **Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html) service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in real-time to an individual message that changes the values, keys, or headers of the message before it is sent to a sink connector or after it is read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments. single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system. source connector : A source connector is a Kafka Connect connector that subscribes (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality.
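The *schema*, *Schema Registry*, *schema subject*, and *serializer* entries above come together roughly as in the following sketch. It assumes the `confluent_kafka` Python client with its Schema Registry support installed; the registry URL, topic, and schema are illustrative:

```python
# Minimal sketch, assuming the confluent_kafka Python client with Schema
# Registry support (and fastavro) installed; the URL, topic, and schema are
# illustrative placeholders.
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import SerializationContext, MessageField

schema_str = """
{
  "type": "record",
  "name": "Payment",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "amount", "type": "double"}
  ]
}
"""

registry = SchemaRegistryClient({"url": "http://localhost:8081"})
avro_serializer = AvroSerializer(registry, schema_str)

# Serialize a value destined for the "payments" topic; under the default
# subject naming strategy the schema is registered under "payments-value".
payload = avro_serializer(
    {"id": "p-1", "amount": 9.99},
    SerializationContext("payments", MessageField.VALUE),
)
print(f"{len(payload)} serialized bytes")
```

The returned bytes embed the registered schema ID, so any consumer with access to the same registry can look up the schema and deserialize the record.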
static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet. **Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. 
**Related terms**: *Data Portal*, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from reaching an over-utilized state. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%. At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records that are continually appended to the topic (a topic-creation sketch appears after the *total client connections* entry below). **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client. For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit, for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom.
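The *partition*, *replication factor*, and *topic* entries, together with the `retention.bytes` and `retention.ms` settings mentioned under *storage (pre-replication)*, map onto topic creation roughly as in the following sketch. It assumes the `confluent_kafka` Python client; the topic name, counts, and settings are illustrative:

```python
# Minimal sketch, assuming the confluent_kafka Python client and a reachable
# cluster; the topic name, counts, and retention settings are illustrative.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

topic = NewTopic(
    "clickstream",
    num_partitions=6,        # unit of parallelism; consumers in a group divide these
    replication_factor=3,    # copies of each partition spread across brokers
    config={
        "retention.ms": str(7 * 24 * 60 * 60 * 1000),  # keep records for 7 days
        "retention.bytes": "-1",                       # no size-based limit
    },
)

# create_topics() is asynchronous and returns a dict of topic name -> future.
for name, future in admin.create_topics([topic]).items():
    try:
        future.result()
        print(f"Created topic {name}")
    except Exception as exc:
        print(f"Failed to create {name}: {exc}")
```

Partition count can later be increased (but not decreased), and `retention.*` settings can be changed through the same Admin interface, as noted under *partitions (pre-replication)*.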
Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network (a client configuration sketch using TLS appears at the end of this glossary). For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing* under replication : Under replication is a situation when the number of in-sync replicas is below the number of all replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed. A watermark means that all records until the current moment in time have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html)
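As a closing illustration of the *Transport Layer Security (TLS)*, *authentication*, and related entries, the following sketch shows one way a client might be configured to connect over an encrypted, authenticated channel. It assumes the `confluent_kafka` Python client; the endpoint and credentials are placeholders in the style commonly used for Confluent Cloud:

```python
# Minimal sketch, assuming the confluent_kafka Python client; the endpoint
# and credentials are illustrative placeholders.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "pkc-xxxxx.us-east-1.aws.confluent.cloud:9092",
    "security.protocol": "SASL_SSL",   # encrypt traffic with TLS
    "sasl.mechanisms": "PLAIN",        # authenticate with an API key and secret
    "sasl.username": "<api-key>",
    "sasl.password": "<api-secret>",
})

producer.produce("orders", value=b"hello, secured cluster")
producer.flush()
```

`SASL_SSL` encrypts the connection with TLS while SASL/PLAIN carries the credentials; other SASL mechanisms or mutual TLS can be substituted depending on how the cluster is configured.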
**Related terms**: *bounded stream*, *data stream*, *stream processing*, *unbounded stream* **Related content** - [Apache Flink: Stream Processing and SQL on Confluent Cloud](/cloud/current/flink/index.html#) - [What is Apache Flink?](https://www.confluent.io/learn/apache-flink/) - [Apache Flink 101 (Confluent Developer course)](https://developer.confluent.io/courses/apache-flink/intro/) Apache Kafka : Apache Kafka is an open source event streaming platform that provides a unified, high-throughput, low-latency, fault-tolerant, scalable, distributed, and secure data streaming platform. Kafka is a publish-and-subscribe messaging system that enables distributed applications to ingest, process, and share data in real-time. **Related content** - [Introduction to Kafka](/kafka/introduction.html) audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occurs. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance, by tracking administrative activity, data access and modification, monitoring sign-in attempts, and reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html) auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance. When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record. **Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html) authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), compact binary data format, a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers. 
You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases. batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html) Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. 
**Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through a declarative API. Confluent Platform : Confluent Platform is a specialized distribution of Kafka at its core, with additional components for data integration, streaming data pipelines, and stream processing. Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)* Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provides preallocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. A worker is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused.
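As a rough illustration of the Confluent REST Proxy entry above, the following sketch produces a record over HTTP using the REST Proxy v2 produce endpoint and the Python `requests` package; the host, port, topic, and payload are placeholder values, not part of this glossary.

```python
import requests

# Hypothetical REST Proxy endpoint and topic.
url = "http://localhost:8082/topics/orders"
headers = {
    "Content-Type": "application/vnd.kafka.json.v2+json",
    "Accept": "application/vnd.kafka.v2+json",
}
body = {"records": [{"key": "o-123", "value": {"amount": 19.99}}]}

# Produce over HTTP instead of the native Kafka protocol.
resp = requests.post(url, json=body, headers=headers)
resp.raise_for_status()
print(resp.json())  # includes the offsets of the written records
```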
connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : Connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay ids, lists of files, timestamps and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as it processes data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. By dividing topics among consumers in the group into partitions, consumers in the group can process messages in parallel, increasing message throughput and enabling load balancing. **Related terms**: *consumer*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is, the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : Consumer offset is the monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, which does not track which records have been read and which have not. It is up to the consumer to track this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__consumer_offsets`.
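The consumer, consumer group, consumer offset, and consumer lag entries above can be illustrated with a minimal Python `confluent-kafka` consumer; the broker address, group ID, and topic are placeholders, and the lag calculation is only a rough per-partition estimate.

```python
from confluent_kafka import Consumer, TopicPartition

# Hypothetical consumer that is part of the "billing" consumer group.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "billing",
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,   # commit offsets explicitly after processing
})
consumer.subscribe(["orders"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            raise Exception(msg.error())

        print(msg.key(), msg.value())                      # placeholder "processing"
        consumer.commit(message=msg, asynchronous=False)   # committed offset = msg.offset() + 1

        # Rough consumer lag for this partition:
        # latest offset in the partition minus the next offset to consume.
        _, high = consumer.get_watermark_offsets(TopicPartition(msg.topic(), msg.partition()))
        print("lag:", high - (msg.offset() + 1))
finally:
    consumer.close()
```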
cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization. **Related terms**: *identity pool*, *principal*, *role*, *role binding* CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). **Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *envelope encryption*, *key encryption key (KEK)* data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight. data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers). data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage. data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined. data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline typically comprises a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation.
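As a toy sketch of how a data encryption key (DEK) is wrapped by a key encryption key (KEK), the following example uses the `cryptography` package's Fernet primitive; in a real CSFLE setup the KEK lives in a KMS and is never stored with the data, so treat the key handling here as purely illustrative.

```python
from cryptography.fernet import Fernet

# Key encryption key (KEK): in practice held in a KMS, shown locally only for illustration.
kek = Fernet(Fernet.generate_key())

# Data encryption key (DEK): generated for the sensitive field.
dek_bytes = Fernet.generate_key()
dek = Fernet(dek_bytes)

plaintext = b"4111-1111-1111-1111"          # sensitive field value
encrypted_data = dek.encrypt(plaintext)      # encrypt the data with the DEK
encrypted_dek = kek.encrypt(dek_bytes)       # encrypt (wrap) the DEK with the KEK

# The encrypted DEK travels alongside the encrypted data.
stored = {"ciphertext": encrypted_data, "encrypted_dek": encrypted_dek}

# Only a client with access to the KEK can unwrap the DEK and decrypt the field.
recovered_dek = Fernet(kek.decrypt(stored["encrypted_dek"]))
assert recovered_dek.decrypt(stored["ciphertext"]) == plaintext
```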
Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide a data-centric, self-service view of Confluent throughout the Confluent Cloud Console. Data practitioners can search and discover existing topics using tags and business metadata, request access to topics and data, and use data in topics to build streaming applications and data pipelines. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html) data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication. Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security. data stream : A data stream is a continuous flow of data records that are produced and consumed by applications. dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed. Instead of stopping, the sink connector sends messages that could not be written successfully as event records to the DLQ topic and continues processing messages. Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements. deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster.
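To make the data serialization and deserializer entries above (and the earlier Avro entry) concrete, the sketch below registers an Avro schema and serializes one record with the Python `confluent-kafka` Schema Registry client; the Schema Registry URL, topic, and field names are hypothetical.

```python
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import SerializationContext, MessageField

# Hypothetical Avro schema: every field is typed and documented.
order_schema = """
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "order_id", "type": "string"},
    {"name": "amount",   "type": "double"}
  ]
}
"""

sr_client = SchemaRegistryClient({"url": "http://localhost:8081"})
serializer = AvroSerializer(sr_client, order_schema)

# Serialize a record destined for the "orders" topic. The schema ID is embedded in the
# payload, so consumers fetch the schema from Schema Registry instead of receiving it
# with every message.
payload = serializer({"order_id": "o-123", "amount": 19.99},
                     SerializationContext("orders", MessageField.VALUE))
print(len(payload), "bytes")
```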
Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these types of clusters. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it. ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation. Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities. envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data. The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The encrypted DEK and the encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)* ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system. event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream).
**Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time* event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time* event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time* event streaming platform : An event streaming platform is a platform to which events can be written once, allowing distributed functions within an organization to react in real time. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time* event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform* exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent. Even if a producer retries sending a message, or a consumer retries processing a message, the message is delivered and processed exactly once. This guarantee is achieved with idempotent producers, which attach a producer ID and sequence number to each message so that brokers can discard duplicate writes, and with transactions, which commit the consumer offset together with the produced results only after the message has been fully processed. If processing fails before the transaction commits, the message is reprocessed without producing duplicate results. Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed latency workloads that are less expensive than self-managed open source Kafka. granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access.
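A hedged sketch of the transactional consume-transform-produce pattern behind the exactly-once semantics entry above, using the Python `confluent-kafka` client; the topics, transactional ID, and error handling are simplified placeholders rather than a production recipe.

```python
from confluent_kafka import Consumer, Producer

# Transactional consume-transform-produce loop with placeholder names.
consumer = Consumer({"bootstrap.servers": "localhost:9092",
                     "group.id": "pricing",
                     "isolation.level": "read_committed",
                     "enable.auto.commit": False})
producer = Producer({"bootstrap.servers": "localhost:9092",
                     "transactional.id": "pricing-app-1"})

consumer.subscribe(["orders"])
producer.init_transactions()

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    producer.begin_transaction()
    try:
        producer.produce("priced-orders", value=msg.value())
        # Commit the consumed offsets inside the same transaction so the read
        # and the write succeed or fail together.
        producer.send_offsets_to_transaction(
            consumer.position(consumer.assignment()),
            consumer.consumer_group_metadata())
        producer.commit_transaction()
    except Exception:
        producer.abort_transaction()
```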
group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html) identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud. identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials. Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted. If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources, you only pay for what you produce to Confluent Cloud and for storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/) ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster. internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`.
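The internal topic entry above can be observed directly from cluster metadata; the following sketch lists topic names with the Python AdminClient and flags the ones with the double-underscore prefix. The broker address is a placeholder.

```python
from confluent_kafka.admin import AdminClient

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# Cluster metadata includes every topic the broker reports to this client.
metadata = admin.list_topics(timeout=10)

for name in sorted(metadata.topics):
    kind = "internal" if name.startswith("__") else "user"
    print(f"{kind:8} {name}")
```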
JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents. It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [JSON Schema - a declarative language that allows you to annotate and validate JSON documents.](https://json-schema.org/) - Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-json.html) - Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html) Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client connects to when it initiates a connection to a Kafka cluster; the broker returns metadata that includes the addresses of all the brokers in the Kafka cluster. Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint. Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker. Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages. The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/) Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance. Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the coding work away from the user and instead requires only JSON configuration to run. **Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html)
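As an illustration of the Kafka Connect entry above, the sketch below submits a connector configuration to a self-managed Connect worker's REST API; the worker URL and the FileStream example settings are placeholder values.

```python
import requests

# A FileStreamSource connector that tails a file into a topic (illustrative values).
connector = {
    "name": "file-source-demo",
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/tmp/input.txt",
        "topic": "connect-demo",
    },
}

# Connect workers expose a REST API (port 8083 by default) for managing connectors.
resp = requests.post("http://localhost:8083/connectors", json=connector)
resp.raise_for_status()
print(resp.json()["name"], "created")
```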
Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to and use to communicate with clients. For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/) Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka. Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Kafka Streams](/platform/current/streams/overview.html) Kafka topic : See *topic*. key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *data encryption key (DEK)*, *envelope encryption* Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications. Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources.
**Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf) KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol introduced in Apache Kafka 2.8 to provide metadata management for Kafka with the goal of replacing ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka. logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones. multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants. Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** * [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) * [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time. **Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability* offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume.
For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset* OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL). parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster* partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, each of which is stored on separate Kafka brokers and can be consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log. partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that define the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKU. To determine eCKU limits for partitions, Confluent Cloud counts only pre-replication (leader) partitions across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit. Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low. physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients. principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account*
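The partition entry above can be illustrated by creating a topic with an explicit partition count and replication factor through the Python AdminClient; the broker address, topic name, and sizing are placeholders.

```python
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# Six partitions allow up to six consumers in one group to read in parallel;
# each partition is replicated to three brokers.
futures = admin.create_topics([NewTopic("orders", num_partitions=6, replication_factor=3)])

for topic, future in futures.items():
    future.result()          # raises if creation failed (for example, a partition limit was reached)
    print(f"created {topic}")
```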
private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources. processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations. producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer. Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster. Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html) public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other. rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur if a consumer has failed the heartbeat and has been excluded from the group, it voluntarily left the group, metadata has been updated for a consumer, or a consumer has joined the group. replayability : Replayability is the ability to replay messages from any point in time. **Related terms**: *consumer offset*, *offset*, *offset commit* replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility. replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster. requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster. role : A role is a Confluent-defined job function that is granted a set of permissions required to perform specific actions or operations on Confluent resources. A role binding binds a role to a principal and a set of Confluent resources. A role can be assigned to a user account, group mapping, service account, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html)
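A minimal sketch of the producer and Producer API entries above, using the Python `confluent-kafka` client with a record key and a delivery callback; the broker, topic, and payload are placeholders.

```python
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def on_delivery(err, msg):
    # Called once per message after the broker acknowledges (or rejects) the write.
    if err is not None:
        print("delivery failed:", err)
    else:
        print(f"delivered to {msg.topic()} [{msg.partition()}] @ {msg.offset()}")

# The key determines the partition; records with the same key land in the same
# partition and therefore keep their relative order.
producer.produce("orders", key="customer-42", value=b'{"amount": 19.99}',
                 on_delivery=on_delivery)
producer.flush()
```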
rolling restart : A rolling restart restarts the brokers in a Kafka cluster with zero downtime by incrementally restarting a Kafka broker after verifying that there are no under-replicated partitions on the broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart) schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform. Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in the Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating schemas for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. Schema Registry is a RESTful service that stores and manages schemas for Kafka topics, and it is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers of Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry. This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats.
**Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. **Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html) service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in realtime on an individual message that changes the values, keys, or headers of a message before being sent to a sink connector or after being read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments. single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system. source connector : A source connector is a Kafka Connect connector that subscribes (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality. static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet. 
**Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. 
**Related terms**: *Data Portal*, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from getting into an over-utilized state. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%. At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records that are continually appended to a topic. **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client. For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom.
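The stream processing entry above describes a consume-transform-produce pattern. Kafka Streams and ksqlDB provide this with richer operations (joins, windows, state), but the basic shape can be sketched with a plain Python client as below; the topics and the filtering rule are hypothetical.

```python
from confluent_kafka import Consumer, Producer

consumer = Consumer({"bootstrap.servers": "localhost:9092",
                     "group.id": "order-filter",
                     "auto.offset.reset": "earliest"})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["orders"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    value = msg.value().decode("utf-8")
    # Transform step: keep only records that mention "priority" and tag them.
    if "priority" in value:
        producer.produce("priority-orders", key=msg.key(), value=value.upper())
        producer.poll(0)   # serve delivery callbacks
```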
Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. A minimal client configuration example follows this group of entries. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing* under replication : Under replication is a situation in which the number of in-sync replicas is below the total number of replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed. A watermark means that all records up to the current moment in time have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html) Admin API : The Admin API is the Kafka REST API that enables administrators to manage and monitor Kafka clusters, topics, brokers, and other Kafka components. Ansible Playbooks for Confluent Platform : Ansible Playbooks for Confluent Platform is a set of Ansible playbooks and roles that are designed to automate the deployment and management of Confluent Platform. Apache Flink : Apache Flink is an open source stream processing framework for stateful computations over unbounded and bounded data streams. Flink provides a unified API for batch and stream processing that supports event-time and out-of-order processing, and supports exactly-once semantics. Flink applications include real-time analytics, data pipelines, and event-driven applications. **Related terms**: *bounded stream*, *data stream*, *stream processing*, *unbounded stream* **Related content** - [Apache Flink: Stream Processing and SQL on Confluent Cloud](/cloud/current/flink/index.html) - [What is Apache Flink?](https://www.confluent.io/learn/apache-flink/) - [Apache Flink 101 (Confluent Developer course)](https://developer.confluent.io/courses/apache-flink/intro/)
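As an illustration of the TLS entry above, here is a minimal sketch of the TLS-related properties a Java Kafka client might use. The broker address, port, truststore path, and password are placeholder assumptions, and a real deployment would typically layer authentication (for example, SASL_SSL) on top of encryption.

```java
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;

public class TlsClientConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1.example.com:9093");              // hypothetical TLS listener
        props.put("security.protocol", "SSL");                                   // encrypt traffic with TLS
        props.put("ssl.truststore.location", "/etc/kafka/client.truststore.jks"); // placeholder truststore path
        props.put("ssl.truststore.password", "changeit");                        // placeholder password

        try (AdminClient admin = AdminClient.create(props)) {
            // Listing topics over the encrypted connection confirms the TLS handshake succeeds.
            System.out.println(admin.listTopics().names().get());
        }
    }
}
```

Note that the Admin API entry above refers to the REST-based administrative API; the Java AdminClient is used here only as a convenient way to exercise a TLS connection.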
Apache Kafka : Apache Kafka is an open source event streaming platform that provides unified, high-throughput, low-latency, fault-tolerant, scalable, distributed, and secure data streaming. Kafka is a publish-and-subscribe messaging system that enables distributed applications to ingest, process, and share data in real-time. **Related content** - [Introduction to Kafka](/kafka/introduction.html) audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occur. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance by tracking administrative activity, data access and modification, monitoring sign-in attempts, and reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html) auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance. When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record. **Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html) authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides rich data structures, remote procedure call (RPC), a compact binary data format, and a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers.
You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases. batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html) Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. 
**Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through a declarative API. Confluent Platform : Confluent Platform is a specialized distribution with Kafka at its core, plus additional components for data integration, streaming data pipelines, and stream processing. Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self-Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)* Confluent Unit for Kafka (CKU) : A Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud. CKUs provide preallocated resources and determine the capacity of a Dedicated Kafka cluster. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. A worker runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused.
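To illustrate the "use longer-lived connections" advice in the connection attempts entry, the following is a minimal sketch of an application that shares one long-lived KafkaProducer instead of creating a producer (and new TCP connections) per message. The class name and bootstrap address are placeholder assumptions.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SharedProducer {
    // One long-lived producer per application process: it holds its TCP connections open and
    // reuses them, instead of paying the connection and authentication cost on every send.
    private static final KafkaProducer<String, String> PRODUCER = createProducer();

    private static KafkaProducer<String, String> createProducer() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker endpoint
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        return new KafkaProducer<>(props);
    }

    public static void send(String topic, String key, String value) {
        PRODUCER.send(new ProducerRecord<>(topic, key, value));
    }
}
```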
connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : A connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay IDs, lists of files, timestamps, and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as it processes data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. A minimal consumer example follows this group of entries. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. Because the partitions of a topic are divided among the consumers in the group, the consumers can process messages in parallel, increasing message throughput and enabling load balancing. **Related terms**: *consumer*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is, the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : A consumer offset is the monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, but the broker does not track which records have been read and which have not; it is up to the consumer to commit this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__consumer_offsets`. cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization. **Related terms**: *identity pool*, *principal*, *role*, *role binding*
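The following is a minimal sketch of the consumer pattern described above: a consumer in a group subscribes to a topic, polls for records, and explicitly commits its offsets. The bootstrap address, group ID, and topic name are placeholder assumptions.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OrdersConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");            // assumed broker
        props.put("group.id", "orders-billing");                     // members of this group share the partitions
        props.put("enable.auto.commit", "false");                    // offsets are committed explicitly below
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));                   // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                consumer.commitSync();   // stores the group's position in __consumer_offsets
            }
        }
    }
}
```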
CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). **Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. A sketch of this envelope-encryption pattern follows this group of entries. **Related terms**: *envelope encryption*, *key encryption key (KEK)* data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight. data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers). data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage. data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined. data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically composed of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation.
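The DEK and KEK entries above describe the envelope-encryption pattern. The following is a generic, minimal sketch of that pattern using the standard Java Cryptography Extension. It is an illustration only, not Confluent's CSFLE implementation: it uses a symmetric AES KEK generated in the application for simplicity, whereas in practice the KEK lives in an external KMS and may be asymmetric.

```java
import java.security.SecureRandom;

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class EnvelopeEncryptionSketch {
    public static void main(String[] args) throws Exception {
        // Key encryption key (KEK): in practice this is held by a KMS, never stored with the data.
        SecretKey kek = KeyGenerator.getInstance("AES").generateKey();
        // Data encryption key (DEK): generated per record or per field.
        SecretKey dek = KeyGenerator.getInstance("AES").generateKey();

        // 1. Encrypt the sensitive payload with the DEK.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher dataCipher = Cipher.getInstance("AES/GCM/NoPadding");
        dataCipher.init(Cipher.ENCRYPT_MODE, dek, new GCMParameterSpec(128, iv));
        byte[] ciphertext = dataCipher.doFinal("ssn=123-45-6789".getBytes());

        // 2. Wrap (encrypt) the DEK with the KEK; the wrapped DEK is stored next to the ciphertext.
        Cipher wrapCipher = Cipher.getInstance("AESWrap");
        wrapCipher.init(Cipher.WRAP_MODE, kek);
        byte[] wrappedDek = wrapCipher.wrap(dek);

        System.out.printf("ciphertext=%d bytes, wrapped DEK=%d bytes%n",
                ciphertext.length, wrappedDek.length);
    }
}
```

Only a party with access to the KEK can unwrap the DEK and decrypt the payload, which is the access-control property the DEK entry describes.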
Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. Data Portal provides a data-centric view of Confluent optimized for self-service access, where users can search, discover, and understand available data, request access to data, and use data. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html) data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication. Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. A short serializer example follows this group of entries. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security. data stream : A data stream is a continuous flow of data records that are produced and consumed by applications. dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed. Instead of stopping, the sink connector sends the failed messages to the DLQ topic as event records and continues processing. Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements. deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster.
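As a concrete example of the data serialization and deserializer entries, the following is a minimal sketch of a producer that uses the Confluent Avro serializer together with Schema Registry. The schema, topic name, bootstrap address, and Schema Registry URL are placeholder assumptions; a Confluent Cloud cluster would also require authentication settings omitted here.

```java
import java.util.Properties;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProduceSketch {
    // Hypothetical Avro schema describing an order event.
    private static final String ORDER_SCHEMA =
        "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"string\"},"
        + "{\"name\":\"amount\",\"type\":\"double\"}]}";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");                          // assumed broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");                 // assumed Schema Registry

        Schema schema = new Schema.Parser().parse(ORDER_SCHEMA);
        GenericRecord order = new GenericData.Record(schema);
        order.put("id", "o-1001");
        order.put("amount", 42.50);

        try (KafkaProducer<String, Object> producer = new KafkaProducer<>(props)) {
            // The serializer registers (or looks up) the schema in Schema Registry and writes
            // only a schema ID plus the compact Avro-encoded payload to the topic.
            producer.send(new ProducerRecord<>("orders-avro", "o-1001", order));
        }
    }
}
```

A consumer would reverse the process with the matching Avro deserializer, which fetches the schema by ID and reconstructs the record.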
Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling, so there is no need to resize them; when you need more capacity, your cluster expands up to the fixed ceiling. If you're not using capacity above the minimum, you're not paying for it. ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation. Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities. envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data. The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)* ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system. event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream).
**Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time* event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time* event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time* event streaming platform : An event streaming platform is a platform to which events can be written once, allowing distributed functions within an organization to react in real time. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time* event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform* exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent, even if a producer retries sending the message or a consumer retries processing it. On the produce side, Kafka achieves this with idempotent producers: the broker uses a producer ID and per-partition sequence numbers to discard duplicate retries. End to end, transactions commit produced messages and consumer offsets atomically, so a consume-process-produce step either takes effect exactly once or not at all. A minimal transactional producer sketch follows this group of entries. Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed-latency workloads, at a lower cost than self-managed open source Kafka. granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access.
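The following is a minimal sketch of the transactional-producer side of exactly-once semantics. The topic names, transactional ID, and local broker address are placeholder assumptions, and a full end-to-end pipeline would also send consumer offsets to the transaction and use `read_committed` consumers.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TransactionalProduceSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");         // assumed broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("enable.idempotence", "true");                  // brokers discard duplicate retries
        props.put("transactional.id", "payments-processor-1");    // stable ID identifies this producer across restarts

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            try {
                producer.send(new ProducerRecord<>("payments", "p-1", "charged"));
                producer.send(new ProducerRecord<>("payment-audit", "p-1", "charged"));
                producer.commitTransaction();   // both records become visible atomically
            } catch (Exception e) {
                producer.abortTransaction();    // neither record is exposed to read_committed consumers
                throw e;
            }
        }
    }
}
```

This sketch simplifies error handling; production code distinguishes fatal producer exceptions (which require closing the producer) from retriable ones.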
group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html) identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud, as well as for Confluent Cloud service accounts that access those resources. identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials. Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted. If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources, you only pay for what you produce to Confluent Cloud and for storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/) ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster. internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`. JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents.
It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [JSON Schema - a declarative language that allows you to annotate and validate JSON documents](https://json-schema.org/) - Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-json.html) - Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html) Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client first connects to when initiating a connection to a Kafka cluster; the bootstrap broker returns metadata that includes the addresses of all of the brokers in the cluster. Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint. A minimal client configuration example follows this group of entries. Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker. Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages. The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/) Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance. Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the business of code away from the user and instead requires only JSON configuration to run. **Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html)
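Tying together the Kafka bootstrap server and Kafka client entries with the lz4 compression advice from the ingress and egress entries, here is a minimal sketch of producer configuration. The broker addresses and topic are placeholder assumptions.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CompressedProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Several brokers in the bootstrap list for fault tolerance; any one of them is enough
        // for the client to fetch metadata about the rest of the cluster.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("compression.type", "lz4");    // compress batches to reduce ingress, egress, and storage
        props.put("linger.ms", "20");            // small delay so records batch (and compress) together

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("clickstream", "user-42", "page_view"));
        }
    }
}
```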
Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to and use to communicate with clients. For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/) Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka. Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Kafka Streams](/platform/current/streams/overview.html) Kafka topic : See *topic*. key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *data encryption key (DEK)*, *envelope encryption*. Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications. Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources.
**Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf) KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol introduced in Kafka 2.8 to provide metadata management for Kafka with the goal to replace ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka. logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones. multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants. Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** - [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time. **Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability* offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors; it is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume.
For example, if a consumer has a committed offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset* OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL). parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster* partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, each of which is stored on separate Kafka brokers and can be consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log. partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that define the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKUs. To determine eCKU limits for partitions, Confluent Cloud counts only pre-replication (leader) partitions across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit. Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low; an example follows this group of entries. physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients. principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool. **Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account*
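As referenced in the partitions entry above, here is a minimal sketch of increasing a topic's partition count with the Kafka Admin client. The topic name, target count, and bootstrap address are placeholder assumptions, and a Confluent Cloud connection would also need authentication settings.

```java
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewPartitions;

public class IncreasePartitionsSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");    // assumed broker endpoint

        try (AdminClient admin = AdminClient.create(props)) {
            // Raise the partition count of the hypothetical "clickstream" topic to 12.
            // Partition counts can only grow, and adding partitions changes which partition
            // a given key maps to going forward.
            admin.createPartitions(Map.of("clickstream", NewPartitions.increaseTo(12))).all().get();
        }
    }
}
```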
private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources. processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations. producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer. Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster. Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html) public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other. rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur when a consumer fails its heartbeat and is excluded from the group, voluntarily leaves the group, has its metadata updated, or when a new consumer joins the group. replayability : Replayability is the ability to replay messages from any point in time. **Related terms**: *consumer offset*, *offset*, *offset commit* replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility. replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster. requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster. role : A role is a Confluent-defined job function that is granted a set of permissions required to perform specific actions or operations on Confluent resources. A role binding binds a role to a principal and a set of Confluent resources. A role can be assigned to a user account, group mapping, service account, or identity pool.
**Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html) rolling restart : A rolling restart restarts the brokers in a Kafka cluster with zero downtime by incrementally restarting a Kafka broker after verifying that there are no under-replicated partitions on the broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart) schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform. Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating the schemas used for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. It is a RESTful service that stores schemas and is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers of Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry. This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats.
**Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. **Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html) service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in real time to an individual message that changes the values, keys, or headers of the message before it is sent to a sink connector or after it is read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments (see the configuration example below). single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system. source connector : A source connector is a Kafka Connect connector that subscribes to (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality. static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet.
**Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. 
**Related terms**: *Data Portal*, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time and write the results to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines. (A minimal Streams topology sketch appears after the total client connections entry below.) **Related content** - [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from getting to an over-utilized state. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%. At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across the brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records to which new records are continually appended. **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client. For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom. 
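To make the Streams API entry above concrete, the following is a minimal sketch of a Kafka Streams topology that filters one event stream into another. The broker address `localhost:9092` and the topic names `orders` and `orders-eur` are hypothetical placeholders; this is an illustrative sketch, not a production configuration.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class StreamsApiSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-filter-app");   // hypothetical application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // hypothetical broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Read an incoming event stream, filter it, and write the result to another topic.
        KStream<String, String> orders = builder.stream("orders");             // hypothetical source topic
        orders.filter((key, value) -> value != null && value.contains("EUR"))
              .to("orders-eur");                                               // hypothetical target topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

The same builder supports the other operations named in the entry (grouping, aggregating, joining); only a filter is shown to keep the sketch short.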
Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing* under replication : Under replication is a situation in which the number of in-sync replicas is below the total number of replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed. A watermark indicates that all records up to the current moment in time have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html)

## HDFS 2 Source Connector Partitions

The connector comes out of the box with partitioners that support default partitioning based on Kafka partitions, field partitioning, and time-based partitioning in days or hours. You may implement your own partitioners by extending the Partitioner class. The following partitioners are available by default:

* **DefaultPartitioner**: To use `DefaultPartitioner`, configure `partition.class` as `io.confluent.connect.storage.partitioner.DefaultPartitioner`. This partitioner reads data from hadoop2 files that follow the default Kafka-partition-based directory and file layout.

Admin API : The Admin API is the Kafka REST API that enables administrators to manage and monitor Kafka clusters, topics, brokers, and other Kafka components. Ansible Playbooks for Confluent Platform : Ansible Playbooks for Confluent Platform is a set of Ansible playbooks and roles that are designed to automate the deployment and management of Confluent Platform. Apache Flink : Apache Flink is an open source stream processing framework for stateful computations over unbounded and bounded data streams. Flink provides a unified API for batch and stream processing that supports event-time and out-of-order processing, as well as exactly-once semantics. Flink applications include real-time analytics, data pipelines, and event-driven applications. 
**Related terms**: *bounded stream*, *data stream*, *stream processing*, *unbounded stream* **Related content** - [Apache Flink: Stream Processing and SQL on Confluent Cloud](/cloud/current/flink/index.html#) - [What is Apache Flink?](https://www.confluent.io/learn/apache-flink/) - [Apache Flink 101 (Confluent Developer course)](https://developer.confluent.io/courses/apache-flink/intro/) Apache Kafka : Apache Kafka is an open source, distributed event streaming platform that provides unified, high-throughput, low-latency, fault-tolerant, scalable, and secure data streaming. Kafka is a publish-and-subscribe messaging system that enables distributed applications to ingest, process, and share data in real-time. **Related content** - [Introduction to Kafka](/kafka/introduction.html) audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occur. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance by tracking administrative activity, data access and modification, monitoring sign-in attempts, and reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html) auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance. When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record. **Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html) authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), a compact binary data format, and a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers. 
You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases. batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html) Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. 
**Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through a declarative API. Confluent Platform : Confluent Platform is a specialized distribution with Kafka at its core and additional components for data integration, streaming data pipelines, and stream processing. Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. (A produce request sketch appears after the connection attempts entry below.) **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)* Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provides preallocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs connectors and performs the actual work of moving data in and out of Kafka topics. Workers run on hardware independent of the Kafka brokers themselves and are scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused. 
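As a companion to the Confluent REST Proxy entry above, the sketch below produces a record over HTTP using the REST Proxy v2 embedded-format API and Java's built-in `HttpClient`. The endpoint `http://localhost:8082`, the topic `orders`, and the record contents are hypothetical placeholders; this is a minimal sketch rather than a hardened client.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RestProxyProduceSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical local REST Proxy endpoint and topic name.
        String url = "http://localhost:8082/topics/orders";
        // v2 embedded-format payload: a list of records, each with an optional key and a value.
        String body = "{\"records\":[{\"key\":\"order-1\",\"value\":{\"item\":\"book\",\"qty\":2}}]}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Content-Type", "application/vnd.kafka.json.v2+json") // JSON embedded format
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The response reports the partition and offset assigned to each record, or an error code.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```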
connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : Connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay ids, lists of files, timestamps and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as they process data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. (A minimal consumer sketch appears after the data pipeline entry below.) **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. By dividing a topic's partitions among the consumers in the group, consumers can process messages in parallel, increasing message throughput and enabling load balancing. **Related terms**: *consumer*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is, the number of messages pending to be consumed from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : A consumer offset is a monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, which does not track which records have been read and which have not. It is up to the consumer to track this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__consumer_offsets`. cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization. 
**Related terms**: *identity pool*, *principal*, *role*, *role binding* CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). **Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current//sr/schemas-manage.html) data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *envelope encryption*, *key encryption key (KEK)* data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight. data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers). data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage. data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined. data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically composed of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation. 
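The sketch below, referenced from the consumer and consumer offset entries above, shows a minimal Java consumer that joins a consumer group, polls records, and commits offsets explicitly. The broker address, group ID, and topic name are hypothetical placeholders.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-readers");          // hypothetical consumer group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");         // commit offsets explicitly

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));                            // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                // The committed offset marks where this group resumes after a restart or rebalance.
                consumer.commitSync();
            }
        }
    }
}
```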
Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. It provides a data-centric view of Confluent optimized for self-service access, where users can search, discover, and understand available data, request access to data, and use data. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html) data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication. Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. (A serializer configuration sketch appears after the egress entry below.) **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security. data stream : A data stream is a continuous flow of data records that are produced and consumed by applications. dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed. Instead of stopping, the sink connector sends messages that could not be written successfully as event records to the DLQ topic while the sink connector continues processing messages. Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements. deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster. 
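To illustrate the data serialization and deserializer entries above, here is a hedged sketch of a producer configured with the Confluent Avro serializer and a Schema Registry URL. It assumes the `kafka-avro-serializer` and Apache Avro libraries are on the classpath; the broker address, Schema Registry URL, schema, and topic name are hypothetical placeholders.

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AvroSerializationSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");          // hypothetical broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "io.confluent.kafka.serializers.KafkaAvroSerializer");                  // Avro value serializer
        props.put("schema.registry.url", "http://localhost:8081");                      // hypothetical Schema Registry

        // A hypothetical Avro schema; in practice this is usually generated or loaded from a file.
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Order\",\"fields\":[{\"name\":\"item\",\"type\":\"string\"}]}");
        GenericRecord order = new GenericData.Record(schema);
        order.put("item", "book");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            // The serializer encodes the record as Avro and registers the schema with Schema Registry
            // before the bytes are written to the topic.
            producer.send(new ProducerRecord<>("orders", "order-1", order));            // hypothetical topic
        }
    }
}
```

A matching consumer would use the corresponding Avro deserializer class and the same Schema Registry URL to reconstruct the record.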
Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these types of clusters. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it. ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation. Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities. envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data. The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)* ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system. event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream). 
**Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time* event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time* event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time* event streaming platform : An event streaming platform is a platform that events can be written to once, allowing distributed functions within an organization to react in real time. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time* event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform* exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent. Even if a producer retries sending a message, or a consumer retries processing a message, the message is processed exactly once. This guarantee is achieved by combining idempotent producers, which attach a producer ID and sequence number to each message so the broker can discard duplicates caused by retries, with transactions, which commit consumer offsets atomically with the records that a processing step produces. If the consumer fails to process a message, the message is redelivered and processed again without the results being duplicated. (A transactional producer sketch appears after the granularity entry below.) Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed latency workloads that are less expensive than self-managed open source Kafka. granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access. 
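As a companion to the exactly-once semantics entry above, the sketch below configures an idempotent, transactional Java producer. The broker address, transactional ID, and topic are hypothetical placeholders, and a complete exactly-once pipeline would also send consumer offsets to the transaction; this is only a minimal illustration of the producer side.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ExactlyOnceProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // hypothetical broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");            // broker discards duplicate retries
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-producer-1"); // hypothetical transactional id

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("orders", "order-1", "{\"item\":\"book\"}")); // hypothetical topic
            // Either every record in the transaction becomes visible to consumers, or none do.
            producer.commitTransaction();
        }
    }
}
```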
group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html) identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud. identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials. Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted. If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources: you pay only for what you produce to Confluent Cloud and for the storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/) ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster. internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`. JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents. 
It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [JSON Schema - a declarative language that allows you to annotate and validate JSON documents.](https://json-schema.org/) - Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html) Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client contacts to initiate a connection to a Kafka cluster; the broker returns metadata that includes the addresses of all brokers in the Kafka cluster. Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint. Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker. Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages. The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/) Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance. Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the business of writing code away from the user and instead requires only JSON configuration to run. 
**Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html) Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to and use to communicate with clients. For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/) Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka. Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Kafka Streams](/platform/current/streams/overview.html) Kafka topic : See *topic*. key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *data encryption key (DEK)*, *envelope encryption*. Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications. Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources. 
**Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf) KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol, introduced through KIP-500, that provides metadata management in Kafka itself with the goal of replacing ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka. logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones. multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants. Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** * [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) * [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time. **Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability* offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume. 
For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset* OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL). parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster* partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, each of which can be stored on a different Kafka broker and consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log. partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that define the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKU. To determine eCKU limits for partitions, Confluent Cloud counts only pre-replication (leader) partitions across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit. Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low. physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster comprised of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients. principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool. 
**Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account* private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources. processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations. producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer. Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster. Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html) public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other. rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur when a consumer fails its heartbeat and is excluded from the group, voluntarily leaves the group, has its metadata updated, or when a new consumer joins the group. replayability : Replayability is the ability to replay messages from any point in time. **Related terms**: *consumer offset*, *offset*, *offset commit* replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility. replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster. requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster. role : A role is a Confluent-defined job function that is assigned a set of permissions required to perform specific actions or operations on Confluent resources. A role binding binds a role to a principal and to Confluent resources. A role can be assigned to a user account, group mapping, service account, or identity pool. 
**Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html) rolling restart : A rolling restart restarts the brokers in a Kafka cluster with zero downtime by incrementally restarting one Kafka broker at a time, verifying that there are no under-replicated partitions on the broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart) schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform. Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating schemas for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. Schema Registry is a RESTful service that stores and manages schemas for Kafka topics, and is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers of Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry. This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats.
**Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. **Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html) service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in real-time to an individual message that changes the values, keys, or headers of the message before it is sent to a sink connector or after it is read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments. single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system. source connector : A source connector is a Kafka Connect connector that subscribes (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality. static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet.
**Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. 
**Related terms**: *Data Portal*, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time and write the results to Kafka topics stored in a Kafka cluster (see the sketch following these entries). The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from reaching an over-utilized state. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%. At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across the brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records that are continually appended to a topic. **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client. For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom.
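To make the *stream processing* and *Streams API* entries above concrete, the following minimal sketch shows a Kafka Streams topology that reads one topic, filters records, and writes the result to another topic. The topic names (`orders`, `large-orders`), the bootstrap address, and the filter predicate are illustrative placeholders, not values taken from Confluent documentation.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class LargeOrderFilter {
    public static void main(String[] args) {
        Properties props = new Properties();
        // The application ID also names the consumer group and prefixes internal topics.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "large-order-filter");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder endpoint
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Continuously read the unbounded "orders" stream, keep a subset of records,
        // and publish the filtered stream to "large-orders".
        KStream<String, String> orders = builder.stream("orders");
        orders.filter((key, value) -> value != null && value.length() > 100) // placeholder predicate
              .to("large-orders");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Because the topology is declared once and then runs continuously, the same application scales out by starting more instances with the same application ID; Kafka rebalances the input topic's partitions across them.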
Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing* under replication : Under replication is a situation in which the number of in-sync replicas is below the number of all replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed. A watermark means that all records until the current moment in time have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html) ## HDFS 3 Source Connector Partitions The connector comes out of the box with partitioners that support default partitioning based on Kafka partitions, field partitioning, and time-based partitioning in days or hours. You may implement your own partitioners by extending the Partitioner class. The following partitioners are available by default: * **DefaultPartitioner** : To use `DefaultPartitioner`, set `partition.class` to `io.confluent.connect.storage.partitioner.DefaultPartitioner`. This partitioner reads data from HDFS 3 (Hadoop 3) files whose paths have the form `<topics directory>/<topic>/partition=<partition>/<topic>+<partition>+<start offset>+<end offset>.<format>`. Admin API : The Admin API is the Kafka REST API that enables administrators to manage and monitor Kafka clusters, topics, brokers, and other Kafka components. Ansible Playbooks for Confluent Platform : Ansible Playbooks for Confluent Platform is a set of Ansible playbooks and roles that are designed to automate the deployment and management of Confluent Platform. Apache Flink : Apache Flink is an open source stream processing framework for stateful computations over unbounded and bounded data streams. Flink provides a unified API for batch and stream processing that supports event-time and out-of-order processing, and supports exactly-once semantics. Flink applications include real-time analytics, data pipelines, and event-driven applications.
**Related terms**: *bounded stream*, *data stream*, *stream processing*, *unbounded stream* **Related content** - [Apache Flink: Stream Processing and SQL on Confluent Cloud](/cloud/current/flink/index.html) - [What is Apache Flink?](https://www.confluent.io/learn/apache-flink/) - [Apache Flink 101 (Confluent Developer course)](https://developer.confluent.io/courses/apache-flink/intro/) Apache Kafka : Apache Kafka is an open source event streaming platform that provides unified, high-throughput, low-latency, fault-tolerant, scalable, distributed, and secure data streaming. Kafka is a publish-and-subscribe messaging system that enables distributed applications to ingest, process, and share data in real-time. **Related content** - [Introduction to Kafka](/kafka/introduction.html) audit log : An audit log is a historical record of actions and operations that are triggered when auditable events occur. Audit log records can be used to troubleshoot system issues, manage security, and monitor compliance by tracking administrative activity, data access and modification, monitoring sign-in attempts, and reconstructing security breaches and fraudulent activity. **Related terms**: *auditable event* **Related content** - [Audit Log Concepts for Confluent Cloud](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html) - [Audit Log Concepts for Confluent Platform](/platform/current/security/audit-logs/audit-logs-concepts.html) auditable event : An auditable event is an event that represents an action or operation that can be tracked and monitored for security purposes and compliance. When an auditable event occurs, an auditable event method is triggered and an event message is sent to the audit log cluster and stored as an audit log record. **Related terms**: *audit log*, *event message* **Related content** - [Auditable Events in Confluent Cloud](/cloud/current/monitoring/audit-logging/event-methods/index.html) - [Auditable Events in Confluent Platform](/platform/current/security/audit-logs/auditable-events.html) authentication : Authentication is the process of verifying the identity of a principal that interacts with a system or application. Authentication is often used in conjunction with authorization to determine whether a principal is allowed to access a resource and perform a specific action or operation on that resource. Digital authentication requires one or more of the following: something a principal knows (a password or security question), something a principal has (a security token or key), or something a principal is (a biometric characteristic, such as a fingerprint or voiceprint). Multi-factor authentication (MFA) requires two or more forms of authentication. **Related terms**: *authorization*, *identity*, *identity provider*, *identity pool*, *principal*, *role* authorization : Authorization is the process of evaluating and then granting or denying a principal a set of permissions required to access and perform operations on resources. **Related terms**: *authentication*, *group mapping*, *identity*, *identity provider*, *identity pool*, *principal*, *role* Avro : Avro is a data serialization and exchange framework that provides data structures, remote procedure call (RPC), a compact binary data format, and a container file, and uses JSON to represent schemas. Avro schemas ensure that every field is properly described and documented for use with serializers and deserializers.
You can either send a schema with every message or use Schema Registry to store and receive schemas for use by consumers and producers to save bandwidth and storage space. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Apache Avro - a data serialization system](https://avro.apache.org/) - Confluent Cloud: [Avro Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-avro.html) - Confluent Platform: [Avro Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-avro.html) Basic Kafka cluster : A Confluent Cloud cluster type. Basic Kafka clusters are designed for experimentation, early development, and basic use cases. batch processing : Batch processing is the method of collecting a large volume of data over a specific time interval, after which the data is processed all at once and loaded into a destination system. Batch processing is often used when processing data can occur independently of the source and timing of the data. It is efficient for non-real-time data processing, such as data warehousing, reporting, and analytics. **Related terms**: *bounded stream*, *stream processing*, *unbounded stream* CIDR block : A CIDR block is a group of IP addresses that are contiguous and can be represented as a single block. CIDR blocks are expressed using Classless Inter-domain Routing (CIDR) notation that includes an IP address and a number of bits in the network mask. **Related content** - [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) - [Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan [RFC 4632]](https://www.rfc-editor.org/rfc/rfc4632.html) Cluster Linking : Cluster Linking is a highly performant data replication feature that enables links between Kafka clusters to mirror data from one cluster to another. Cluster Linking creates perfect copies of Kafka topics, which keep data in sync across clusters. Use cases include geo-replication of data, data sharing, migration, disaster recovery, and tiered separation of critical applications. **Related content** - [Geo-replication with Cluster Linking on Confluent Cloud](/cloud/current/multi-cloud/cluster-linking/index.html) - [Cluster Linking for Confluent Platform](/platform/current/multi-dc-deployments/cluster-linking/index.html) commit log : A commit log is a log of all event messages about commits (changes or operations made) sent to a Kafka topic. A commit log ensures that all event messages are processed at least once and provides a mechanism for recovery in the event of a failure. The commit log is also referred to as a write-ahead log (WAL) or a transaction log. **Related terms**: *event message* Confluent Cloud : Confluent Cloud is the fully managed, cloud-native event streaming service powered by Kora, the event streaming platform based on Kafka and extended by Confluent to provide high availability, scalability, elasticity, security, and global interconnectivity. Confluent Cloud offers cost-effective multi-tenant configurations as well as dedicated solutions, if stronger isolation is required. 
**Related terms**: *Apache Kafka*, *Kora* **Related content** - [Confluent Cloud Overview](/cloud/current/index.html) - [Confluent Cloud](https://www.confluent.io/confluent-cloud/) Confluent Cloud network : A Confluent Cloud network is an abstraction for a single tenant network environment that hosts Dedicated Kafka clusters in Confluent Cloud along with their single tenant services, like ksqlDB clusters and managed connectors. **Related content** - [Confluent Cloud Network Overview](/cloud/current/networking/overview.html#ccloud-networks) Confluent for Kubernetes (CFK) : *Confluent for Kubernetes (CFK)* is a cloud-native control plane for deploying and managing Confluent in private cloud environments through a declarative API. Confluent Platform : Confluent Platform is a specialized distribution of Kafka at its core, with additional components for data integration, streaming data pipelines, and stream processing. Confluent REST Proxy : Confluent REST Proxy provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients. **Related content** - Confluent Platform: [REST Proxy](/platform/current/kafka-rest/index.html) Confluent Server : Confluent Server is the default Kafka broker component of Confluent Platform that builds on the foundation of Apache Kafka® and provides enhanced proprietary features designed for enterprise use. Confluent Server is fully compatible with Kafka, and adds Kafka cluster support for Role-Based Access Control, Audit Logs, Schema Validation, Self Balancing Clusters, Tiered Storage, Multi-Region Clusters, and Cluster Linking. **Related terms**: *Confluent Platform*, *Apache Kafka*, *Kafka broker*, *Cluster Linking*, *multi-region cluster (MRC)* Confluent Unit for Kafka (CKU) : Confluent Unit for Kafka (CKU) is a unit of horizontal scaling for Dedicated Kafka clusters in Confluent Cloud that provides preallocated resources. CKUs determine the capacity of a Dedicated Kafka cluster in Confluent Cloud. **Related content** - [CKU limits per cluster](/cloud/current/clusters/cluster-types.html#cku-limits-per-cluster) Connect API : The Connect API is the Kafka API that enables a connector to read event streams from a source system and write to a target system. Connect worker : A Connect worker is a server process that runs a connector and performs the actual work of moving data in and out of Kafka topics. A worker is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of workers that share the load of moving data in and out of Kafka from and to external systems. **Related terms**: *connector*, *Kafka Connect* connection attempts : In Confluent Cloud, connection attempts are a Kafka cluster billing dimension that defines the maximum number of new TCP connections to the cluster you can create in one second. This includes successful and unsuccessful authentication attempts. Available in the Metrics API as `successful_authentication_count` (only includes successful authentications, not unsuccessful authentication attempts). To reduce usage on connection attempts, use longer-lived connections to the cluster. If you exceed the maximum, connection attempts may be refused.
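As a minimal illustration of the advice in the *connection attempts* entry above (use longer-lived connections), the sketch below creates a single `KafkaProducer` and reuses it for every send, rather than opening a new producer, and therefore new TCP connections and authentication handshakes, per message. The endpoint, class name, and topic are placeholders, not part of any Confluent example.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public final class SharedProducer {
    // One long-lived producer for the whole application.
    private static final KafkaProducer<String, String> PRODUCER = create();

    private static KafkaProducer<String, String> create() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder endpoint
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        return new KafkaProducer<>(props);
    }

    public static void send(String topic, String key, String value) {
        // Reusing the producer keeps existing TCP connections open instead of
        // re-authenticating and reconnecting on every send.
        PRODUCER.send(new ProducerRecord<>(topic, key, value));
    }

    public static void shutdown() {
        PRODUCER.close(); // flush and release connections once, at exit
    }
}
```

`KafkaProducer` is thread safe, so sharing one instance across application threads is the usual pattern.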
connector : A connector is an abstract mechanism that enables communication, coordination, or cooperation among components by transferring data elements from one interface to another without changing the data. connector offset : Connector offset uniquely identifies the position of a connector as it processes data. Connectors use a variety of strategies to implement the connector offset, including everything from monotonically increasing integers to replay IDs, lists of files, timestamps, and even checkpoint information. Connector offsets keep track of already-processed data in the event of a connector restart or recovery. While sink connectors use a pattern for connector offsets similar to the offset mechanism used throughout Kafka, the implementation details for source connectors are often much different. This is because source connectors track the progress of a source system as it processes data. consumer : A consumer is a Kafka client application that subscribes to (reads and processes) event messages from a Kafka topic. The Streams API and the Consumer API are the two APIs that enable consumers to read event streams from Kafka topics. **Related terms**: *Consumer API*, *consumer group*, *producer*, *Streams API* Consumer API : The Consumer API is the Kafka API used for consuming (reading) event messages or records from Kafka topics and enables a Kafka consumer to subscribe to a topic and read event messages as they arrive. Batch processing is a common use case for the Consumer API. consumer group : A consumer group is a single logical consumer implemented with multiple physical consumers for reasons of throughput and resilience. By dividing the partitions of a topic among the consumers in the group, consumers in the group can process messages in parallel, increasing message throughput and enabling load balancing. **Related terms**: *consumer*, *partition*, *producer*, *topic* consumer lag : Consumer lag is the number of consumer offsets between the latest message produced in a partition and the last message consumed by a consumer, that is, the number of messages pending consumption from a particular partition. A large consumer lag, or a quickly growing lag, indicates that the consumer is unable to read from a partition as fast as the messages are available. This can be caused by a slow consumer, slow network, or slow broker. consumer offset : A consumer offset is a monotonically increasing integer value that uniquely identifies the position of an event record in a partition. Consumers use offsets to track their current position in the Kafka topic, allowing consumers to resume processing from where they left off. Offsets are stored on the Kafka broker, which does not track which records have been read and which have not. It is up to the consumer to track this information. When a consumer acknowledges receiving and processing a message, it commits an offset value that is stored in the special internal topic `__consumer_offsets`. cross-resource RBAC role binding : A cross-resource RBAC role binding is a role binding in Confluent Cloud that is applied at the Organization or Environment scope and grants access to multiple resources. For example, assigning a principal the NetworkAdmin role at the Organization scope lets them administer all networks across all Environments in their Organization.
**Related terms**: *identity pool*, *principal*, *role*, *role binding* CRUD : CRUD is an acronym for the four basic operations that can be performed on data: Create, Read, Update, and Delete. custom connector : A custom connector is a connector created using Connect plugins uploaded to Confluent Cloud by users. This includes connector plugins that are built from scratch, modified open-source connector plugins, or third-party connector plugins. data at rest : Data at rest is data that is physically stored on non-volatile media (such as hard drives, solid-state drives, or other storage devices) and is not actively being transmitted or processed by a system. data contract : A data contract is a formal agreement between an upstream component and a downstream component on the structure and semantics of data that is in motion. A schema is a key element of a data contract. The schema, metadata, rules, policies, and evolution plan form the data contract. You can associate data contracts (schemas and more) with [topics](#term-Kafka-topic). **Related content** - Confluent Platform: [Data Contracts for Schema Registry on Confluent Platform](/platform/current/schema-registry/fundamentals/data-contracts.html) - Confluent Cloud: [Data Contracts for Schema Registry on Confluent Cloud](/cloud/current/sr/fundamentals/data-contracts.html) - Cloud Console: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) data encryption key (DEK) : A data encryption key (DEK) is a symmetric key that is used to encrypt and decrypt data. The DEK is used in client-side field level encryption (CSFLE) to encrypt sensitive data. The DEK is itself encrypted using a key encryption key (KEK) that is only accessible to authorized users. The encrypted DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. A generic code sketch of this DEK and KEK pattern follows the *data pipeline* entry below. **Related terms**: *envelope encryption*, *key encryption key (KEK)* data in motion : Data in motion is data that is actively being transferred between source and destination, typically systems, devices, or networks. Data in motion is also referred to as data in transit or data in flight. data in use : Data in use is data that is actively being processed or manipulated in memory (RAM, CPU caches, or CPU registers). data ingestion : Data ingestion is the process of collecting, importing, and integrating data from various sources into a system for further processing, analysis, or storage. data mapping : Data mapping is the process of defining relationships or associations between source data elements and target data elements. Data mapping is an important process in data integration, data migration, and data transformation, ensuring that data is accurately and consistently represented when it is moved or combined. data pipeline : A data pipeline is a series of processes and systems that enable the flow of data from sources to destinations, automating the movement and transformation of data for various purposes, such as analytics, reporting, or machine learning. A data pipeline is typically composed of a source system, a data ingestion tool, a data transformation tool, and a target system. A data pipeline covers the following stages: data extraction, data transformation, data loading, and data validation.
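The *data encryption key (DEK)* and *key encryption key (KEK)* entries above describe envelope encryption in general terms. The sketch below illustrates only that generic DEK/KEK relationship using standard JDK cryptography; it is not Confluent's CSFLE implementation, and in practice the KEK would be held in an external KMS rather than generated in process.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class EnvelopeEncryptionSketch {
    public static void main(String[] args) throws Exception {
        SecureRandom random = new SecureRandom();

        // Key encryption key (KEK): normally managed by a KMS, never stored with the data.
        SecretKey kek = newAesKey();
        // Data encryption key (DEK): generated per record or per batch.
        SecretKey dek = newAesKey();

        // 1. Encrypt the sensitive payload with the DEK.
        byte[] payload = "card=4111-1111".getBytes(StandardCharsets.UTF_8); // placeholder data
        byte[] payloadIv = randomIv(random);
        byte[] encryptedPayload = aesGcm(Cipher.ENCRYPT_MODE, dek, payloadIv, payload);

        // 2. Encrypt ("wrap") the DEK with the KEK; store it alongside the ciphertext.
        byte[] dekIv = randomIv(random);
        byte[] encryptedDek = aesGcm(Cipher.ENCRYPT_MODE, kek, dekIv, dek.getEncoded());

        // 3. Decryption: only a holder of the KEK can unwrap the DEK and read the data.
        SecretKey unwrappedDek =
                new SecretKeySpec(aesGcm(Cipher.DECRYPT_MODE, kek, dekIv, encryptedDek), "AES");
        byte[] decrypted = aesGcm(Cipher.DECRYPT_MODE, unwrappedDek, payloadIv, encryptedPayload);
        System.out.println(new String(decrypted, StandardCharsets.UTF_8));
    }

    private static SecretKey newAesKey() throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(256);
        return gen.generateKey();
    }

    private static byte[] randomIv(SecureRandom random) {
        byte[] iv = new byte[12]; // 96-bit IV, the usual size for AES-GCM
        random.nextBytes(iv);
        return iv;
    }

    private static byte[] aesGcm(int mode, SecretKey key, byte[] iv, byte[] input) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, key, new GCMParameterSpec(128, iv));
        return cipher.doFinal(input);
    }
}
```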
Data Portal : Data Portal is a Confluent Cloud application that uses Stream Catalog and Stream Lineage to provide self-service access throughout Confluent Cloud Console for data practitioners to search and discover existing topics using tags and business metadata, request access to topics and data, and access data in topics to build streaming applications and data pipelines. The result is a data-centric view of Confluent optimized for self-service access, where users can search for, discover, and understand available data, request access to it, and use it. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Data Portal on Confluent Cloud](/cloud/current/stream-governance/data-portal.html) data serialization : Data serialization is the process of converting data structures or objects into a format that can be stored or transmitted, and reconstructed later in the same or another computer environment. Data serialization is a common technique for implementing data persistence, interprocess communication, and object communication. Confluent Schema Registry (in Confluent Platform) and Confluent Cloud Schema Registry support data serialization using serializers and deserializers for the following formats: Avro, JSON Schema, and Protobuf. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) data steward : A data steward is a person with data-related responsibilities, such as data governance, data quality, and data security. data stream : A data stream is a continuous flow of data records that are produced and consumed by applications. dead letter queue (DLQ) : A dead letter queue (DLQ) is a queue where messages that could not be processed successfully by a sink connector are placed. Instead of stopping, the sink connector sends messages that could not be written successfully as event records to the DLQ topic while the sink connector continues processing messages. Dedicated Kafka cluster : A Confluent Cloud cluster type. Dedicated Kafka clusters are designed for critical production workloads with high traffic or private networking requirements. deserializer : A deserializer is a tool that converts a serial byte stream back into objects and parallel data. Deserializers work with serializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) egress : In general networking, egress refers to outbound traffic leaving a network or a specific network segment. In Confluent Cloud, egress is a Kafka cluster billing dimension that defines the number of bytes consumed from the cluster in one second. Available in the Metrics API as `sent_bytes` (convert from bytes to MB). To reduce egress in Confluent Cloud, compress your messages and ensure each consumer is only consuming from the topics it requires. For compression, use lz4. Avoid gzip because of high overhead on the cluster.
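The *egress* entry above (and the *ingress* entry later in this glossary) recommends lz4 compression to reduce bytes transferred. Compression is a producer-side setting; the broker address, topic, and record below are placeholders used only to show where the configuration goes.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class Lz4ProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder endpoint
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Batches are compressed with lz4 before they leave the client,
        // reducing bytes produced to (and later consumed from) the cluster.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
        // Larger batches generally compress better; linger.ms trades a little latency for batching.
        props.put(ProducerConfig.LINGER_MS_CONFIG, 20);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("metrics", "host-1", "{\"cpu\":0.42}")); // placeholder record
        }
    }
}
```

Consumers decompress transparently, so no consumer-side change is needed.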
Elastic Confluent Unit for Kafka (eCKU) : Elastic Confluent Unit for Kafka (eCKU) is used to express capacity for Basic, Standard, Enterprise, and Freight Kafka clusters. These clusters automatically scale up to a fixed ceiling. There is no need to resize these types of clusters. When you need more capacity, your cluster expands up to the fixed ceiling. If you’re not using capacity above the minimum, you’re not paying for it. ELT : ELT is an acronym for Extract-Load-Transform, where data is extracted from a source system and loaded into a target system before processing or transformation. Compared to ETL, ELT is a more flexible approach to data ingestion because the data is loaded into the target system before transformation. Enterprise Kafka cluster : A Confluent Cloud cluster type. Enterprise Kafka clusters are designed for production-ready functionality that requires private endpoint networking capabilities. envelope encryption : Envelope encryption is a cryptographic technique that uses two keys to encrypt data. The symmetric data encryption key (DEK) is used to encrypt sensitive data. The separate asymmetric key encryption key (KEK) is the master key used to encrypt the DEK. The DEK and encrypted data are stored together. Only users with access to the KEK can decrypt the DEK and access the sensitive data. In Confluent Cloud, envelope encryption is used to enable client-side field level encryption (CSFLE). CSFLE encrypts sensitive data in a message before it is sent to Confluent Cloud and allows for temporary decryption of sensitive data when required to perform operations on the data. **Related terms**: *data encryption key (DEK)*, *key encryption key (KEK)* ETL : ETL is an acronym for Extract-Transform-Load, where data is extracted from a source system, transformed into a target format, and loaded into a target system. Compared to ELT, ETL is a more rigid approach to data ingestion because the data is transformed before loading into the target system. event : An event is a meaningful action or occurrence of something that happened. Events that can be recognized by a program, either human-generated or triggered by software, can be recorded in a log file or other data store. **Related terms**: *event message*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event message : An event message is a record of an event sent to a Kafka topic, represented as a key-value pair. Each event message consists of a key-value pair, a timestamp, the compression type, headers for metadata (optional), and a partition and offset ID (once the message is written). The key is optional and can be used to identify the event. The value is required and contains details about the event that happened. **Related terms**: *event*, *event record*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event record : An event record is the record of an event stored in a Kafka topic. Event records are organized and durably stored in topics. Examples of events include orders, payments, activities, or measurements. An event typically contains one or more data fields that describe the fact, as well as a timestamp that denotes when the event was created by its event source. The event may also contain various metadata, such as its source of origin (for example, the application or cloud service that created the event) and storage-level information (for example, its position in the event stream).
**Related terms**: *event*, *event message*, *event sink*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event sink : An event sink is a consumer of events, which can include applications, cloud services, databases, IoT sensors, and more. **Related terms**: *event*, *event message*, *event record*, *event source*, *event stream*, *event streaming*, *event streaming platform*, *event time* event source : An event source is a producer of events, which can include cloud services, databases, IoT sensors, mainframes, and more. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event stream*, *event streaming*, *event streaming platform*, *event time* event stream : An event stream is a continuous flow of event messages produced by an event source and consumed by one or more consumers. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform*, *event time* event streaming : Event streaming is the practice of capturing event data in real-time from data sources. Event streaming is a form of data streaming that is used to capture, store, process, and react to data in real-time or retrospectively. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming platform*, *event time* event streaming platform : An event streaming platform is a platform to which events can be written once, allowing distributed functions within an organization to react in real-time. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event time* event time : Event time is the time when an event occurred on the producing device, as opposed to the time when the event was processed or recorded. Event time is often used in stream processing to determine the order of events and to perform windowing operations. **Related terms**: *event*, *event message*, *event record*, *event sink*, *event source*, *event streaming*, *event streaming platform* exactly-once semantics : Exactly-once semantics is a guarantee that a message is delivered exactly once and in the order that it was sent. Even if a producer retries sending a message, or a consumer retries processing a message, the message is delivered exactly once. This guarantee is achieved by assigning a unique ID and sequence number to each message so that the broker can discard duplicates from producer retries, and by committing the consumer offset to the broker only after the message is processed. If the consumer fails to process the message, the message is redelivered and processed again. (A producer-side configuration sketch follows the *granularity* entry below.) Freight Kafka cluster : A Confluent Cloud cluster type. Freight Kafka clusters are designed for high-throughput, relaxed latency workloads that are less expensive than self-managed open source Kafka. granularity : Granularity is the degree or level of detail to which an entity (a system, service, or resource) is broken down into subcomponents, parts, or elements. Entities that are *fine-grained* have a higher level of detail, while *coarse-grained* entities have a reduced level of detail, often combining finer parts into a larger whole. In the context of access control, granular permissions provide precise control over resource access. They allow administrators to grant specific operations on distinct resources. This ensures users only have permissions tailored to their needs, minimizing unnecessary or potentially risky access.
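As a companion to the *exactly-once semantics* entry above, the sketch below shows one common way to request that guarantee from the Java producer: enable idempotence so broker-side retries are deduplicated, and wrap related sends in a transaction so they commit or abort atomically. The bootstrap address, topic, and `transactional.id` are placeholders; downstream consumers would also need `isolation.level=read_committed`, which is not shown.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TransactionalProducerSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder endpoint
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Idempotence lets the broker deduplicate retried batches from this producer.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        // A transactional ID lets a set of writes commit or abort atomically.
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "payments-writer-1"); // placeholder ID

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            try {
                producer.beginTransaction();
                producer.send(new ProducerRecord<>("payments", "order-42", "captured")); // placeholder record
                producer.commitTransaction();
            } catch (Exception e) {
                producer.abortTransaction(); // none of the writes become visible
                throw e;
            }
        }
    }
}
```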
group mapping : Group mapping is a set of rules that map groups in your SSO identity provider to Confluent Cloud RBAC roles. When a user signs in to Confluent Cloud using SSO, Confluent Cloud uses the group mapping to grant access to Confluent Cloud resources. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* **Related content** - [Group Mapping for Confluent Cloud](/cloud/current/access-management/authenticate/sso/group-mapping/overview.html) identity : An identity is a unique identifier that is used to authenticate and authorize users and applications to access resources. Identity is often used in conjunction with access control to determine whether a user or application is allowed to access a resource and perform a specific action or operation on that resource. **Related terms**: *identity provider*, *identity pool*, *principal*, *role* identity pool : An identity pool is a collection of identities that can be used to authenticate and authorize users and applications to access resources. Identity pools are used to manage permissions for users and applications that access resources in Confluent Cloud. They are also used to manage permissions for Confluent Cloud service accounts that are used to access resources in Confluent Cloud. identity provider : An identity provider is a trusted provider that authenticates users and issues security tokens that are used to verify the identity of a user. Identity providers are often used in single sign-on (SSO) scenarios, where a user can log in to multiple applications or services with a single set of credentials. Infinite Storage : Infinite Storage is the Confluent Cloud storage service that enhances the scalability of Confluent Cloud resources by separating storage and processing. Tiered storage within Confluent Cloud moves data between storage layers based on the needs of the workload, retrieves tiered data when requested, and garbage collects data that is past retention or otherwise deleted. If an application reads historical data, latency is not increased for other applications reading more recent data. Storage resources are decoupled from compute resources, you only pay for what you produce to Confluent Cloud and for storage that you use, and CKUs do not have storage limits. **Related content** - [Infinite Storage in Confluent Cloud for Apache Kafka](https://www.confluent.io/blog/infinite-kafka-data-storage-in-confluent-cloud/) ingress : In general networking, ingress refers to traffic that enters a network from an external source. In Confluent Cloud, ingress is a Kafka cluster billing dimension that defines the number of bytes produced to the cluster in one second. Available in the Metrics API as `received_bytes` (convert from bytes to MB). To reduce ingress in Confluent Cloud, compress your messages. For compression, use lz4. Avoid gzip because of high overhead on the cluster. internal topic : An internal topic is a topic, prefixed with double underscores (`__`), that is automatically created by a Kafka component to store metadata about the broker, partition assignment, consumer offsets, and other information. Examples of internal topics: `__cluster_metadata`, `__consumer_offsets`, `__transaction_state`, `__confluent.support.metrics`, and `__confluent.support.metrics-raw`. JSON Schema : JSON Schema is a declarative language used for data serialization and exchange to define data structures, specify formats, and validate JSON documents.
It is a way to encode expected data types, properties, and constraints to ensure that all fields are properly described for use with serializers and deserializers. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [JSON Schema - a declarative language that allows you to annotate and validate JSON documents.](https://json-schema.org/) - Confluent Cloud: [JSON Schema Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-json.html) - Confluent Platform: [JSON Schema Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-json.html) Kafka bootstrap server : A Kafka bootstrap server is a Kafka broker that a Kafka client initially connects to in order to reach a Kafka cluster, and that returns metadata that includes the addresses of all of the brokers in the Kafka cluster. Although only one bootstrap server is required to connect to a Kafka cluster, multiple brokers can be specified in a bootstrap server list to provide high availability and fault tolerance in case a broker is unavailable. In Confluent Cloud, the bootstrap server is the general cluster endpoint. Kafka broker : A Kafka broker is a server in the Kafka storage layer that stores event streams from one or more sources. A Kafka cluster is typically comprised of several brokers. Every broker in a cluster is also a bootstrap server, meaning if you can connect to one broker in a cluster, you can connect to every broker. Kafka client : A Kafka client allows you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the case of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that allow developers to create Kafka producer clients (Producers) and consumer clients (Consumers) using various programming languages. The primary way to build production-ready Producers and Consumers is by using your preferred programming language and a Kafka client library. **Related content** - [Build Client Applications for Confluent Cloud](/cloud/current/client-apps/overview.html) - [Build Client Applications for Confluent Platform](/platform/current/clients/index.html) - [Getting Started with Apache Kafka and Java (or Python, Go, .Net, and others)](https://developer.confluent.io/get-started/java/) Kafka cluster : A Kafka cluster is a group of interconnected Kafka brokers that manage and distribute real-time data streaming, processing, and storage as if they are a single system. By distributing tasks and services across multiple Kafka brokers, the Kafka cluster improves availability, reliability, and performance. Kafka Connect : Kafka Connect is the component of Kafka that provides data integration between databases, key-value stores, search indexes, file systems, and Kafka brokers. Kafka Connect is an ecosystem of a client application and pluggable connectors. As a client application, Connect is a server process that runs on hardware independent of the Kafka brokers themselves. It is scalable and fault-tolerant, meaning you can run a cluster of Connect workers that share the load of moving data in and out of Kafka from and to external systems. Connect also abstracts the coding work away from the user and instead requires only JSON configuration to run.
**Related content** - Confluent Cloud: [Kafka Connect](/cloud/current/billing/overview.html#kconnect-long) - Confluent Platform: [Kafka Connect](/platform/current/connect/index.html) Kafka controller : A Kafka controller is the node in a Kafka cluster that is responsible for managing and changing the metadata of the cluster. This node also communicates metadata changes to the rest of the cluster. When Kafka uses ZooKeeper for metadata management, the controller is a broker, and the broker persists the metadata to ZooKeeper for backup and recovery. With KRaft, you dedicate Kafka nodes to operate as controllers and the metadata is stored in Kafka itself and not persisted to ZooKeeper. KRaft enables faster recovery because of this. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). Kafka listener : A Kafka listener is an endpoint that Kafka brokers bind to and use to communicate with clients. For Kafka clusters, Kafka listeners are configured in the `listeners` property of the `server.properties` file. Advertised listeners are publicly accessible endpoints that are used by clients to connect to the Kafka cluster. **Related content** - [Kafka Listeners – Explained](https://www.confluent.io/blog/kafka-listeners-explained/) Kafka metadata : Kafka metadata is the information about the Kafka cluster and the topics that are stored in it. This information includes details such as the brokers in the cluster, the topics that are available, the partitions for each topic, and the location of the leader for each partition. Kafka metadata is used by clients to discover the available brokers and topics, and to determine which broker is the leader for a particular partition. This information is essential for clients to be able to send and receive messages to and from Kafka. Kafka Streams : Kafka Streams is a stream processing library for building streaming applications and microservices that transform (filter, group, aggregate, join, and more) incoming event streams in real-time and write the results to Kafka topics stored in a Kafka cluster. The Streams API can be used to build applications that process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Kafka Streams](/platform/current/streams/overview.html) Kafka topic : See *topic*. key encryption key (KEK) : A key encryption key (KEK) is a master key that is used to encrypt and decrypt other keys, specifically the data encryption key (DEK). Only users with access to the KEK can decrypt the DEK and access the sensitive data. **Related terms**: *data encryption key (DEK)*, *envelope encryption*. Kora : Kora is the cloud-native streaming data service based on Kafka technology that powers the Confluent Cloud event streaming platform for building real-time data pipelines and streaming applications. Kora abstracts low-level resources, such as Kafka brokers, and hides operational complexities, such as system upgrades. Kora is built on the following foundations: a tiered storage layer that improves cost and performance, elasticity and consistent performance through incremental load balancing, cost effective multi-tenancy with dynamic quota management and cell-based isolation, continuous monitoring of both system health and data integrity, and clean abstraction with standard Kafka protocols and CKUs to hide underlying resources.
**Related terms**: *Apache Kafka*, *Confluent Cloud*, *Confluent Unit for Kafka (CKU)* **Related content** - [Kora: The Cloud Native Engine for Apache Kafka](https://www.confluent.io/blog/cloud-native-data-streaming-kafka-engine/) - [Kora: A Cloud-Native Event Streaming Platform For Kafka](https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf) KRaft : KRaft (or Apache Kafka Raft) is a consensus protocol introduced in Apache Kafka 2.8 to provide metadata management for Kafka with the goal of replacing ZooKeeper. KRaft simplifies Kafka because it enables the management of metadata in Kafka itself, rather than splitting it between ZooKeeper and Kafka. As of Confluent Platform 7.5, KRaft is the default method of metadata management in new deployments. For more information, see [KRaft overview](/platform/current/kafka-metadata/kraft.html). ksqlDB : ksqlDB is a streaming SQL database engine purpose-built for creating stream processing applications on top of Kafka. logical Kafka cluster (LKC) : A logical Kafka cluster (LKC) is a subset of a physical Kafka cluster (PKC) that is isolated from other logical clusters within Confluent Cloud. Each logical unit of isolation is considered a tenant and maps to a specific organization. If the mapping is one-to-one, one LKC maps to one PKC (a Dedicated cluster). If the mapping is many-to-one, one LKC maps to one of the multitenant Kafka cluster types (Basic, Standard, Enterprise, and Freight). **Related terms**: *Confluent Cloud*, *Kafka cluster*, *physical Kafka cluster (PKC)* **Related content** - [Kafka Cluster Types in Confluent Cloud](/cloud/current/clusters/cluster-types.html) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) multi-region cluster (MRC) : A multi-region cluster (MRC) is a single Kafka cluster that replicates data between datacenters across regional availability zones. multi-tenancy : Multi-tenancy is a software architecture in which a single physical instance is shared among multiple logical instances, or tenants. In Confluent Cloud, each Basic, Standard, Enterprise, and Freight cluster is a logical Kafka cluster (LKC) that shares a physical Kafka cluster (PKC) with other tenants. Each LKC is isolated from other LKCs and has its own resources, such as memory, compute, and storage. **Related terms**: *Confluent Cloud*, *logical Kafka cluster (LKC)*, *physical Kafka cluster (PKC)* **Related content** - [From On-Prem to Cloud-Native: Multi-Tenancy in Confluent Cloud](https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/) - [Multi-tenancy and Client Quotas on Confluent Cloud](/cloud/current/clusters/client-quotas.html) offset : An offset is an integer assigned to each message that uniquely represents its position within the data stream, guaranteeing the ordering of records and allowing offset-based connections to replay messages from any point in time. **Related terms**: *consumer offset*, *connector offset*, *offset commit*, *replayability* offset commit : An offset commit is the process of keeping track of the current position of an offset-based connection (primarily Kafka consumers and connectors) within the data stream. The offset commit process is not specific to consumers, producers, or connectors. It is a general mechanism in Kafka to track the position of any application that is reading data. When a consumer commits an offset, the offset identifies the next message the consumer should consume.
For example, if a consumer has an offset of 5, it has consumed messages 0 through 4 and will next consume message 5. If the consumer crashes or is shut down, its partitions are reassigned to another consumer, which resumes consuming from the last committed offset of each partition. The committed offset for consumers is stored on a Kafka broker. When a consumer commits an offset, it sends a commit request to the Kafka cluster, specifying the partition and offset it wants to commit for a particular consumer group. The Kafka broker receiving the commit request then stores this offset in the `__consumer_offsets` internal topic. **Related terms**: *consumer offset*, *offset* OpenSSL : OpenSSL is an open-source software library and toolkit that implements the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/OpenSSL). parent cluster : The Kafka cluster that a resource belongs to. **Related terms**: *Kafka cluster* partition : A partition is a unit of data storage that divides a topic into multiple, parallel event streams, each of which can be stored on a separate Kafka broker and consumed independently. Partitioning is a key concept in Kafka because it allows Kafka to scale horizontally by adding more brokers to the cluster. Partitions are also the unit of parallelism in Kafka. A topic can have one or more partitions, and each partition is an ordered, immutable sequence of event records that is continually appended to a partition log. partitions (pre-replication) : In Confluent Cloud, partitions are a Kafka cluster billing dimension that defines the maximum number of partitions that can exist on the cluster at one time, before replication. While you are not charged for partitions on any type of Kafka cluster, the number of partitions you use has an impact on eCKU. To determine eCKU limits for partitions, Confluent Cloud counts only pre-replication (leader) partitions across a cluster. All topics that you create (as well as internal topics that are automatically created by Confluent Platform components such as ksqlDB, Kafka Streams, Connect, and Control Center (Legacy)) count towards the cluster partition limit. Confluent prefixes topics created automatically with an underscore (_). Topics that are internal to Kafka itself (such as consumer offsets) are not visible in Cloud Console and do not count against partition limits or toward partition billing. Available in the Metrics API as `partition_count`. In Confluent Cloud, attempts to create additional partitions beyond the cluster limit fail with an error message. To reduce usage on partitions (pre-replication), delete unused topics and create new topics with fewer partitions. Use the Kafka Admin interface to increase the partition count of an existing topic if the initial partition count is too low. physical Kafka cluster (PKC) : A physical Kafka cluster (PKC) is a Kafka cluster composed of multiple brokers. Each physical Kafka cluster is created on a Kubernetes cluster by the control plane. A PKC is not directly accessible by clients. principal : A principal is an entity that can be authenticated and granted permissions based on roles to access resources and perform operations. An entity can be a user account, service account, group mapping, or identity pool.
**Related terms**: *group mapping*, *identity*, *identity pool*, *role*, *service account*, *user account* private internet : A private internet is a closed, restricted computer network typically used by organizations to provide secure environments for managing sensitive data and resources. processing time : Processing time is the time when an event is processed or recorded by a system, as opposed to the time when the event occurred on the producing device. Processing time is often used in stream processing to determine the order of events and to perform windowing operations. producer : A producer is a client application that publishes (writes) data to a topic in a Kafka cluster. Producers write data to a topic and are the only clients that can write data to a topic. Each record written to a topic is appended to the partition of the topic that is selected by the producer. Producer API : The Producer API is the Kafka API that allows you to write data to a topic in a Kafka cluster. The Producer API is used by producer clients to publish data to a topic in a Kafka cluster. Protobuf : Protobuf (or Protocol Buffers) is an open-source data format used to serialize structured data for storage. **Related terms**: *data serialization*, *deserializer*, *serializer* **Related content** - [Protocol Buffers](https://protobuf.dev/) - [Getting Started with Protobuf in Confluent Cloud](https://www.confluent.io/blog/using-protobuf-in-confluent-cloud/) - Confluent Cloud: [Protobuf Serializer and Deserializer](/cloud/current/sr/fundamentals/serdes-develop/serdes-protobuf.html) - Confluent Platform: [Protobuf Serializer and Deserializer](/platform/current/schema-registry/fundamentals/serdes-develop/serdes-protobuf.html) public internet : The public internet is the global system of interconnected computers and networks that use TCP/IP to communicate with each other. rebalancing : Rebalancing is the process of redistributing the partitions of a topic among the consumers of a consumer group for improved performance and scalability. A rebalance can occur when a consumer fails its heartbeat and is excluded from the group, voluntarily leaves the group, has its metadata updated, or joins the group. replayability : Replayability is the ability to replay messages from any point in time. **Related terms**: *consumer offset*, *offset*, *offset commit* replication : Replication is the process of creating and maintaining multiple copies (or *replicas*) of data across different nodes in a distributed system to increase availability, reliability, redundancy, and accessibility. replication factor : A replication factor is the number of copies of a partition that are distributed across the brokers in a cluster. requests : In Confluent Cloud, requests are a Kafka cluster billing dimension that defines the number of client requests to the cluster in one second. Available in the Metrics API as `request_count`. To reduce usage on requests, you can adjust producer batching configurations, consumer client batching configurations, and shut down otherwise inactive clients. For Dedicated clusters, a high number of requests per second results in increased load on the cluster. role : A role is a Confluent-defined job function that is assigned a set of permissions required to perform specific actions or operations on Confluent resources, and is bound to a principal and one or more Confluent resources. A role can be assigned to a user account, group mapping, service account, or identity pool.
**Related terms**: *group mapping*, *identity*, *identity pool*, *principal*, *service account* **Related content** - [Predefined RBAC Roles in Confluent Cloud](/cloud/current/access-management/access-control/rbac/predefined-rbac-roles.html) - [Role-Based Access Control Predefined Roles in Confluent Platform](/platform/current/security/rbac/rbac-predefined-roles.html) rolling restart : A rolling restart restarts the brokers in a Kafka cluster with zero downtime by restarting one broker at a time, verifying that there are no under-replicated partitions on the broker before proceeding to the next broker. Restarting the brokers one at a time allows for software upgrades, broker configuration updates, or cluster maintenance while maintaining high availability by avoiding downtime. **Related content** - [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart) schema : A schema is the structured definition or blueprint used to describe the format and structure of event messages sent through the Kafka event streaming platform. Schemas are used to validate the structure of data in event messages and ensure that producers and consumers are sending and receiving data in the same format. Schemas are defined in the Schema Registry. Schema Registry : Schema Registry is a centralized repository for managing and validating the schemas used for topic message data. Schema Registry is built into Confluent Cloud as a managed service, available with the Advanced Stream Governance package, and offered as part of Confluent Enterprise for self-managed deployments. Schema Registry is a RESTful service that stores and manages schemas for Kafka topics, and is integrated with Kafka and Connect to provide a central location for managing schemas and validating data. Producers and consumers of Kafka topics use schemas to ensure data consistency and compatibility as schemas evolve. Schema Registry is a key component of Stream Governance. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Overview](/platform/current/schema-registry/index.html) schema subject : A schema subject is the namespace for a schema in Schema Registry. This unique identifier defines a logical grouping of related schemas. Kafka topics contain event messages serialized and deserialized using the structure and rules defined in a schema subject. This ensures compatibility and supports schema evolution. **Related content** - Confluent Cloud: [Manage Schemas in Confluent Cloud](/cloud/current/sr/schemas-manage.html) - Confluent Platform: [Schema Registry Concepts](/platform/current/schema-registry/index.html) - [Understanding Schema Subjects](https://developer.confluent.io/courses/schema-registry/schema-subjects/) Serdes : Serdes are serializers and deserializers that convert objects and parallel data into a serial byte stream for efficient storage and high-speed data transmission over the wire. Confluent provides Serdes for schemas in Avro, Protobuf, and JSON Schema formats.
**Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) - [Serde](https://serde.rs/) serializer : A serializer is a tool that converts objects and parallel data into a serial byte stream. Serializers work with deserializers (known together as Serdes) to support efficient storage and high-speed data transmission over the wire. Confluent provides serializers for schemas in Avro, Protobuf, and JSON Schema formats. **Related content** - Confluent Cloud: [Formats, Serializers, and Deserializers](/cloud/current/sr/fundamentals/serdes-develop/index.html) - Confluent Platform: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) service account : A service account is a non-person entity used by an application or service to access resources and perform operations. Because a service account is an identity independent of the user who created it, it can be used programmatically to authenticate to resources and perform operations without the need for a user to be signed in. **Related content** - [Service Accounts for Confluent Cloud](/cloud/current/access-management/identity/service-accounts.html) service quota : A service quota is the limit, or maximum value, for a specific Confluent Cloud resource or operation that might vary by the resource scope it applies to. **Related content** - [Service Quotas for Confluent Cloud](/cloud/current/quotas/index.html) single message transform (SMT) : A single message transform (SMT) is a transformation or operation applied in realtime on an individual message that changes the values, keys, or headers of a message before being sent to a sink connector or after being read from a source connector. SMTs are convenient for inserting fields, masking information, event routing, and other minor data adjustments. single sign-on (SSO) : Single sign-on (SSO) is a centralized authentication service that allows users to use a single set of credentials to log in to multiple applications or services. **Related terms**: *authentication*, *group mapping*, *identity provider* **Related content** - [Single Sign-On for Confluent Cloud](/cloud/current/access-management/authenticate/sso/index.html) sink connector : A sink connector is a Kafka Connect connector that publishes (writes) data from a Kafka topic to an external system. source connector : A source connector is a Kafka Connect connector that subscribes (reads) data from a source (external system), extracts the payload and schema of the data, and publishes (writes) the data to Kafka topics. standalone : Standalone refers to a configuration in which a software application, system, or service operates independently on a single instance or device. This mode is commonly used for development, testing, and debugging purposes. For Kafka Connect, a standalone worker is a single process responsible for running all connectors and tasks on a single instance. Standard Kafka cluster : A Confluent Cloud cluster type. Standard Kafka clusters are designed for production-ready features and functionality. static egress IP address : A static egress IP address is an IP address used by a Confluent Cloud managed connector to establish outbound connections to endpoints of external data sources and sinks over the public internet. 
**Related content** - [Use Static IP Addresses on Confluent Cloud for Connectors and Cluster Linking](/cloud/current/networking/static-egress-ip-addresses.html) - [Static Egress IP Addresses for Confluent Cloud Connectors](/cloud/current/connectors/static-egress-ip.html) storage (pre-replication) : In Confluent Cloud, storage is a Kafka cluster billing dimension that defines the number of bytes retained on the cluster, pre-replication. Available in the Metrics API as `retained_bytes` (convert from bytes to TB). The returned value is pre-replication. Standard, Enterprise, Dedicated, and Freight clusters support Infinite Storage. This means there is no maximum size limit for the amount of data that can be stored on the cluster. You can configure policy settings `retention.bytes` and `retention.ms` at the topic level to control exactly how much and how long to retain data in a way that makes sense for your applications and helps control your costs. To reduce storage in Confluent Cloud, compress your messages and reduce retention settings. For compression, use lz4. Avoid gzip because of high overhead on the cluster. Stream Catalog : Stream Catalog is a pillar of Confluent Cloud Stream Governance that provides a centralized inventory of your organization’s data assets that supports data governance and data discovery. With Data Portal in Confluent Cloud Console, users can find event streams across systems, search topics by name or tags, and enrich event data to increase value and usefulness. REST and GraphQL APIs can be used to search schemas, apply tags to records or fields, manage business metadata, and discover relationships across data assets. **Related content** - [Stream Catalog on Confluent Cloud: User Guide to Manage Tags and Metadata](/cloud/current/stream-governance/stream-catalog.html) - [Stream Catalog in Streaming Data Governance (Confluent Developer course)](https://developer.confluent.io/courses/governing-data-streams/stream-catalog/) Stream Governance : Stream Governance is a collection of tools and features that provide data governance for data in motion. These include data quality tools such as Schema Registry, schema ID validation, and schema linking; built-in data catalog capabilities to classify, organize, and find event streams across systems; and stream lineage to visualize complex data relationships and uncover insights with interactive, end-to-end maps of event streams. Taken together, these and other governance tools enable teams to manage the availability, integrity, and security of data used across organizations, and help with standardization, monitoring, collaboration, reporting, and more. **Related terms**: *Stream Catalog*, *Stream Lineage* **Related content** - [Stream Governance on Confluent Cloud](/cloud/current/stream-governance/index.html) stream lineage : Stream lineage is the life cycle, or history, of data, including its origins, transformations, and consumption, as it moves through various stages in data pipelines, applications, and systems. Stream lineage provides a record of data’s journey from its source to its destination, and is used to track data quality, data governance, and data security. 
**Related terms**: *Data Portal*, *Stream Governance* **Related content** - [Stream Lineage on Confluent Cloud](/cloud/current/stream-governance/stream-lineage.html) stream processing : Stream processing is the method of collecting event stream data in real-time as it arrives, transforming the data in real-time using operations (such as filters, joins, and aggregations), and publishing the results to one or more target systems. Stream processing can be used to analyze data continuously, build data pipelines, and process time-sensitive data in real-time. Using the Confluent event streaming platform, event streams can be processed in real-time using Kafka Streams, Kafka Connect, or ksqlDB. Streams API : The Streams API is the Kafka API that allows you to build streaming applications and microservices that transform (for example, filter, group, aggregate, join) incoming event streams in real-time to Kafka topics stored in a Kafka cluster. The Streams API is used by stream processing clients to process data in real-time, analyze data continuously, and build data pipelines. **Related content** - [Introduction to the Kafka Streams API](/platform/current/streams/introduction.html) throttling : Throttling is the process Kafka clusters in Confluent Cloud use to protect themselves from getting into an over-utilized state. Also known as backpressure, throttling in Confluent Cloud occurs when cluster load reaches 80%. At this point, applications may start seeing higher latencies or timeouts as the cluster must begin throttling requests or connection attempts. topic : A topic is a user-defined category or feed name where event messages are stored and published by producers and subscribed to by consumers. Each topic is a log of event messages. Topics are stored in one or more partitions, which distribute topic records across brokers in a Kafka cluster. Each partition is an ordered, immutable sequence of records that is continually appended to the partition log. **Related content** - [Manage Topics in Confluent Cloud](/cloud/current/client-apps/topics/index.html) total client connections : In Confluent Cloud, total client connections are a Kafka cluster billing dimension that defines the number of TCP connections to the cluster you can open at one time. Available in the Metrics API as `active_connection_count`. Filter by principal to understand how many connections each application is creating. How many connections a cluster supports can vary widely based on several factors, including number of producer clients, number of consumer clients, partition keying strategy, produce patterns per client, and consume patterns per client. For Dedicated clusters, Confluent derives a guideline for total client connections from benchmarking that indicates exceeding this number of connections increases produce latency for test clients. However, this does not apply to all workloads. That is why total client connections are a guideline, not a hard limit, for Dedicated Kafka clusters. Monitor the impact on cluster load as connection count increases, as this is the final representation of the impact of a given workload or CKU dimension on the underlying resources of the cluster. Consider the Confluent guideline a per-CKU guideline. The number of connections tends to increase when you add brokers. In other words, if you significantly exceed the per-CKU guideline, cluster expansion doesn’t always give your cluster more connection count headroom.
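The entry above notes that `active_connection_count` can be filtered by principal. As a rough, non-authoritative sketch of what such a Metrics API query can look like (the endpoint, metric name, label names, and the `lkc-XXXXX` cluster ID are illustrative assumptions; confirm them against the Confluent Cloud Metrics API reference):

```bash
# Illustrative sketch: query active_connection_count for one cluster, grouped by principal.
# Endpoint, metric, and label names are assumptions -- verify against the Metrics API reference.
curl -s -u {{ CLOUD_API_KEY }}:{{ CLOUD_API_SECRET }} \
  -H "Content-Type: application/json" \
  https://api.telemetry.confluent.cloud/v2/metrics/cloud/query \
  --data '{
    "aggregations": [{"metric": "io.confluent.kafka.server/active_connection_count"}],
    "filter": {"field": "resource.kafka.id", "op": "EQ", "value": "lkc-XXXXX"},
    "group_by": ["metric.principal_id"],
    "granularity": "PT1H",
    "intervals": ["2024-01-01T00:00:00Z/PT1H"]
  }'
```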
Transport Layer Security (TLS) : Transport Layer Security (TLS) is a cryptographic protocol that provides secure communication over a network. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Transport_Layer_Security). unbounded stream : An unbounded stream is a stream of data that is continuously generated in real-time and has no defined end. Examples of unbounded streams include stock prices, sensor data, and social media feeds. Processing unbounded streams requires a different approach than processing bounded streams. Unbounded streams are processed incrementally as data arrives, while bounded streams are processed as a batch after all data has arrived. Kafka Streams and Flink can be used to process unbounded streams. **Related terms**: *bounded stream*, *stream processing* under replication : Under replication is a situation in which the number of in-sync replicas is below the number of all replicas. Under-replicated partitions can occur when a broker is down or cannot replicate fast enough from the leader (replica fetcher lag). user account : A user account is an account representing the identity of a person who can be authenticated and granted access to Confluent Cloud resources. **Related content** - [User Accounts for Confluent Cloud](/cloud/current/access-management/identity/user-accounts/overview.html) watermark : A watermark in Flink is a marker that keeps track of time as data is processed. A watermark means that all records until the current moment in time have been “seen”. This way, Flink can correctly perform tasks that depend on when things happened, like calculating aggregations over time windows. **Related content** - [Time and Watermarks](/cloud/current/flink/concepts/timely-stream-processing.html)

#### Migrate Schemas

To migrate Schema Registry and associated schemas to Confluent Cloud, follow these steps:

1. Start the origin cluster. If you are running a local cluster (for example, from a [Quick Start for Confluent Platform](../../get-started/platform-quickstart.md#quickstart) download), start only Schema Registry for the purposes of this tutorial using the Confluent CLI [confluent local](https://docs.confluent.io/confluent-cli/current/command-reference/local/index.html) commands.

```bash
confluent local services schema-registry start
```

2. Verify that `schema-registry`, `kafka`, and the cluster metadata service (ZooKeeper or KRaft, depending on your deployment) are running. For example, run `confluent local services status`:

```none
Schema Registry is [UP]
Kafka is [UP]
Zookeeper is [UP]
```

3. Verify that no subjects exist on the destination Schema Registry in Confluent Cloud.

```bash
curl -u {{ SR_API_KEY }}:{{ SR_API_SECRET }} https://{{ DESTINATION_SR_ENDPOINT }}/subjects
```

If no subjects exist, your output will be empty (`[]`), which is what you want. If subjects exist, delete them. For example:

```bash
curl -X DELETE -u {{ SR_API_KEY }}:{{ SR_API_SECRET }} https://{{ DESTINATION_SR_ENDPOINT }}/subjects/my-existing-subject
```

4. Set the destination Schema Registry to IMPORT mode. For example:

```bash
curl -u {{ SR_API_KEY }}:{{ SR_API_SECRET }} -X PUT -H "Content-Type: application/json" "https://{{ DESTINATION_SR_ENDPOINT }}/mode" --data '{"mode": "IMPORT"}'
```

5. Configure a Replicator worker to specify the addresses of brokers in the destination cluster, as described in [Configure and run Replicator](../../multi-dc-deployments/replicator/replicator-quickstart.md#config-and-run-replicator). The worker configuration file is in `CONFLUENT_HOME/etc/kafka/connect-standalone.properties`.

```properties
# Connect Standalone Worker configuration
bootstrap.servers={{ DESTINATION_BROKER_ENDPOINT }}:9092
```
6. Configure [Replicator](../../multi-dc-deployments/replicator/replicator-quickstart.md#replicator-quickstart) with Schema Registry and destination cluster information.

- For a standalone Connect instance, configure the following properties in `CONFLUENT_HOME/etc/kafka-connect-replicator/quickstart-replicator.properties`:

```properties
# basic connector configuration
name=replicator-source
connector.class=io.confluent.connect.replicator.ReplicatorSourceConnector
key.converter=io.confluent.connect.replicator.util.ByteArrayConverter
value.converter=io.confluent.connect.replicator.util.ByteArrayConverter
header.converter=io.confluent.connect.replicator.util.ByteArrayConverter
tasks.max=4

# source cluster connection info
src.kafka.bootstrap.servers=localhost:9092

# destination cluster connection info
dest.kafka.ssl.endpoint.identification.algorithm=https
dest.kafka.sasl.mechanism=PLAIN
dest.kafka.request.timeout.ms=20000
dest.kafka.bootstrap.servers={{ DESTINATION_BROKER_ENDPOINT }}:9092
retry.backoff.ms=500
dest.kafka.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="{{ CLUSTER_API_KEY }}" password="{{ CLUSTER_API_SECRET }}";
dest.kafka.security.protocol=SASL_SSL

# Schema Registry migration topics to replicate from source to destination
# topic.whitelist indicates which topics are of interest to replicator
topic.whitelist=_schemas
# schema.registry.topic indicates which of the topics in the whitelist contains schemas
schema.registry.topic=_schemas

# Connection settings for destination Confluent Cloud Schema Registry
schema.registry.url=https://{{ DESTINATION_SR_ENDPOINT }}
schema.registry.client.basic.auth.credentials.source=USER_INFO
schema.registry.client.basic.auth.user.info={{ SR_API_KEY }}:{{ SR_API_SECRET }}
```

- If your clusters have TLS/SSL enabled, you must set the TLS/SSL configurations as appropriate for Schema Registry clients.

```properties
# TLS/SSL configurations for clients to Schema Registry
schema.registry.client.schema.registry.ssl.truststore.location
schema.registry.client.schema.registry.ssl.truststore.type
schema.registry.client.schema.registry.ssl.truststore.password
schema.registry.client.schema.registry.ssl.keystore.location
schema.registry.client.schema.registry.ssl.keystore.type
schema.registry.client.schema.registry.ssl.keystore.password
schema.registry.client.schema.registry.ssl.key.password
```

7. In `quickstart-replicator.properties`, the replication factor is set to `1` for demo purposes on a development cluster with one broker. For this schema migration tutorial, and in production, change this to at least `3`:

```none
confluent.topic.replication.factor=3
```

#### SEE ALSO

For an example of a JSON configuration for Replicator in distributed mode, see [submit_replicator_schema_migration_config.sh](https://github.com/confluentinc/examples/tree/latest/ccloud/connectors/submit_replicator_schema_migration_config.sh) on GitHub in the [examples repository](https://github.com/confluentinc/examples).

8. Start Replicator so that it can perform the schema migration. For example:

```bash
connect-standalone ${CONFLUENT_HOME}/etc/kafka/connect-standalone.properties \
${CONFLUENT_HOME}/etc/kafka-connect-replicator/quickstart-replicator.properties
```

The method or commands you use to start Replicator depend on your application setup, and may differ from this example.
For more information, see [Tutorial: Configure and Run Replicator for Confluent Platform as an Executable or Connector](../../multi-dc-deployments/replicator/replicator-run.md#replicator-run) and [Configure and run Replicator](../../multi-dc-deployments/replicator/replicator-quickstart.md#config-and-run-replicator).

9. Stop all producers that are producing to Kafka.

10. Wait until the replication lag is 0. For more information, see [Monitoring Replicator lag (Legacy versions only)](../../multi-dc-deployments/replicator/replicator-monitoring.md#monitor-replicator-lag).

11. Stop Replicator.

12. Enable mode changes in the self-managed source Schema Registry properties file by adding the following to the configuration and restarting.

```none
mode.mutability=true
```

#### IMPORTANT

Modes are only supported starting with version 5.2 of Schema Registry. This step and the one following (set Schema Registry to READONLY) are precautionary and not strictly necessary. If using version 5.1 of Schema Registry or earlier, you can skip these two steps if you make certain to stop all producers so that no further schemas are registered in the source Schema Registry.

13. Set the source Schema Registry to READONLY mode.

```bash
curl -u {{ SOURCE_SR_USERNAME }}:{{ SOURCE_SR_PASSWORD }} -X PUT -H "Content-Type: application/json" "https://{{ SOURCE_SR_ENDPOINT }}/mode" --data '{"mode": "READONLY"}'
```

14. Set the destination Schema Registry to READWRITE mode.

```bash
curl -u {{ SR_API_KEY }}:{{ SR_API_SECRET }} -X PUT -H "Content-Type: application/json" "https://{{ DESTINATION_SR_ENDPOINT }}/mode" --data '{"mode": "READWRITE"}'
```

15. Stop all consumers.

16. Configure all consumers to point to the destination Schema Registry in the cloud and restart them. For example, if you are configuring Schema Registry in a Java client, change the Schema Registry URL from source to destination, either in the code or in a properties file that specifies the Schema Registry URL, the authentication type (USER_INFO), and credentials. For more examples, see [Java Consumers](../schema_registry_onprem_tutorial.md#sr-tutorial-java-consumers).

17. Configure all producers to point to the destination Schema Registry in the cloud and restart them. For more examples, see [Java Producers](../schema_registry_onprem_tutorial.md#sr-tutorial-java-producers).

18. (Optional) Stop the source Schema Registry.
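After step 18, it can be helpful to confirm that the migration succeeded before reconnecting clients. The following is a minimal, optional sketch that reuses the placeholder credentials and endpoint from the steps above; `my-subject-value` is a hypothetical subject name.

```bash
# List the subjects now registered on the destination Schema Registry.
curl -u {{ SR_API_KEY }}:{{ SR_API_SECRET }} https://{{ DESTINATION_SR_ENDPOINT }}/subjects

# Confirm the destination Schema Registry mode is back to READWRITE.
curl -u {{ SR_API_KEY }}:{{ SR_API_SECRET }} https://{{ DESTINATION_SR_ENDPOINT }}/mode

# Spot-check one migrated subject by fetching its latest schema version.
curl -u {{ SR_API_KEY }}:{{ SR_API_SECRET }} https://{{ DESTINATION_SR_ENDPOINT }}/subjects/my-subject-value/versions/latest
```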
# Kafka Streams for Confluent Platform * [Overview](overview.md) * [Quick Start](quickstart.md) * [Streams API](introduction.md) * [Tutorial: Streaming Application Development Basics on Confluent Platform](microservices-orders.md) * [Connect Streams to Confluent Cloud](https://docs.confluent.io/cloud/current/cp-component/streams-cloud-config.html) * [Concepts](concepts.md) * [Architecture](architecture.md) * [Examples](code-examples.md) * [Developer Guide](developer-guide/index.md) * [Overview](developer-guide/overview.md) * [Write a Streams Application](developer-guide/write-streams.md) * [Configure](developer-guide/config-streams.md) * [Run a Streams Application](developer-guide/running-app.md) * [Test](developer-guide/test-streams.md) * [Domain Specific Language](developer-guide/dsl-api.md) * [Name Domain Specific Language Topologies](developer-guide/dsl-topology-naming.md) * [Optimize Topologies](developer-guide/optimizing-streams.md) * [Processor API](developer-guide/processor-api.md) * [Data Types and Serialization](developer-guide/datatypes.md) * [Interactive Queries](developer-guide/interactive-queries.md) * [Memory](developer-guide/memory-mgmt.md) * [Manage Application Topics](developer-guide/manage-topics.md) * [Security](developer-guide/security.md) * [Reset Streams Applications](developer-guide/app-reset-tool.md) * [Build Pipeline with Connect and Streams](connect-streams-pipeline.md) * [Operations](operations.md) * [Metrics](kafka-streams-metrics.md) * [Monitor Kafka Streams Applications in Confluent Platform](monitoring.md) * [Integration with Confluent Control Center](monitoring.md#integration-with-c3) * [Plan and Size](sizing.md) * [Upgrade](upgrade-guide.md) * [Frequently Asked Questions](faq.md) * [Javadocs](javadocs.md) * [ksqlDB](../ksqldb/index.md) * [Overview](../ksqldb/overview.md) * [Quick Start](../ksqldb/quickstart.md) * [Install](../ksqldb/installing.md) * [Operate](../ksqldb/operations.md) * [Upgrade](../ksqldb/upgrading.md) * [Concepts](../ksqldb/concepts/index.md) * [Overview](../ksqldb/concepts/overview.md) * [Kafka Primer](../ksqldb/concepts/apache-kafka-primer.md) * [Connectors](../ksqldb/concepts/connectors.md) * [Events](../ksqldb/concepts/events.md) * [Functions](../ksqldb/concepts/functions.md) * [Lambda Functions](../ksqldb/concepts/lambda-functions.md) * [Materialized Views](../ksqldb/concepts/materialized-views.md) * [Queries](../ksqldb/concepts/queries.md) * [Streams](../ksqldb/concepts/streams.md) * [Stream Processing](../ksqldb/concepts/stream-processing.md) * [Tables](../ksqldb/concepts/tables.md) * [Time and Windows in ksqlDB Queries](../ksqldb/concepts/time-and-windows-in-ksqldb-queries.md) * [How-to Guides](../ksqldb/how-to-guides/index.md) * [Overview](../ksqldb/how-to-guides/overview.md) * [Control the Case of Identifiers](../ksqldb/how-to-guides/control-the-case-of-identifiers.md) * [Convert a Changelog to a Table](../ksqldb/how-to-guides/convert-changelog-to-table.md) * [Create a User-defined Function](../ksqldb/how-to-guides/create-a-user-defined-function.md) * [Manage Connectors](../ksqldb/how-to-guides/use-connector-management.md) * [Query Structured Data](../ksqldb/how-to-guides/query-structured-data.md) * [Test an Application](../ksqldb/how-to-guides/test-an-app.md) * [Update a Running Persistent Query](../ksqldb/how-to-guides/update-a-running-persistent-query.md) * [Use Variables in SQL Statements](../ksqldb/how-to-guides/substitute-variables.md) * [Use a Custom Timestamp 
Column](../ksqldb/how-to-guides/use-a-custom-timestamp-column.md) * [Use Lambda Functions](../ksqldb/how-to-guides/use-lambda-functions.md) * [Develop Applications](../ksqldb/developer-guide/index.md) * [Overview](../ksqldb/developer-guide/overview.md) * [Joins](../ksqldb/developer-guide/joins/index.md) * [Reference](../ksqldb/developer-guide/ksqldb-reference/index.md) * [REST API](../ksqldb/developer-guide/ksqldb-rest-api/index.md) * [Java Client](../ksqldb/developer-guide/java-client/java-client.md) * [Operate and Deploy](../ksqldb/operate-and-deploy/index.md) * [Overview](../ksqldb/operate-and-deploy/overview.md) * [Installation](../ksqldb/operate-and-deploy/installation/index.md) * [ksqlDB Architecture](../ksqldb/operate-and-deploy/how-it-works.md) * [Capacity Planning](../ksqldb/operate-and-deploy/capacity-planning.md) * [Changelog](../ksqldb/operate-and-deploy/changelog.md) * [Processing Guarantees](../ksqldb/operate-and-deploy/processing-guarantees.md) * [High Availability](../ksqldb/operate-and-deploy/high-availability.md) * [High Availability Pull Queries](../ksqldb/operate-and-deploy/high-availability-pull-queries.md) * [KSQL versus ksqlDB](../ksqldb/operate-and-deploy/ksql-vs-ksqldb.md) * [Logging](../ksqldb/operate-and-deploy/logging.md) * [Manage Metadata Schemas](../ksqldb/operate-and-deploy/migrations-tool.md) * [Monitoring](../ksqldb/operate-and-deploy/monitoring.md) * [Performance Guidelines](../ksqldb/operate-and-deploy/performance-guidelines.md) * [Schema Inference With ID](../ksqldb/operate-and-deploy/schema-inference-with-id.md) * [Schema Inference](../ksqldb/operate-and-deploy/schema-registry-integration.md) * [Reference](../ksqldb/reference/index.md) * [Overview](../ksqldb/reference/overview.md) * [SQL](../ksqldb/reference/sql/index.md) * [Metrics](../ksqldb/reference/metrics.md) * [Migrations Tool](../ksqldb/reference/migrations-tool-configuration.md) * [Processing Log](../ksqldb/reference/processing-log.md) * [Serialization Formats](../ksqldb/reference/serialization.md) * [Server Configuration Parameters](../ksqldb/reference/server-configuration.md) * [User-defined functions (UDFs)](../ksqldb/reference/user-defined-functions.md) * [Run ksqlDB in Confluent Cloud](https://docs.confluent.io/cloud/current/ksqldb/ksqldb-quick-start.html) * [Connect Local ksqlDB to Confluent Cloud](https://docs.confluent.io/cloud/current/cp-component/ksql-cloud-config.html) * [Connect ksqlDB to Control Center](../ksqldb/integrate-ksql-with-confluent-control-center.md) * [Secure ksqlDB with RBAC](../ksqldb/ksqldb-redirect.md) * [Frequently Asked Questions](../ksqldb/faq.md) * [Troubleshoot](../ksqldb/troubleshoot-ksql.md) * [Tutorials and Examples](../ksqldb/tutorials/index.md) ### Environment Setup 1. Use the [Quick Start for Confluent Platform](../get-started/platform-quickstart.md#quickstart) to bring up a single-node Confluent Platform development environment. With a single-line [confluent local](https://docs.confluent.io/confluent-cli/current/command-reference/local/index.html) command, you can have a basic Kafka cluster with Schema Registry, Control Center, and other services running on your local machine. ```bash confluent local start ``` Your output should resemble: ```bash Starting zookeeper zookeeper is [UP] Starting kafka kafka is [UP] Starting schema-registry schema-registry is [UP] Starting kafka-rest kafka-rest is [UP] Starting connect connect is [UP] Starting ksql-server ksql-server is [UP] Starting control-center control-center is [UP] ``` 2. 
Clone the Confluent [examples](https://github.com/confluentinc/examples) repo from GitHub and work in the `clients/avro/` subdirectory, which provides the sample code you will compile and run in this tutorial. ```bash git clone https://github.com/confluentinc/examples.git ``` ```bash cd examples/clients/avro ``` ```bash git checkout 8.1.0-post ``` 3. Create a local configuration file with all the Kafka and Schema Registry connection information that is running on your local machine, and save it to `$HOME/.confluent/java.config`, where [$HOME](https://en.wikipedia.org/wiki/Environment_variable#Syntax) represents your user home directory. It should resemble below: ```none # Required connection configs for Kafka producer, consumer, and admin bootstrap.servers={{ BROKER_ENDPOINT }} security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='{{ CLUSTER_API_KEY }}' password='{{ CLUSTER_API_SECRET }}'; sasl.mechanism=PLAIN # Required for correctness in Apache Kafka clients prior to 2.6 client.dns.lookup=use_all_dns_ips # Best practice for higher availability in Apache Kafka clients prior to 3.0 session.timeout.ms=45000 # Best practice for Kafka producer to prevent data loss acks=all # Required connection configs for Confluent Cloud Schema Registry schema.registry.url=https://{{ SR_ENDPOINT }} basic.auth.credentials.source=USER_INFO basic.auth.user.info={{ SR_API_KEY }}:{{ SR_API_SECRET }} ``` # CONFLUENT FOR KUBERNETES * [Overview](overview.md) * [Quick Start](co-quickstart.md) * [Plan for Deployment](co-plan.md) * [Prepare Kubernetes Cluster](co-prepare.md) * [Deploy CFK](co-deploy-cfk.md) * [Configure Confluent Platform](co-configure.md) * [Overview](co-configure-overview.md) * [Configure Storage](co-storage.md) * [Manage License](co-license.md) * [Use Custom Docker Registry](co-custom-registry.md) * [Configure CPU and Memory](co-resources.md) * [Configure Networking](co-networking.md) * [Overview](co-networking-overview.md) * [Configure Load Balancers](co-loadbalancers.md) * [Configure Node Ports](co-nodeports.md) * [Configure Port-Based Static Access](co-staticportbased.md) * [Configure Host-Based Static Access](co-statichostbased.md) * [Configure Routes](co-routes.md) * [Configure Security](co-security.md) * [Overview](co-security-overview.md) * [Authentication](co-authenticate.md) * [Authorization](co-authorize.md) * [Network Encryption](co-network-encryption.md) * [Security Compliance](co-security-compliance.md) * [Credentials and Certificates](co-credentials.md) * [Configure Pod Scheduling](co-schedule-workloads.md) * [Configure Connect](co-configure-connect.md) * [Configure Replicator](co-configure-replicator.md) * [Configure Rack Awareness](co-configure-rack-awareness.md) * [Configure REST Proxy](co-configure-rest-proxy.md) * [Configure KRaft](co-configure-kraft.md) * [Configure Unified Stream Manager](co-configure-usm.md) * [Advanced Configuration](co-configure-misc.md) * [Deploy Confluent Platform](co-deploy-cp.md) * [Manage Confluent Platform](co-manage-cp.md) * [Overview](co-manage-overview.md) * [Manage Flink](co-manage-flink.md) * [Manage Kafka Admin REST Class](co-manage-rest-api.md) * [Manage Kafka Topics](co-manage-topics.md) * [Manage Schemas](co-manage-schemas-index.md) * [Manage Schemas](co-manage-schemas.md) * [Link Schemas](co-link-schemas.md) * [Schema Registry Switchover using Unified Stream Manager](co-schema-registry-switchover.md) * [Manage Connectors](co-manage-connectors.md) * [Scale 
Clusters](co-scale-cluster.md) * [Scale Storage](co-scale-storage.md) * [Link Kafka Clusters](co-link-clusters.md) * [Manage Security](co-manage-security.md) * [Overview](co-manage-security-overview.md) * [Manage Authentication](co-manage-authentication.md) * [Manage RBAC](co-manage-rbac.md) * [Manage Certificates](co-manage-certificates.md) * [Manage Password Encoder Secret](co-password-encoder-secret.md) * [Restart Confluent Components](co-roll-cluster.md) * [Delete Confluent Deployment](co-delete-deployment.md) * [Manage Confluent Cloud](co-manage-ccloud.md) * [Monitor Confluent Platform](co-monitor-cp.md) * [Upgrade](co-upgrade.md) * [Upgrade Overview](co-upgrade-overview.md) * [Upgrade Confluent for Kubernetes](co-upgrade-cfk.md) * [Upgrade Confluent Platform](co-upgrade-cp.md) * [Migrate Zookeeper to KRaft](co-migrate-kraft.md) * [Migrate On-Premise Deployment to Confluent for Kubernetes](co-migrate-onprem.md) * [Migrate from Operator to Confluent for Kubernetes](co-migration.md) * [Deployment Scenarios](co-scenarios.md) * [Overview](co-scenarios-overview.md) * [Multi-AZ Deployment](co-multi-az.md) * [Multi-Region Deployment](co-multi-region.md) * [Hybrid Deployment with Confluent Cloud](co-hybrid.md) * [Troubleshoot](co-troubleshooting.md) * [Manage Confluent Gateway](gateway/co-gateway-index.md) * [Overview](gateway/co-gateway-overview.md) * [Deploy Confluent Gateway](gateway/co-gateway-deploy.md) * [Configure Security for Confluent Gateway](gateway/co-gateway-security.md) * [API Reference](co-api.md) * [Confluent Plugin Reference](co-plugin-cli-index.md) * [Release Notes](release-notes.md) * [Glossary](_glossary.md) ### Development and connectivity features To supplement Kafka’s Java APIs, and to help you connect all of your systems to Kafka, Confluent Platform provides the following features: - Confluent Connectors, which leverage the Kafka Connect API to connect Kafka to other systems such as databases, key-value stores, search indexes, and file systems. Confluent Hub has downloadable connectors for the most popular data sources and sinks. These include fully tested and supported versions of these connectors with Confluent Platform. See the following documentation for more information: - [How to Use Kafka Connect - Get Started](../connect/userguide.md#connect-userguide) - [Supported Self-Managed Connectors](../connect/supported.md#connect-bundled-connectors) - [Preview Self-Managed Connectors](../connect/preview.md#connect-preview-connectors) Confluent provides both commercial and community licensed connectors. For details, and to download connectors , see [Confluent Hub](https://www.confluent.io/hub/). - [Non-java clients](../clients/overview.md#kafka-clients) such as a [C/C++](/kafka-clients/librdkafka/current/overview.html), [Python](/kafka-clients/python/current/overview.html), [Go](/kafka-clients/go/current/overview.html), and [.NET client](/kafka-clients/dotnet/current/overview.html) libraries in addition to the Java client. These clients are full-featured and performant. For more information, see the [Build Streaming Applications on Confluent Platform](../clients/overview.md#kafka-clients). - A [REST Proxy](../kafka-rest/index.md#kafkarest-intro), which leverages the Admin API and makes it easy to work with Kafka from any language by providing a RESTful HTTP service for interacting with Kafka clusters. 
The REST Proxy supports all the core functionality: sending messages to Kafka, reading messages, both individually and as part of a consumer group, and inspecting cluster metadata, such as the list of topics and their settings. You get the full benefits of the high-quality, officially maintained Java clients from any language. The REST Proxy also integrates with Schema Registry. Because it automatically translates JSON data to and from Avro, you can get all the benefits of centralized schema management from any language using only HTTP and JSON. (A minimal usage sketch follows this feature list.)
- All of the Kafka command-line tools and additional tools, including the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/overview.html). You can find a list of all of these tools in [CLI Tools Bundled With Confluent Platform](../tools/cli-reference.md#cp-all-cli).
- [Schema Registry](/platform/current/schema-registry/index.html), which provides a centralized repository for managing and validating schemas for topic message data, and for serialization and deserialization of data over a network. With a messaging service like Kafka, services that interact with each other must agree on a common format, called a schema, for messages. Schema Registry helps enable safe, zero-downtime evolution of schemas by centralizing schema management. It provides a RESTful interface for storing and retrieving Avro®, JSON Schema, and Protobuf schemas. Schema Registry tracks all versions of schemas and enables the evolution of schemas according to user-defined compatibility settings. Schema Registry also includes plugins for Kafka clients that handle schema storage and retrieval for Kafka messages that are sent in the Avro format. For more information, see the [Schema Registry Documentation](/platform/current/schema-registry/index.html). For a hands-on introduction to working with schemas, see the [On-Premises Schema Registry Tutorial](/platform/current/schema-registry/schema_registry_onprem_tutorial.html). For a deep dive into supported serialization and deserialization formats, see [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html).
- [ksqlDB](../ksqldb/overview.md#ksql-home), a streaming SQL engine for Kafka. It provides an interactive SQL interface for stream processing on Kafka, without the need to write code in a programming language such as Java or Python. ksqlDB is scalable, elastic, fault-tolerant, and real-time. It supports a wide range of streaming operations, including data filtering, transformations, aggregations, joins, windowing, and sessionization. For more information, see the [ksqlDB Documentation](../ksqldb/overview.md#ksql-home), or the [ksqlDB Quick Start](../ksqldb/quickstart.md#ksqldb-quick-start).
- An [MQTT Proxy](../kafka-mqtt/intro.md#kafka-mqtt-intro), which provides a way to publish data directly to Kafka from MQTT devices and gateways without the need for an MQTT broker in the middle. For more information, see [MQTT Proxy](../kafka-mqtt/intro.md#kafka-mqtt-intro).
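To make the REST Proxy item above concrete, here is a minimal sketch of producing and inspecting data over HTTP. It assumes a REST Proxy instance listening on `localhost:8082` and an existing topic named `test-topic`; both names are illustrative, and the API versions available depend on your deployment.

```bash
# Produce a JSON-encoded record to test-topic through the REST Proxy v2 API.
curl -X POST -H "Content-Type: application/vnd.kafka.json.v2+json" \
  --data '{"records":[{"key":"order-1","value":{"item":"book","qty":2}}]}' \
  http://localhost:8082/topics/test-topic

# List the topics visible to the REST Proxy.
curl http://localhost:8082/topics

# Inspect partition metadata for the topic.
curl http://localhost:8082/topics/test-topic/partitions
```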
# ksqlDB for Confluent Platform * [Overview](overview.md) * [Quick Start](quickstart.md) * [Install](installing.md) * [Operate](operations.md) * [Upgrade](upgrading.md) * [Concepts](concepts/index.md) * [Overview](concepts/overview.md) * [Kafka Primer](concepts/apache-kafka-primer.md) * [Connectors](concepts/connectors.md) * [Events](concepts/events.md) * [Functions](concepts/functions.md) * [Lambda Functions](concepts/lambda-functions.md) * [Materialized Views](concepts/materialized-views.md) * [Queries](concepts/queries.md) * [Streams](concepts/streams.md) * [Stream Processing](concepts/stream-processing.md) * [Tables](concepts/tables.md) * [Time and Windows in ksqlDB Queries](concepts/time-and-windows-in-ksqldb-queries.md) * [How-to Guides](how-to-guides/index.md) * [Overview](how-to-guides/overview.md) * [Control the Case of Identifiers](how-to-guides/control-the-case-of-identifiers.md) * [Convert a Changelog to a Table](how-to-guides/convert-changelog-to-table.md) * [Create a User-defined Function](how-to-guides/create-a-user-defined-function.md) * [Manage Connectors](how-to-guides/use-connector-management.md) * [Query Structured Data](how-to-guides/query-structured-data.md) * [Test an Application](how-to-guides/test-an-app.md) * [Update a Running Persistent Query](how-to-guides/update-a-running-persistent-query.md) * [Use Variables in SQL Statements](how-to-guides/substitute-variables.md) * [Use a Custom Timestamp Column](how-to-guides/use-a-custom-timestamp-column.md) * [Use Lambda Functions](how-to-guides/use-lambda-functions.md) * [Develop Applications](developer-guide/index.md) * [Overview](developer-guide/overview.md) * [Joins](developer-guide/joins/index.md) * [Reference](developer-guide/ksqldb-reference/index.md) * [REST API](developer-guide/ksqldb-rest-api/index.md) * [Java Client](developer-guide/java-client/java-client.md) * [Operate and Deploy](operate-and-deploy/index.md) * [Overview](operate-and-deploy/overview.md) * [Installation](operate-and-deploy/installation/index.md) * [ksqlDB Architecture](operate-and-deploy/how-it-works.md) * [Capacity Planning](operate-and-deploy/capacity-planning.md) * [Changelog](operate-and-deploy/changelog.md) * [Processing Guarantees](operate-and-deploy/processing-guarantees.md) * [High Availability](operate-and-deploy/high-availability.md) * [High Availability Pull Queries](operate-and-deploy/high-availability-pull-queries.md) * [KSQL versus ksqlDB](operate-and-deploy/ksql-vs-ksqldb.md) * [Logging](operate-and-deploy/logging.md) * [Manage Metadata Schemas](operate-and-deploy/migrations-tool.md) * [Monitoring](operate-and-deploy/monitoring.md) * [Performance Guidelines](operate-and-deploy/performance-guidelines.md) * [Schema Inference With ID](operate-and-deploy/schema-inference-with-id.md) * [Schema Inference](operate-and-deploy/schema-registry-integration.md) * [Reference](reference/index.md) * [Overview](reference/overview.md) * [SQL](reference/sql/index.md) * [Metrics](reference/metrics.md) * [Migrations Tool](reference/migrations-tool-configuration.md) * [Processing Log](reference/processing-log.md) * [Serialization Formats](reference/serialization.md) * [Server Configuration Parameters](reference/server-configuration.md) * [User-defined functions (UDFs)](reference/user-defined-functions.md) * [Run ksqlDB in Confluent Cloud](https://docs.confluent.io/cloud/current/ksqldb/ksqldb-quick-start.html) * [Connect Local ksqlDB to Confluent Cloud](https://docs.confluent.io/cloud/current/cp-component/ksql-cloud-config.html) * [Connect ksqlDB to 
Control Center](integrate-ksql-with-confluent-control-center.md) * [Secure ksqlDB with RBAC](ksqldb-redirect.md) * [Frequently Asked Questions](faq.md) * [Troubleshoot](troubleshoot-ksql.md) * [Tutorials and Examples](tutorials/index.md) # Confluent Platform Documentation

An enterprise-grade distribution of Apache Kafka® that is available on-premises as self-managed software, complete with enterprise-grade security, stream processing, and governance tooling.

Quick Start

Products:

- Schema Registry: Schema Registry provides a serving layer for your metadata. It provides a RESTful interface for storing and retrieving your Avro®, JSON Schema, and Protobuf schemas.
- Kafka Clients: Clients make it fast and easy to produce and consume messages through Apache Kafka. Official Confluent clients are available for Java, along with librdkafka and derived clients.
- Kafka Connect: Use connectors to stream data between Apache Kafka and other systems that you want to pull data from or push data to.
- Unified Stream Manager: Unified Stream Manager provides a single, centralized interface to manage and monitor both self-managed Confluent Platform clusters and fully-managed Confluent Cloud clusters.
- Confluent Platform for Apache Flink: Use Confluent Platform for Apache Flink® to run complex, stateful, and low-latency streaming applications.
- Confluent CLI: Command line interface for administering your streaming service, including Apache Kafka topics, clusters, schemas, Connectors, security, billing, and more.
- Ansible Playbooks: Automate configuration and deployment of Confluent Platform on multiple hosts.
- Confluent for Kubernetes: Deploy and manage Confluent Platform as a cloud-native system on Kubernetes.
- Kafka Streams: Build applications and microservices using Kafka Streams.
- Kafka APIs: Apache Kafka provides APIs for Producer, Consumer, Streams, Connect, and Admin.
- Security: Security protects your mission-critical services and data, end-to-end across all Confluent Platform components.

Get started:

- Kafka Configs: The Kafka configuration reference provides broker, topic, producer, consumer, Connect, Streams, and AdminClient configuration properties.
- Kafka Connect: Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other data systems.
- Authorization using ACLs: Apache Kafka ships with a pluggable, out-of-the-box Authorizer implementation that uses Apache ZooKeeper to store all the ACLs.

Learning resources:

- Apache Kafka 101: Learn the fundamentals of Apache Kafka with this video course.
- Introduction to Kafka Connect: Learn the fundamentals of Kafka Connect with this video course.
- Kafka Streams 101: Learn the fundamentals of Kafka Streams with this video course.

## Avro and Confluent Cloud Schema Registry

This example is similar to the previous example, except the value is formatted as Avro and integrates with the Confluent Cloud Schema Registry. Before using Confluent Cloud Schema Registry, check its [availability and limits](https://docs.confluent.io/cloud/current/overview.html).

1. As described in the [Quick Start for Schema Management on Confluent Cloud](https://docs.confluent.io/cloud/current/get-started/schema-registry.html) in the Confluent Cloud Console, enable Confluent Cloud Schema Registry and create an API key and secret to connect to it.
2. Verify that your VPC can connect to the Confluent Cloud Schema Registry public internet endpoint.
3. Update your local configuration file (for example, at `$HOME/.confluent/java.config`) with parameters to connect to Schema Registry.
- Template configuration file for Confluent Cloud ```none # Required connection configs for Kafka producer, consumer, and admin bootstrap.servers={{ BROKER_ENDPOINT }} security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='{{ CLUSTER_API_KEY }}' password='{{ CLUSTER_API_SECRET }}'; sasl.mechanism=PLAIN # Required for correctness in Apache Kafka clients prior to 2.6 client.dns.lookup=use_all_dns_ips # Best practice for higher availability in Apache Kafka clients prior to 3.0 session.timeout.ms=45000 # Best practice for Kafka producer to prevent data loss acks=all # Required connection configs for Confluent Cloud Schema Registry schema.registry.url=https://{{ SR_ENDPOINT }} basic.auth.credentials.source=USER_INFO basic.auth.user.info={{ SR_API_KEY }}:{{ SR_API_SECRET }} ``` - Template configuration file for local host ```none # Kafka bootstrap.servers=localhost:9092 # Confluent Schema Registry schema.registry.url=http://localhost:8081 ``` 4. Verify your Confluent Cloud Schema Registry credentials by listing the Schema Registry subjects. In the following example, substitute your values for `{{ SR_API_KEY }}`, `{{ SR_API_SECRET }}`, and `{{ SR_ENDPOINT }}`. ```text curl -u {{ SR_API_KEY }}:{{ SR_API_SECRET }} https://{{ SR_ENDPOINT }}/subjects ``` # Connect Local ksqlDB to Confluent Cloud You can connect ksqlDB to your Apache Kafka® cluster in Confluent Cloud. The ksqlDB servers must be configured to use Confluent Cloud. The ksqlDB CLI does not require configuration. **Prerequisites** - [Confluent Platform](/platform/current/installation/index.html) - [Confluent Cloud CLI](https://docs.confluent.io/confluent-cli/current/overview.html) 1. Use the Confluent CLI to log in to your Confluent Cloud cluster, and run the `confluent kafka cluster list` command to get the Kafka cluster ID. ```bash confluent kafka cluster list ``` Your output should resemble: ```none Id | Name | Type | Cloud | Region | Availability | Status +-------------+-------------------+--------------+----------+----------+--------------+--------+ lkc-a123b | ksqldb-quickstart | BASIC_LEGACY | gcp | us-west2 | multi-zone | UP ``` 2. Run the `confluent kafka cluster describe` command to get the endpoint for your Confluent Cloud cluster. ```bash confluent kafka cluster describe lkc-a123b ``` Your output should resemble: ```text +--------------+--------------------------------------------------------+ | Id | lkc-a123b | | Name | ksqldb-quickstart | | Type | BASIC | | Ingress | 100 | | Egress | 100 | | Storage | 5000 | | Cloud | azure | | Availability | single-zone | | Region | us-west2 | | Status | UP | | Endpoint | SASL_SSL://pkc-4s987.us-west2.gcp.confluent.cloud:9092 | | ApiEndpoint | https://pkac-42kz6.us-west2.gcp.confluent.cloud | +--------------+--------------------------------------------------------+ ``` Save the `Endpoint` value, which you’ll use in a later step. 3. Create a service account named `my-ksqldb-app`. You must include a description. ```bash confluent iam service-account create my-ksqldb-app --description "My ksqlDB API and secrets service account." ``` Your output should resemble: ```text +-------------+--------------------------------+ | Id | 123456 | | Resource ID | sa-efg123 | | Name | my-ksqldb-app | | Description | My ksqlDB API and secrets | | | service account. | +-------------+--------------------------------+ ``` Save the service account ID, which you’ll use in later steps. 4. Create an API key and secret for service account `123456`. 
Be sure to replace the service account ID and Kafka cluster ID values shown here with your own: ```bash confluent api-key create --service-account 123456 --resource lkc-a123b ``` Your output should resemble: ```text It may take a couple of minutes for the API key to be ready. Save the API key and secret. The secret is not retrievable later. +---------+------------------------------------------------------------------+ | API key | ABCXQHYDZXMMUDEF | | Secret | aBCde3s54+4Xv36YKPLDKy2aklGr6x/ShUrEX5D1Te4AzRlphFlr6eghmPX81HTF | +---------+------------------------------------------------------------------+ ``` #### IMPORTANT **Save the API key and secret.** You require this information to configure your client applications. Be aware that this is the *only* time that you can access and view the API key and secret. 5. Customize your `/etc/ksqldb/ksql-server.properties` properties file. The following example shows the minimum configuration required to use ksqlDB with Confluent Cloud. You should also review the [Recommended ksqlDB production settings](/platform/current/ksqldb/operate-and-deploy/installation/server-config.html). Replace `{{ CLUSTER_API_KEY }}` and `{{ CLUSTER_API_SECRET }}` with the API key and secret that you generated previously. ```properties # For bootstrap.servers, assign the Endpoint value from the "confluent kafka cluster describe" command. # eg. pkc-4s087.us-west2.gcp.confluent.cloud:9092 bootstrap.servers={{ BROKER_ENDPOINT }} ksql.internal.topic.replicas=3 ksql.streams.replication.factor=3 ksql.logging.processing.topic.replication.factor=3 listeners=http://0.0.0.0:8088 security.protocol=SASL_SSL sasl.mechanism=PLAIN # Replace {{ CLUSTER_API_KEY }} and {{ CLUSTER_API_SECRET }} with your API key and secret. sasl.jaas.config=\ org.apache.kafka.common.security.plain.PlainLoginModule required \ username="{{ CLUSTER_API_KEY }}" \ password="{{ CLUSTER_API_SECRET }}"; ``` 6. (Optional) Add configs for Confluent Cloud Schema Registry per the example in [ksql-server-ccloud.delta](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs/ksql-server-ccloud.delta) on GitHub at [ccloud/examples/template_delta_configs](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs). ```properties # Confluent Schema Registry configuration for ksqlDB Server ksql.schema.registry.basic.auth.credentials.source=USER_INFO ksql.schema.registry.basic.auth.user.info={{ SR_API_KEY }}:{{ SR_API_SECRET }} ksql.schema.registry.url=https://{{ SR_ENDPOINT }} ``` 7. Restart the ksqlDB server. The steps to restart are [dependent on your environment](/platform/current/installation/installing_cp/index.html). For more information, see the [ksqlDB Configuration Parameter Reference](/platform/current/ksqldb/operate-and-deploy/installation/server-config.html). ### Configure and connect 1. Configure Schema Registry by modifying `etc/schema-registry/schema-registry.properties`. The minimally required Schema Registry property settings for Confluent Cloud are provided below: ```bash # If set to true, API requests that fail will include extra debugging information, including stack traces. debug=false # REQUIRED: Specifies the bootstrap servers for your Kafka cluster. It is used for selecting the primary # Schema Registry instance and for storing the registered schema data. kafkastore.bootstrap.servers={{ BROKER_ENDPOINT }} # REQUIRED: Specifies Confluent Cloud authentication. kafkastore.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="{{ CLUSTER_API_KEY }}" \ password="{{ CLUSTER_API_SECRET }}"; # Configures Schema Registry to use SASL authentication. kafkastore.sasl.mechanism=PLAIN # Configures Schema Registry for SSL encryption. 
kafkastore.security.protocol=SASL_SSL # Specifies the name of the topic to store schemas in. kafkastore.topic=_schemas # Specifies the address the socket server listens on. The format is # "listeners = listener_name://host_name:port". For example, "listeners = PLAINTEXT://your.host.name:9092". listeners=http://0.0.0.0:8081 ``` For more information, see [Schema Registry configuration options](/platform/current/schema-registry/installation/deployment.html), [Configuring PLAIN](/platform/current/kafka/authentication_sasl/authentication_sasl_plain.html#sr), and [Quick Start for Schema Management on Confluent Cloud](../get-started/schema-registry.md#cloud-sr-config) (for native cloud Schema Registry). 2. Start Schema Registry with the `schema-registry.properties` file specified. ```bash bin/schema-registry-start etc/schema-registry/schema-registry.properties ``` # Configure and Manage Confluent Platform Use these resources to administer Confluent Platform in your environment. * [Overview](config-manage/overview.md) * [Configuration Reference](installation/configuration/config-index.md) * [Overview](installation/configuration/index.md) * [Configure Brokers and Controllers](installation/configuration/broker-configs.md) * [Configure Topics](installation/configuration/topic-configs.md) * [Configure Consumers](installation/configuration/consumer-configs.md) * [Configure Producers](installation/configuration/producer-configs.md) * [Configure Connect](installation/configuration/connect/overview.md) * [Overview](installation/configuration/connect/index.md) * [Configure Sink Connectors](installation/configuration/connect/sink-connect-configs.md) * [Configure Source Connectors](installation/configuration/connect/source-connect-configs.md) * [Configure AdminClient](installation/configuration/admin-configs.md) * [Configure Licenses](installation/configuration/license-configs.md) * [Configure Streams](installation/configuration/streams-configs.md) * [CLI Tools for Use with Confluent Platform](tools/cli-reference-overview.md) * [Change Configurations Without Restart](kafka/dynamic-config.md) * [Manage Clusters](clusters/index.md) * [Overview](clusters/overview.md) * [Cluster Metadata Management](kafka-metadata/index.md) * [Overview](kafka-metadata/overview.md) * [KRaft Overview](kafka-metadata/kraft.md) * [Configure KRaft](kafka-metadata/config-kraft.md) * [Find ZooKeeper Resources](kafka-metadata/zk-production.md) * [Manage Self-Balancing Clusters](clusters/sbc/overview.md) * [Overview](clusters/sbc/index.md) * [Tutorial: Adding and Remove Brokers](clusters/sbc/sbc-tutorial.md) * [Configure](clusters/sbc/configuration-options.md) * [Performance and Resource Usage](clusters/sbc/performance.md) * [Auto Data Balancing](clusters/rebalancer/overview.md) * [Overview](clusters/rebalancer/index.md) * [Quick Start](clusters/rebalancer/quickstart.md) * [Tutorial: Add and Remove Brokers](clusters/rebalancer/adb-docker-tutorial.md) * [Configure](clusters/rebalancer/configuration-options.md) * [Tiered Storage](clusters/tiered-storage.md) * [Metadata Service (MDS) in Confluent Platform](kafka/configure-mds/overview.md) * [Configure MDS](kafka/configure-mds/index.md) * [Configure Communication with MDS over TLS](kafka/configure-mds/mds-ssl-config-for-components.md) * [Configure mTLS Authentication and RBAC for Kafka Brokers](kafka/configure-mds/mutual-tls-auth-rbac.md) * [Configure Kerberos Authentication for Brokers Running MDS](kafka/configure-mds/kerberos-auth-config.md) * [Configure LDAP 
Authentication](kafka/configure-mds/ldap-auth-mds.md) * [Configure LDAP Group-Based Authorization for MDS](kafka/configure-mds/ldap-auth-config.md) * [MDS as token issuer](kafka/configure-mds/mds-token-issuer.md) * [Metadata Service Configuration Settings](kafka/configure-mds/mds-configuration.md) * [MDS File-Based Authentication for Confluent Platform](kafka/configure-mds/mds-file-configuration.md) * [Docker Operations for Confluent Platform](installation/docker/operations/overview.md) * [Overview](installation/docker/operations/index.md) * [Monitor and Track Metrics Using JMX](installation/docker/operations/monitoring.md) * [Configure Logs](installation/docker/operations/logging.md) * [Mount External Volumes](installation/docker/operations/external-volumes.md) * [Configure a Multi-Node Environment](kafka/multi-node.md) * [Run Kafka in Production](kafka/deployment.md) * [Production Best Practices](kafka/post-deployment.md) ### Security and resilience features Confluent Platform also offers a number of features that build on Kafka’s security features to help ensure your deployment stays secure and resilient. - You can set authorization by role with Confluent’s [Role-based Access Control (RBAC)](../security/authorization/rbac/overview.md#rbac-overview) feature. - If you use Control Center, you can set up [Single Sign On (SSO)](../security/authentication/sso-for-c3/overview.md#sso-for-c3) that integrates with a supported OIDC identity provider, and enable additional security measures such as multi-factor authentication. - The [REST Proxy Security Plugins in Confluent Platform](../confluent-security-plugins/kafka-rest.md#kafka-rest-security-plugins-install) and [Schema Registry Security Plugin for Confluent Platform](../confluent-security-plugins/schema-registry/introduction.md#confluentsecurityplugins-schema-registry-security-plugin) add security capabilities to the Confluent Platform REST Proxy and Schema Registry. The Confluent REST Proxy Security Plugin helps in authenticating the incoming requests and propagating the authenticated principal to requests to Kafka. This enables Confluent REST Proxy clients to utilize the multi-tenant security features of the Kafka broker. The Schema Registry Security Plugin supports authorization for both role-based access control (RBAC) and ACLs. - [Audit logs](../security/compliance/audit-logs/audit-logs-concepts.md#audit-logs-concepts) provide the ability to capture, protect, and preserve authorization activity into topics in Kafka clusters on Confluent Platform using [Confluent Server Authorizer](../security/csa-introduction.md#confluent-server-authorizer). - The [Cluster Linking](../multi-dc-deployments/cluster-linking/index.md#cluster-linking) feature enables you to directly connect clusters and mirror topics from one cluster to another. This makes it easier to build multi-datacenter, multi-region and hybrid cloud deployments. - [Confluent Replicator](../multi-dc-deployments/replicator/index.md#replicator-detail) makes it easier to maintain multiple Kafka clusters in multiple data centers. Managing replication of data and topic configuration between data centers enables use-cases such as active geo-localized deployments, centralized analytics and cloud migration. You can use Replicator to configure and manage replication for all these scenarios from either Control Center or command-line tools. 
To get started, see the [Replicator documentation](../multi-dc-deployments/replicator/index.md#replicator-detail), including the [Replicator Quick Start](../multi-dc-deployments/replicator/replicator-quickstart.md#replicator-quickstart). # CONTROL CENTER * [Overview](overview.md) * [Install and Configure](installation/index.md) * [Installation](installation/overview.md) * [System Requirements](installation/system-requirements.md) * [Support Policy](installation/support-policy.md) * [Sample Configurations](installation/properties.md) * [Configuration Reference](installation/configuration.md) * [Data Retention](installation/data-retention.md) * [Manage Licenses](installation/license.md) * [Monitor Logs](installation/logging.md) * [Manage Updates](installation/auto-update-ui.md) * [Troubleshoot](installation/troubleshooting.md) * [Migrate Alerts](installation/alert-migrate.md) * [Upgrade](installation/upgrade.md) * [Security](security/index.md) * [Overview](security/overview.md) * [Configure TLS](security/ssl.md) * [Configure SASL](security/sasl.md) * [Configure HTTP Basic Authentication](security/authentication.md) * [Configure LDAP](security/c3-auth-ldap.md) * [Configure RBAC](security/c3-rbac.md) * [Authorize with Kafka ACLs](security/config-c3-for-kafka-acls.md) * [Add mTLS Authentication for Monitoring and Alerting](security/mtls-to-alert.md) * [Add Basic Authentication for Monitoring and Alerting](security/broker-to-alert.md) * [Manage and View RBAC Roles](security/c3-rbac-roles.md) * [Sign in to Control Center when RBAC enabled on Confluent Platform](security/c3-rbac-login.md) * [Manage RBAC roles with Control Center on Confluent Platform](security/c3-rbac-manage-roles-ui.md) * [View your RBAC roles in Control Center on Confluent Platform](security/c3-rbac-view-roles-ui.md) * [Manage Clusters](clusters.md) * [Manage Brokers](brokers.md) * [Manage Topics](topics/index.md) * [Overview](topics/overview.md) * [Create Topics](topics/create.md) * [Topic Metrics](topics/view.md) * [View Messages](topics/messages.md) * [Configure Topics](topics/edit.md) * [Delete Topics](topics/delete.md) * [Connect](connect.md) * [Manage Flink](cmf.md) * [ksqlDB](ksql.md) * [Clients](clients/index.md) * [Overview](clients/overview.md) * [Consumers Groups](clients/consumers.md) * [Reset Offsets](clients/reset-offsets.md) * [Configure Cluster](clients/cluster-configuration.md) * [Copy Topics](replicators.md) * [Alerts](alerts/index.md) * [Overview](alerts/concepts.md) * [Manage Alerts](alerts/navigate.md) * [Configure Alerts](alerts/configure.md) * [Manage Triggers](alerts/triggers.md) * [Manage Actions](alerts/actions.md) * [Configure PagerDuty](alerts/pagerduty.md) * [REST API](alerts/rest.md) * [Usage Examples](alerts/examples.md) * [Troubleshoot](alerts/trouble.md) * [Release Notes](release-notes.md) # Manage Schemas on Confluent Platform * [Overview](index.md) * [Get Started with Schema Registry Tutorial](schema_registry_onprem_tutorial.md) * [Install and Configure](installation/overview.md) * [Fundamentals](fundamentals/overview.md) * [Concepts](fundamentals/index.md) * [Schema Evolution and Compatibility](fundamentals/schema-evolution.md) * [Schema Formats](fundamentals/serdes-develop/overview.md) * [Serializers and Deserializers Overview](fundamentals/serdes-develop/index.md) * [Avro](fundamentals/serdes-develop/serdes-avro.md) * [Protobuf](fundamentals/serdes-develop/serdes-protobuf.md) * [JSON Schema](fundamentals/serdes-develop/serdes-json.md) * [Data Contracts](fundamentals/data-contracts.md) * 
[Manage Schemas](schemas-overview.md) * [Work with Schemas in Control Center](schema.md) * [Schema Contexts](schema-contexts-cp.md) * [Schema Linking](schema-linking-cp.md) * [Validate Schema IDs](schema-validation.md) * [Monitor](monitoring.md) * [Delete Schemas](schema-deletion-guidelines.md) * [Integrate Schemas from Connectors](connect.md) * [Security](security/overview.md) * [Overview](security/index.md) * [Configure Role-Based Access Control](security/rbac-schema-registry.md) * [Configure OAuth](security/oauth-schema-registry.md) * [Schema Registry Security Plugin](../confluent-security-plugins/schema-registry/overview.md) * [Overview](../confluent-security-plugins/schema-registry/introduction.md) * [Install](../confluent-security-plugins/schema-registry/install.md) * [Schema Registry Authorization](../confluent-security-plugins/schema-registry/authorization/overview.md) * [Operation and Resource Support](../confluent-security-plugins/schema-registry/authorization/index.md) * [Role-Based Access Control](security/rbac-schema-registry.md) * [ACL Authorizer](../confluent-security-plugins/schema-registry/authorization/sracl_authorizer.md) * [Topic ACL Authorizer](../confluent-security-plugins/schema-registry/authorization/topicacl_authorizer.md) * [Passwordless authentication for Schema Registry](security/passwordless-auth.md) * [Reference](develop/overview.md) * [Overview](develop/index.md) * [Maven Plugin](develop/maven-plugin.md) * [API](develop/api.md) * [API Examples](develop/using.md) * [FAQ](faqs-cp.md) ### Start the stack To get started, create the following `docker-compose.yml` file. This specifies all the infrastructure that you’ll need to run this tutorial: ```yaml version: '2' services: broker: image: confluentinc/cp-kafka:8.1.0 hostname: broker container_name: broker ports: - "29092:29092" environment: KAFKA_BROKER_ID: 1 KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092 KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0 KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1 KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1 schema-registry: image: confluentinc/cp-schema-registry:8.1.0 hostname: schema-registry container_name: schema-registry depends_on: - broker ports: - "8081:8081" environment: SCHEMA_REGISTRY_HOST_NAME: schema-registry SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: "PLAINTEXT://broker:9092" ksqldb-server: image: confluentinc/cp-ksqldb-server:8.1.0 hostname: ksqldb-server container_name: ksqldb-server depends_on: - broker - schema-registry ports: - "8088:8088" environment: KSQL_LISTENERS: "http://0.0.0.0:8088" KSQL_BOOTSTRAP_SERVERS: "broker:9092" KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true" KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true" # Configuration to embed Kafka Connect support. 
KSQL_CONNECT_GROUP_ID: "ksql-connect-cluster" KSQL_CONNECT_BOOTSTRAP_SERVERS: "broker:9092" KSQL_CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.storage.StringConverter" KSQL_CONNECT_VALUE_CONVERTER: "io.confluent.connect.avro.AvroConverter" KSQL_CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_CONNECT_CONFIG_STORAGE_TOPIC: "_ksql-connect-configs" KSQL_CONNECT_OFFSET_STORAGE_TOPIC: "_ksql-connect-offsets" KSQL_CONNECT_STATUS_STORAGE_TOPIC: "_ksql-connect-statuses" KSQL_CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_PLUGIN_PATH: "/usr/share/kafka/plugins" ksqldb-cli: image: confluentinc/cp-ksqldb-cli:8.1.0 container_name: ksqldb-cli depends_on: - broker - ksqldb-server entrypoint: /bin/sh tty: true ``` Bring up the stack by running: ```bash docker-compose up ``` ## Configuration steps Following are the basic configuration steps: 1. Using an account with [OrganizationAdmin access](../security/access-control/rbac/predefined-rbac-roles.md#organizationadmin-role), create an API key and secret to connect to Confluent Cloud. For details, refer to [Use API Keys to Authenticate to Confluent Cloud](../security/authenticate/workload-identities/service-accounts/api-keys/overview.md#cloud-api-keys). 2. Validate that Confluent Cloud can be accessed from the machine where you are installing Control Center (Legacy). - Check the connection by using `confluent kafka topic list`. - Try producing or consuming from the machine. 3. Install Control Center (Legacy) using the [documentation](/platform/current/control-center/installation/configure-control-center.html). 4. Configure Control Center (Legacy) with the Confluent Cloud-specific settings. A minimum valid configuration is shown below. These settings are different from the standard Confluent Cloud configuration. Customize the `bootstrap.servers` and `confluent.controlcenter.streams.sasl.jaas.config` for your Confluent Cloud cluster. ```bash bootstrap.servers= confluent.controlcenter.streams.security.protocol=SASL_SSL confluent.controlcenter.streams.sasl.mechanism=PLAIN confluent.controlcenter.streams.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="" \ password=""; confluent.metrics.topic.max.message.bytes=8388608 confluent.controlcenter.streams.ssl.endpoint.identification.algorithm=https ``` #### IMPORTANT The `confluent.metrics.topic.max.message.bytes` property must be set to `8388608`. See [Control Center Cannot Connect to Confluent Cloud](/platform/current/control-center/installation/troubleshooting.html#c3-connect-ccloud-max-bytes) for details. 5. Configure data stream interceptors by following the [documentation](/platform/current/control-center/installation/clients.html). The following security configuration must be added: ```bash confluent.monitoring.interceptor.security.protocol=SASL_SSL confluent.monitoring.interceptor.sasl.mechanism=PLAIN confluent.monitoring.interceptor.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; confluent.monitoring.interceptor.ssl.endpoint.identification.algorithm=https ``` 6. 
(Optional) Add configs for Confluent Cloud Schema Registry per the example in [control-center-ccloud.delta](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs/control-center-ccloud.delta) on GitHub at [ccloud/examples/template_delta_configs](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs). The `schema.registry.url` for Control Center (Legacy) is specified using an HTTPS protocol prefix which requires an explicit `443` port, as shown in the example. ```bash # Confluent Schema Registry configuration for Confluent Control Center confluent.controlcenter.schema.registry.basic.auth.credentials.source=USER_INFO confluent.controlcenter.schema.registry.basic.auth.user.info=: confluent.controlcenter.schema.registry.url=https://:443 ``` ### Distributed Cluster 1. Create a distributed properties file named `my-connect-distributed.properties` in the config directory. The contents of this distributed properties file should resemble the following example. Note the security properties with `consumer.*` and `producer.*` prefixes. ```bash bootstrap.servers= group.id=connect-cluster key.converter=org.apache.kafka.connect.json.JsonConverter value.converter=org.apache.kafka.connect.json.JsonConverter key.converter.schemas.enable=false value.converter.schemas.enable=false internal.key.converter=org.apache.kafka.connect.json.JsonConverter internal.value.converter=org.apache.kafka.connect.json.JsonConverter internal.key.converter.schemas.enable=false internal.value.converter.schemas.enable=false # Connect clusters create three topics to manage offsets, configs, and status # information. Note that these contribute towards the total partition limit quota. offset.storage.topic=connect-offsets offset.storage.replication.factor=3 offset.storage.partitions=3 config.storage.topic=connect-configs config.storage.replication.factor=3 status.storage.topic=connect-status status.storage.replication.factor=3 offset.flush.interval.ms=10000 ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="" password=""; security.protocol=SASL_SSL consumer.ssl.endpoint.identification.algorithm=https consumer.sasl.mechanism=PLAIN consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="" password=""; consumer.security.protocol=SASL_SSL producer.ssl.endpoint.identification.algorithm=https producer.sasl.mechanism=PLAIN producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="" password=""; producer.security.protocol=SASL_SSL # Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins # (connectors, converters, transformations). plugin.path=/usr/share/java,/Users//confluent-6.2.1/share/confluent-hub-components ``` 2. (Optional) Add the configuration properties below to the `my-connect-distributed.properties` file. This allows connections to Confluent Cloud Schema Registry. For an example, see [connect-ccloud.delta](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs/connect-ccloud.delta) on the [ccloud/examples/template_delta_configs](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs). 
```bash # Confluent Schema Registry for Kafka Connect value.converter=io.confluent.connect.avro.AvroConverter value.converter.basic.auth.credentials.source=USER_INFO value.converter.schema.registry.basic.auth.user.info=: value.converter.schema.registry.url=https:// ``` 3. Run Connect using the following command: ```bash ./bin/connect-distributed ./etc/my-connect-distributed.properties ``` To test if the workers came up correctly, you can set up another file sink as follows. Create a file `my-file-sink.json` whose contents are as follows: ```text { "name": "my-file-sink", "config": { "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector", "tasks.max": 3, "topics": "page_visits", "file": "my_file.txt" } } ``` #### IMPORTANT You must include the following properties in the connector configuration if you are using a self-managed connector that requires an enterprise license. ```text "confluent.topic.bootstrap.servers":"", "confluent.topic.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "confluent.topic.security.protocol":"SASL_SSL", "confluent.topic.sasl.mechanism":"PLAIN" ``` #### IMPORTANT You must include the following configuration properties if you are using a self-managed connector that uses Reporter to write response back to Kafka (for example, the [Azure Functions Sink Connector for Confluent Platform](../../../kafka-connect-azure-functions/current/index.html) or the [Google Cloud Functions Sink Connector for Confluent Platform](../../../kafka-connect-gcp-functions/current/index.html) connector) . ```text "reporter.admin.bootstrap.servers":"", "reporter.admin.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "reporter.admin.security.protocol":"SASL_SSL", "reporter.admin.sasl.mechanism":"PLAIN", "reporter.producer.bootstrap.servers":"", "reporter.producer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "reporter.producer.security.protocol":"SASL_SSL", "reporter.producer.sasl.mechanism":"PLAIN" ``` #### IMPORTANT You must include the following properties in the connector configuration if you are using the following connectors: ### Debezium 2 and later ```text "schema.history.internal.kafka.bootstrap.servers": "", "schema.history.internal.consumer.security.protocol": "SASL_SSL", "schema.history.internal.consumer.ssl.endpoint.identification.algorithm": "https", "schema.history.internal.consumer.sasl.mechanism": "PLAIN", "schema.history.internal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "schema.history.internal.producer.security.protocol": "SASL_SSL", "schema.history.internal.producer.ssl.endpoint.identification.algorithm": "https", "schema.history.internal.producer.sasl.mechanism": "PLAIN", "schema.history.internal.producer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";" ``` ### Debezium 1.9 and earlier ```text "database.history.kafka.bootstrap.servers": "", "database.history.consumer.security.protocol": "SASL_SSL", "database.history.consumer.ssl.endpoint.identification.algorithm": "https", "database.history.consumer.sasl.mechanism": "PLAIN", "database.history.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", 
"database.history.producer.security.protocol": "SASL_SSL", "database.history.producer.ssl.endpoint.identification.algorithm": "https", "database.history.producer.sasl.mechanism": "PLAIN", "database.history.producer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";" ``` ### Oracle XStream CDC ```text "schema.history.internal.kafka.bootstrap.servers": "", "schema.history.internal.consumer.security.protocol": "SASL_SSL", "schema.history.internal.consumer.ssl.endpoint.identification.algorithm": "https", "schema.history.internal.consumer.sasl.mechanism": "PLAIN", "schema.history.internal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "schema.history.internal.producer.security.protocol": "SASL_SSL", "schema.history.internal.producer.ssl.endpoint.identification.algorithm": "https", "schema.history.internal.producer.sasl.mechanism": "PLAIN", "schema.history.internal.producer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", # Uncomment and include the following properties only if the connector is configured to use Kafka topics for signaling #"signal.kafka.bootstrap.servers": "", #"signal.consumer.security.protocol": "SASL_SSL", #"signal.consumer.ssl.endpoint.identification.algorithm": "https", #"signal.consumer.sasl.mechanism": "PLAIN", #"signal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";" ``` 4. Post this connector config to the worker using the curl command: ```bash curl -s -H "Content-Type: application/json" -X POST -d @my-file-sink.json http://localhost:8083/connectors/ | jq . ``` This should give the following response: ```text { "name": "my-file-sink", "config": { "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector", "tasks.max": "1", "topics": "page_visits", "file": "my_file", "name": "my-file-sink" }, "tasks": [], "type": null } ``` 5. Produce some records using Confluent Cloud and tail this file to check if the connectors were successfully created. # Connect Self-Managed Kafka Streams to Confluent Cloud You can connect Kafka Streams to your Confluent Platform Apache Kafka® cluster in Confluent Cloud. **Prerequisites** - [Confluent Platform](/platform/current/installation/index.html) 1. Use the Confluent CLI to log in to your Confluent Cloud cluster, and run the `confluent kafka cluster list` command to get the Kafka cluster ID. ```bash confluent kafka cluster list ``` Your output should resemble: ```none Current | ID | Name | Type | Cloud | Region | Availability | Status ----------+------------+------------+-------+----------+----------+--------------+--------- * | lkc-a123b | my-cluster | BASIC | azure | westus2 | single-zone | UP ``` 2. Run the `confluent kafka cluster describe` command to get the endpoint for your Confluent Cloud cluster. 
```bash confluent kafka cluster describe lkc-a123b ``` Your output should resemble: ```text +----------------------+---------------------------------------------------------+ | Current | true | | ID | lkc-a123b | | Name | wikiedits_cluster | | Type | BASIC | | Ingress Limit (MB/s) | 250 | | Egress Limit (MB/s) | 750 | | Storage | 5 TB | | Cloud | azure | | Region | westus2 | | Availability | single-zone | | Status | UP | | Endpoint | SASL_SSL://pkc-41973.westus2.azure.confluent.cloud:9092 | | REST Endpoint | https://pkc-41973.westus2.azure.confluent.cloud:443 | | Topic Count | 30 | +----------------------+---------------------------------------------------------+ ``` Save the `Endpoint` value, which you’ll use in a later step. 3. Create a service account named `my-streams-app`. You must include a description. ```bash confluent iam service-account create my-streams-app --description "My Streams API and secrets service account." ``` Your output should resemble: ```text +-------------+--------------------------------+ | ID | sa-ab01cd | | Name | my-streams-app | | Description | My Streams API and secrets | | | service account. | +-------------+--------------------------------+ ``` Save the service account ID, which you’ll use in later steps. 4. Create an API key and secret for service account `sa-ab01cd`. Be sure to replace the service account ID and Kafka cluster ID values shown here with your own: ```bash confluent api-key create --service-account sa-ab01cd --resource lkc-a123b ``` Your output should resemble: ```text It may take a couple of minutes for the API key to be ready. Save the API key and secret. The secret is not retrievable later. +---------+------------------------------------------------------------------+ | API Key | ABCXQHYDZXMMUDEF | | Secret | aBCde3s54+4Xv36YKPLDKy2aklGr6x/ShUrEX5D1Te4AzRlphFlr6eghmPX81HTF | +---------+------------------------------------------------------------------+ ``` #### IMPORTANT **Save the API key and secret.** You require this information to configure your client applications. Be aware that this is the *only* time that you can access and view the key and secret. To connect Streams to Confluent Cloud, update your [existing Streams configs](/platform/current/streams/developer-guide/config-streams.html) with the properties described here. 1. Create a `java.util.Properties` instance. 2. Configure your streams application. Kafka and Kafka Streams configuration options must be configured in the `java.util.Properties` instance before using Streams. In this example you must configure the Confluent Cloud broker endpoints (`StreamsConfig.BOOTSTRAP_SERVERS_CONFIG`) and SASL config (`SASL_JAAS_CONFIG`) ```java import java.util.Properties; import org.apache.kafka.clients.producer.ProducerConfig; import org.apache.kafka.common.config.SaslConfigs; import org.apache.kafka.streams.StreamsConfig; Properties props = new Properties(); // Comma-separated list of the Confluent Cloud broker endpoints. 
For example: // r0.great-app.confluent.aws.prod.cloud:9092,r1.great-app.confluent.aws.prod.cloud:9093,r2.great-app.confluent.aws.prod.cloud:9094 props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, ""); props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3); props.put(StreamsConfig.SECURITY_PROTOCOL_CONFIG, "SASL_SSL"); props.put(SaslConfigs.SASL_MECHANISM, "PLAIN"); props.put(SaslConfigs.SASL_JAAS_CONFIG, "org.apache.kafka.common.security.plain.PlainLoginModule required \ username=\"\" password=\"\";"); // Recommended performance/resilience settings props.put(StreamsConfig.producerPrefix(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG), 2147483647); props.put(StreamsConfig.producerPrefix(ProducerConfig.MAX_BLOCK_MS_CONFIG), 9223372036854775807L); // Any further settings props.put(... , ...); ``` 3. (Optional) Add configs for Confluent Cloud Schema Registry to your streams application per the example in [java_streams.delta](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs/java_streams.delta) on GitHub at [ccloud/examples/template_delta_configs](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs). ```java // Confluent Schema Registry for Java props.put("basic.auth.credentials.source", "USER_INFO"); props.put("schema.registry.basic.auth.user.info", ":"); props.put("schema.registry.url", "https://"); ``` A short sketch that shows these properties in use follows the reference links below. - For more information, see [Configuring a Streams Application](/platform/current/streams/developer-guide/config-streams.html). - To view a working example of hybrid Apache Kafka® clusters from self-hosted to Confluent Cloud, see [cp-demo](/platform/current/tutorials/cp-demo/docs/index.html). - For example configs for all Confluent Platform components and clients connecting to Confluent Cloud, see [template examples for components](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs). - To look at all the code used in the Confluent Cloud demo, see the [Confluent Cloud demo examples](https://github.com/confluentinc/examples/tree/latest/ccloud).
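To tie the configuration steps together, the following minimal sketch (not part of the Confluent examples; the application ID, topic names, and Serdes are illustrative placeholders) shows the `props` object built in the previous steps being passed to a `KafkaStreams` instance:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

// Continue with the "props" object configured in the previous steps.
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");   // placeholder application ID
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

// Placeholder topology: copy records from one topic to another.
StreamsBuilder builder = new StreamsBuilder();
builder.stream("input-topic").to("output-topic");

KafkaStreams streams = new KafkaStreams(builder.build(), props);
streams.start();

// Close the application cleanly on shutdown.
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
```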
## Quick Start This quick start uses the DynamoDB connector to export data produced by the Avro console producer to DynamoDB. Before you begin, you must grant the user or IAM role that runs the connector [write and create access](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/authentication-and-access-control.html) to DynamoDB. 1. Install the connector through the [Confluent Hub Client](/kafka-connectors/self-managed/confluent-hub/client.html). ```bash # run from your CP installation directory confluent connect plugin install confluentinc/kafka-connect-aws-dynamodb:latest ``` 2. Start the services using the Confluent CLI. ```bash confluent local start ``` Every service starts in order, printing a message with its status. ```bash Starting Zookeeper Zookeeper is [UP] Starting Kafka Kafka is [UP] Starting Schema Registry Schema Registry is [UP] Starting Kafka REST Kafka REST is [UP] Starting Connect Connect is [UP] Starting KSQL Server KSQL Server is [UP] Starting Control Center Control Center is [UP] ``` #### NOTE You must ensure that the connector user has write access to DynamoDB and that credentials are deployed [appropriately](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html). You can also pass more properties to the credentials provider. For details, refer to [AWS Credentials](https://docs.confluent.io/kafka-connect-s3-sink/current/index.html#aws-credentials). 3. Start the Avro console producer to import a few records with a simple schema in Kafka. Use the following command: ```bash ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic dynamodb_topic \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}' ``` 4. Enter the following in the console producer: ```bash {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} {"f1": "value4"} {"f1": "value5"} {"f1": "value6"} {"f1": "value7"} {"f1": "value8"} {"f1": "value9"} ``` The records are published to the Kafka topic `dynamodb_topic` in Avro format. 5. Find the region that the DynamoDB instance is running in (for example, `us-east-2`) and create a config file with the following contents. Save it as `quickstart-dynamodb.properties`. #### NOTE In the following example, a DynamoDB table called `dynamodb_topic` will be created in your DynamoDB instance. ```none name=dynamodb-sink connector.class=io.confluent.connect.aws.dynamodb.DynamoDbSinkConnector tasks.max=1 topics=dynamodb_topic # use the region to populate the next two properties aws.dynamodb.region= aws.dynamodb.endpoint=https://dynamodb..amazonaws.com confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 ``` 6. Start the DynamoDB connector by loading its configuration with the following command: ```bash confluent local load dynamodb-sink --config quickstart-dynamodb.properties { "name": "dynamodb-sink", "config": { "connector.class": "io.confluent.connect.aws.dynamodb.DynamoDbSinkConnector", "tasks.max": "1", "topics": "dynamodb_topic", "aws.dynamodb.region": "", "aws.dynamodb.endpoint": "https://dynamodb..amazonaws.com", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "name": "dynamodb-sink" }, "tasks": [], "type": "sink" } ``` #### IMPORTANT Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) commands in production environments. 7. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status dynamodb-sink ``` 8. After the connector has ingested some records, use the AWS CLI to check that the data is available in DynamoDB. ```bash aws dynamodb scan --table-name dynamodb_topic --region us-east-2 ``` You should see nine items with keys: ```bash { "Items": [ { "partition": { "N": "0" },"offset": { "N": "0" },"name": { "S": "f1" },"type": { "S": "value1" } },{ "partition": { "N": "0" },"offset": { "N": "1" },"name": { "S": "f1" },"type": { "S": "value2" } },{ "partition": { "N": "0" },"offset": { "N": "2" },"name": { "S": "f1" },"type": { "S": "value3" } },{ "partition": { "N": "0" },"offset": { "N": "3" },"name": { "S": "f1" },"type": { "S": "value4" } },{ "partition": { "N": "0" },"offset": { "N": "4" },"name": { "S": "f1" },"type": { "S": "value5" } },{ "partition": { "N": "0" },"offset": { "N": "5" },"name": { "S": "f1" },"type": { "S": "value6" } },{ "partition": { "N": "0" },"offset": { "N": "6" },"name": { "S": "f1" },"type": { "S": "value7" } },{ "partition": { "N": "0" },"offset": { "N": "7" },"name": { "S": "f1" },"type": { "S": "value8" } },{ "partition": { "N": "0" },"offset": { "N": "8" },"name": { "S": "f1" },"type": { "S": "value9" } } ], "Count": 9, "ScannedCount": 9, "ConsumedCapacity": null } ``` 9. 
Enter the following command to stop the Connect worker and all services: ```bash confluent local stop ``` Your output should resemble: ```none Stopping Control Center Control Center is [DOWN] Stopping KSQL Server KSQL Server is [DOWN] Stopping Connect Connect is [DOWN] Stopping Kafka REST Kafka REST is [DOWN] Stopping Schema Registry Schema Registry is [DOWN] Stopping Kafka Kafka is [DOWN] Stopping Zookeeper Zookeeper is [DOWN] ``` Or, enter the following command to stop all services and delete all generated data: ```bash confluent local destroy ``` Your output should resemble: ```bash Stopping Control Center Control Center is [DOWN] Stopping KSQL Server KSQL Server is [DOWN] Stopping Connect Connect is [DOWN] Stopping Kafka REST Kafka REST is [DOWN] Stopping Schema Registry Schema Registry is [DOWN] Stopping Kafka Kafka is [DOWN] Stopping Zookeeper Zookeeper is [DOWN] Deleting: /var/folders/ty/rqbqmjv54rg_v10ykmrgd1_80000gp/T/confluent.PkQpsKfE ``` #### Reporter example 1. Run the demo app with the `basic-auth` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=basic-auth ``` 2. Create a `http-sink.properties` file with the following contents: ```text name=ReporterExample topics=http-messages tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # http sink connector configs http.api.url=http://localhost:8080/api/messages auth.type=BASIC connection.user=admin connection.password=password behavior.on.null.values=delete # reporter configurations reporter.bootstrap.servers=localhost:9092 reporter.result.topic.name=success-responses reporter.result.topic.replication.factor=1 reporter.error.topic.name=error-responses reporter.error.topic.replication.factor=1 reporter.admin.bootstrap.servers=.confluent.cloud:9092 reporter.admin.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule \ required username= password=; reporter.admin.security.protocol=SASL_SSL reporter.admin.sasl.mechanism=PLAIN reporter.producer.bootstrap.servers=.confluent.cloud:9092 reporter.producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule \ required username= password=; reporter.producer.security.protocol=SASL_SSL reporter.producer.sasl.mechanism=PLAIN ``` For additional information about Connect Reporter for secure environments, see [Kafka Connect Reporter](/platform/current/connect/security.html#kconnect-reporter). 3. Publish messages with keys and values to the topic. ```bash confluent local produce http-messages --property parse.key=true --property key.separator=, > 1,message-value > 2,another-message ``` 4. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). 5. Consume the records from the `success-responses` and `error-responses` topics to see the HTTP operation responses. 
```bash kafkacat -C -b localhost:9092 -t success-responses -J |jq ``` ```json { "topic": "success-responses", "partition": 0, "offset": 0, "tstype": "create", "ts": 1581579911854, "headers": [ "input_record_offset", "0", "input_record_timestamp", "1581488456476", "input_record_partition", "0", "input_record_topic", "http-connect" ], "key": null, "payload": "{\"id\":1,\"message\":\"1,message-value\"}" } { "topic": "success-responses", "partition": 0, "offset": 1, "tstype": "create", "ts": 1581579911854, "headers": [ "input_record_offset", "1", "input_record_timestamp", "1581488456476", "input_record_partition", "0", "input_record_topic", "http-connect" ], "key": null, "payload": "{\"id\":2,\"message\":\"2,message-value\"}" } ``` In case of retryable errors (that is, errors with a 5xx status code), a response like the one shown below is included in the error-responses topic. ```bash kafkacat -C -b localhost:9092 -t error-responses -J |jq ``` ```json { "topic": "error-responses", "partition": 0, "offset": 0, "tstype": "create", "ts": 1581579911854, "headers": [ "input_record_offset", "0", "input_record_timestamp", "1581579931450", "input_record_partition", "0", "input_record_topic", "http-messages" ], "key": null, "payload": "Retry time lapsed, unable to process HTTP request. Error while processing HTTP request with Url : http://localhost:8080/api/messages, Payload : 6,test, Status code : 500, Reason Phrase : , Response Content : {\"timestamp\":\"2020-02-11T10:44:41.574+0000\",\"status\":500,\"error\":\"Internal Server Error\",\"message\":\"Unresolved compilation problem: \\n\\tlog cannot be resolved\\n\",\"path\":\"/api/messages\"}, " } ``` #### Distributed worker configuration 1. Create your `my-connect-distributed.properties` file based on the following example. 
```properties bootstrap.servers= key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=io.confluent.connect.avro.AvroConverter ssl.endpoint.identification.algorithm=https security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; request.timeout.ms=20000 retry.backoff.ms=500 producer.bootstrap.servers= producer.ssl.endpoint.identification.algorithm=https producer.security.protocol=SASL_SSL producer.sasl.mechanism=PLAIN producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; producer.request.timeout.ms=20000 producer.retry.backoff.ms=500 consumer.bootstrap.servers= consumer.ssl.endpoint.identification.algorithm=https consumer.security.protocol=SASL_SSL consumer.sasl.mechanism=PLAIN consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; consumer.request.timeout.ms=20000 consumer.retry.backoff.ms=500 offset.flush.interval.ms=10000 offset.storage.file.filename=/tmp/connect.offsets group.id=connect-cluster offset.storage.topic=connect-offsets offset.storage.replication.factor=3 offset.storage.partitions=3 config.storage.topic=connect-configs config.storage.replication.factor=3 status.storage.topic=connect-status status.storage.replication.factor=3 # Schema Registry specific settings # We recommend you use Confluent Cloud Schema Registry if you run Oracle CDC Source against Confluent Cloud value.converter.basic.auth.credentials.source=USER_INFO value.converter.schema.registry.basic.auth.user.info=: value.converter.schema.registry.url= # Confluent license settings confluent.topic.bootstrap.servers= confluent.topic.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; confluent.topic.security.protocol=SASL_SSL confluent.topic.sasl.mechanism=PLAIN # Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins # (connectors, converters, transformations). The list should consist of top level directories that include # any combination of: # a) directories immediately containing jars with plugins and their dependencies # b) uber-jars with plugins and their dependencies # c) directories immediately containing the package directory structure of classes of plugins and their dependencies # Examples: # plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors, plugin.path=/usr/share/java,/confluent-6.0.0/share/confluent-hub-components ``` 2. Start Kafka Connect with the following command: ```text /bin/connect-distributed my-connect-distributed.properties ``` ### Known issues * When deploying CFK to Red Hat OpenShift with Red Hat’s Operator Lifecycle Manager using the OperatorHub, you must use OpenShift version 4.9 or higher. This OpenShift version restriction does not apply when deploying CFK to Red Hat OpenShift in the standard way without using the Red Hat Operator Lifecycle Manager. 
* When CFK is deployed on an OpenShift cluster with Red Hat’s Operator Lifecycle Manager/OperatorHub, capturing the support bundle for CFK using the command, `kubectl confluent support-bundle --namespace `, can fail with the following error message: ```text panic: runtime error: index out of range [0] with length 0 ``` * If the ksqlDB REST endpoint is using the auto-generated certificates, the ksqlDB deployment that points to Confluent Cloud requires trusting the Let’s Encrypt CA. For this to work, you must provide the ksqlDB CR with a CA bundle through `cacerts.pem` that contains both (1) the Confluent Cloud CA and (2) the self-signed CA. * When TLS is enabled, and when Confluent Control Center (Legacy) uses a different TLS certificate to communicate with MDS or Confluent Cloud Schema Registry, Control Center (Legacy) (used with Confluent Platform 7.x) cannot use an auto-generated TLS certificate to connect to MDS or Confluent Cloud Schema Registry. See the [Troubleshooting Guide](co-troubleshooting.md#co-c3-mds-certificates) for a workaround. * When deploying the Schema Registry and Kafka CRs simultaneously, Schema Registry could fail because it cannot create topics with a replication factor of 3. This is because the Kafka brokers have not fully started. The workaround is to delete the Schema Registry deployment and re-deploy once Kafka is fully up. * When deploying an RBAC-enabled Kafka cluster in centralized mode, where another “secondary” Kafka is being used to store RBAC metadata, an error, “License Topic could not be created”, may be returned on the secondary Kafka cluster. * A periodic Kubernetes TCP probe on ZooKeeper (in Confluent Platform 7.x) causes frequent warning messages “client has closed socket” when warning logs are enabled. * REST Proxy configured with monitoring interceptors is missing the callback handler properties when RBAC is enabled. The interceptor will not work, and you will see an error message in the KafkaRestProxy log. As a workaround, manually add configuration overrides as shown in the following KafkaRestProxy CR: ```yaml configOverrides: server: - confluent.monitoring.interceptor.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler - consumer.confluent.monitoring.interceptor.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler - producer.confluent.monitoring.interceptor.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler ``` * When configuring source-initiated cluster links with CFK where the source cluster has TLS enabled, set the following in the Source mode ClusterLink CR, under the `spec.configs` section: * `local.security.protocol: SSL` and `local.listener.name: ` for mTLS. * `local.security.protocol: SASL_SSL` for SASL authentication with TLS. For details about configuring source-initiated Cluster Linking, see [Configure the source-initiated cluster link on the source cluster](co-link-clusters.md#co-clusterlink-source-initiated-connection-source-mode). * The CFK [support bundle plugin](co-troubleshooting.md#co-support-bundle) on Windows systems does not capture all the logs. As a workaround, specify the `--out-dir` flag in the `kubectl confluent support-bundle` command to provide the output location for the support bundle. * When you have external access enabled with load balancer type for both Control Center and Prometheus, only `controlcenter-next-gen-prometheus-bootstrap-lb` gets created. 
The workaround is to enable Control Center external access first, and then add Prometheus external access. After you do this, both the Control Center (`controlcenter-next-gen-bootstrap-lb`) and Prometheus (`controlcenter-next-gen-prometheus-bootstrap-lb`) external bootstrap load balancers are created. ## Preparation Follow these guidelines when you prepare to upgrade. * Read the [Release Notes for Confluent Platform 8.1](../release-notes/index.md#release-notes) and review the [Changelogs](../release-notes/changelog.md#cp-changelog) for your Confluent Platform components. : The release notes contain important information about noteworthy features and changes to configurations that may impact your upgrade. Changelogs note updates to third-party components such as Jolokia or JMX exporters that might affect your system. * Form a plan. : Read the documentation and draft an upgrade plan that matches your specific requirements and environment before starting the upgrade process. In other words, don’t start working through this guide on a live cluster. Read the guide entirely, make a plan, then execute the plan. * Perform backups. : Before upgrading, always back up all configuration and unit files with their file permissions, ownership, and customizations. Confluent Platform may not run if the proper ownership isn’t preserved on configuration files. By default, configuration files are located in the `$CONFLUENT_HOME/etc` directory and are organized by component. * Upgrade all components. : Upgrade your entire platform deployment so that all components are running the same version. Do not bridge component versions. * Consider upgrade order. : The general recommended upgrade order is to upgrade your server-side components (brokers, controllers, and Confluent Control Center) before you upgrade your client applications. Newer brokers are designed to work with compliant older clients. As noted in the [KIP-896](https://cwiki.apache.org/confluence/display/KAFKA/KIP-896%3A+Remove+old+client+protocol+API+versions+in+Kafka+4.0) warning, this compatibility guarantee has changed. Confluent Platform 8.1 won’t communicate with clients using a protocol older than Kafka 2.1.0. Therefore, the upgrade process must be: 1. Ensure you’re running Confluent Platform 7.7, 7.8, or 7.9. 2. Identify and upgrade all non-compliant clients. 3. Upgrade Confluent Control Center to a version compatible with 8.1. 4. Upgrade your brokers, controllers, and other server-side components to 8.1. 5. Upgrade your compliant clients to the 8.1 libraries as needed to access new features. Clients include any application that uses Kafka producers or consumers, command-line tools, Schema Registry, REST Proxy, Kafka Connect, and Kafka Streams. * Determine if clients are colocated with brokers. : Although not recommended, some deployments have clients colocated with brokers (on the same node). In these cases, brokers and clients share the same packages. In this colocation case, ensure that client processes are not upgraded until *all* Kafka brokers have been upgraded. * Decide between a rolling upgrade or a downtime upgrade. : Confluent Platform supports both rolling upgrades, meaning you upgrade one broker at a time to avoid cluster downtime, and downtime upgrades, meaning you take down the entire cluster, upgrade it, and bring everything back up. * Use Confluent Control Center to monitor a rolling restart. 
: Consider using [Control Center](/control-center/current/installation/overview.html) to monitor broker status during the [rolling restart](../kafka/post-deployment.md#rolling-restart). * Set a license string. : The Confluent Platform package includes Confluent Server by default and requires a `confluent.license` key in your `server.properties` file. The Confluent Server broker checks for a license during start up. You must supply a license string in each broker’s properties file using the `confluent.license` property as shown in the following code: ```none confluent.license=LICENCE_STRING_HERE_NO_QUOTES ``` If you want to use the Kafka broker, download the `confluent-community` package. The Kafka broker is the default in all Debian or RHEL and CentOS packages. * Run the correct version of Java. : Determine and install the appropriate Java version. See [Supported Java Versions](versions-interoperability.md#java-sys-req) for a list of Confluent Platform versions and the corresponding Java version support before you upgrade. For complete compatibility information, see the [Supported Versions and Interoperability for Confluent Platform](versions-interoperability.md#interoperability-versions). ### Print schema IDs with command line consumer utilities You can use the `kafka-avro-console-consumer`, `kafka-protobuf-console-consumer`, and `kafka-json-schema-console-consumer` utilities to get the schema IDs for all messages on a topic, or for a specified subset of messages. This can be useful for exploring or troubleshooting schemas. To print schema IDs, run the consumer with `--property print.schema.ids=true` and `--property print.key=true`. The basic command syntax for Avro is as follows: ```bash kafka-avro-console-consumer --bootstrap-server $BOOTSTRAP_SERVER \ --property basic.auth.credentials.source="USER_INFO" \ --property print.key=true --property print.schema.ids=true \ --property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \ --property schema.registry.url=$SCHEMA_REGISTRY_URL \ --consumer.config /Users/vicky/creds.config \ --topic --from-beginning \ --property schema.registry.basic.auth.user.info=$SR_APIKEY:$SR_APISECRET ``` Note that to run this command against Confluent Cloud, you must have an API key and secret for the Kafka cluster and for the Schema Registry cluster associated with the environment. To specify the value for `$BOOTSTRAP_SERVER`, you must use the Endpoint URL on Confluent Cloud or the host and port as specified in your properties files for Confluent Platform. - To find the Endpoint URL on Confluent Cloud to use as the value for $BOOTSTRAP_SERVER, on the Cloud Console navigate to **Cluster settings** and find the URL for **Bootstrap server** under **Endpoints**. Alternatively, use the Confluent CLI command [confluent kafka cluster describe](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/cluster/confluent_kafka_cluster_describe.html) to find the value given for **Endpoint**, minus the security protocol prefix. For Confluent Cloud, this will always be in the form of `URL:port`, such as `pkc-12576z.us-west2.gcp.confluent.cloud:9092`. - The examples use shell environment variables to indicate values for `--bootstrap-server`, `schema.registry.url`, API key and secret, and so forth. You may want to store the values for these properties in local shell environment variables to make testing at the command line easier. (For example: `export API_KEY=xyz`.) 
You can check the contents of a variable with `echo` (for example, `echo $API_KEY`), then use it in subsequent commands and config files. - The user’s credentials are in a local file called `creds.config`, which contains the following information: ```bash # Required connection configs for Kafka producer, consumer, and admin bootstrap.servers= security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; sasl.mechanism=PLAIN # Required for correctness in Apache Kafka clients prior to 2.6 client.dns.lookup=use_all_dns_ips # Best practice for higher availability in Apache Kafka clients prior to 3.0 session.timeout.ms=45000 # Best practice for Kafka producer to prevent data loss acks=all ``` The subsequent examples use basic authentication and API keys. To learn more about authentication on Confluent Cloud, see [security/authenticate/workload-identities/service-accounts/api-keys/manage-api-keys.html#add-an-api-key](/cloud/current/security/authenticate/workload-identities/service-accounts/api-keys/manage-api-keys.html#add-an-api-key). To learn more about authentication on Confluent Platform, see [Use HTTP Basic Authentication in Confluent Platform](/platform/current/security/authentication/http-basic-auth/overview.html). #### IMPORTANT If you are configuring this for Schema Registry or REST Proxy, you must prefix each parameter with `confluent.license`. For example, `sasl.mechanism` becomes `confluent.license.sasl.mechanism`. For additional information, see [Configure license clients to authenticate to Kafka](../../../installation/license.md#kafka-rest-and-sasl-ssl-configs). The new Producer and Consumer clients support security for Kafka versions 0.9.0 and higher. If you are using the Kafka Streams API, you can read about how to configure equivalent [SSL](/platform/current/clients/javadocs/javadoc/org/apache/kafka/common/config/SslConfigs.html) and [SASL](/platform/current/clients/javadocs/javadoc/org/apache/kafka/common/config/SaslConfigs.html) parameters. In the following configuration example, the underlying assumption is that client authentication is required by the broker, so the client settings are stored in a properties file named `client-ssl.properties`. Because this file stores passwords directly, it is important to restrict access to it using file system permissions. ```bash bootstrap.servers=kafka1:9093 security.protocol=SSL ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks ssl.truststore.password=test1234 ssl.keystore.location=/var/private/ssl/kafka.client.keystore.jks ssl.keystore.password=test1234 ssl.key.password=test1234 ``` Note that `ssl.truststore.password` is technically optional, but strongly recommended. If a password is not set, access to the truststore is still available, but integrity checking is disabled. 
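As an illustration of how an application client might reuse this file, the following minimal sketch (the class name, the topic name `test`, and the relative path to `client-ssl.properties` are assumptions for this example, not part of the official examples) loads the same properties straight into a Java producer:

```java
import java.io.FileInputStream;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SslClientExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Load the same TLS settings used by the console clients.
        try (FileInputStream in = new FileInputStream("client-ssl.properties")) {
            props.load(in);
        }
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Send a single test record over the TLS-secured listener.
            producer.send(new ProducerRecord<>("test", "hello", "tls")).get();
        }
    }
}
```

The console tools that follow do the same thing by passing the file with `--producer.config` and `--consumer.config`.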
The following examples use `kafka-console-producer` and `kafka-console-consumer`, and pass in the `client-ssl.properties` defined above: ```bash kafka-console-producer --bootstrap-server kafka1:9093 --topic test --producer.config client-ssl.properties kafka-console-consumer --bootstrap-server kafka1:9093 --topic test --consumer.config client-ssl.properties --from-beginning ``` # API Reference for Confluent Cloud * [Confluent Cloud APIs](api.md) * [Kafka Admin and Produce REST APIs](kafka-rest/kafka-rest-cc.md) * [Using Base64 Encoded Data and Credentials](kafka-rest/kafka-rest-cc.md#using-base64-encoded-data-and-credentials) * [Use cases](kafka-rest/kafka-rest-cc.md#use-cases) * [Cluster Admin API](kafka-rest/kafka-rest-cc.md#cluster-admin-api) * [Principal IDs for ACLs](kafka-rest/kafka-rest-cc.md#principal-ids-for-acls) * [Produce API](kafka-rest/kafka-rest-cc.md#produce-api) * [Streaming mode (recommended)](kafka-rest/kafka-rest-cc.md#streaming-mode-recommended) * [Non-streaming mode (not recommended)](kafka-rest/kafka-rest-cc.md#non-streaming-mode-not-recommended) * [Producing a single record to a topic](kafka-rest/kafka-rest-cc.md#producing-a-single-record-to-a-topic) * [Producing a batch of records to a topic](kafka-rest/kafka-rest-cc.md#producing-a-batch-of-records-to-a-topic) * [Data payload specification](kafka-rest/kafka-rest-cc.md#data-payload-specification) * [Connection bias and request limits](kafka-rest/kafka-rest-cc.md#connection-bias-and-request-limits) * [Suggested Resources](kafka-rest/kafka-rest-cc.md#suggested-resources) * [Connect API](connectors/connect-api-section.md) * [Prerequisites](connectors/connect-api-section.md#prerequisites) * [Managed and Custom Connector API Examples](connectors/connect-api-section.md#managed-and-custom-connector-api-examples) * [Get a list of connectors](connectors/connect-api-section.md#get-a-list-of-connectors) * [Create a connector](connectors/connect-api-section.md#create-a-connector) * [Custom connector configuration payload](connectors/connect-api-section.md#custom-connector-configuration-payload) * [Raw JSON payload example](connectors/connect-api-section.md#raw-json-payload-example) * [JSON file payload example](connectors/connect-api-section.md#json-file-payload-example) * [List Java and Kafka runtimes](connectors/connect-api-section.md#list-java-and-ak-runtimes) * [Update (or create) a connector](connectors/connect-api-section.md#update-or-create-a-connector) * [Read a connector configuration](connectors/connect-api-section.md#read-a-connector-configuration) * [Migrate a connector configuration](connectors/connect-api-section.md#migrate-a-connector-configuration) * [Query a sink connector for metrics](connectors/connect-api-section.md#query-a-sink-connector-for-metrics) * [Delete a connector](connectors/connect-api-section.md#delete-a-connector) * [Fully-Managed and Custom Connector Plugin API Examples](connectors/connect-api-section.md#fully-managed-and-custom-connector-plugin-api-examples) * [Fully-managed connector plugin examples](connectors/connect-api-section.md#fully-managed-connector-plugin-examples) * [List fully-managed plugins](connectors/connect-api-section.md#list-fully-managed-plugins) * [Validate a fully-managed plugin](connectors/connect-api-section.md#validate-a-fully-managed-plugin) * [Custom Connector Plugin API examples](connectors/connect-api-section.md#custom-connector-plugin-api-examples) * [List custom plugins](connectors/connect-api-section.md#list-custom-plugins) * [Request a presigned 
URL](connectors/connect-api-section.md#request-a-presigned-url) * [Upload a custom plugin](connectors/connect-api-section.md#upload-a-custom-plugin) * [Create a custom plugin](connectors/connect-api-section.md#create-a-custom-plugin) * [Read a custom plugin](connectors/connect-api-section.md#read-a-custom-plugin) * [Update a custom plugin](connectors/connect-api-section.md#update-a-custom-plugin) * [Delete a custom plugin](connectors/connect-api-section.md#delete-a-custom-plugin) * [Custom Connector Plugin version API examples](connectors/connect-api-section.md#custom-connector-plugin-version-api-examples) * [Create a custom connector plugin version](connectors/connect-api-section.md#create-a-custom-connector-plugin-version) * [List custom connector plugin version](connectors/connect-api-section.md#list-custom-connector-plugin-version) * [Describe custom connector plugin version](connectors/connect-api-section.md#describe-custom-connector-plugin-version) * [Delete custom connector plugin version](connectors/connect-api-section.md#delete-custom-connector-plugin-version) * [Next Steps](connectors/connect-api-section.md#next-steps) * [Client APIs](client-api.md) * [C++ Client API](https://docs.confluent.io/platform/current/clients/api-docs/librdkafka.html) * [Python Client API](https://docs.confluent.io/platform/current/clients/api-docs/confluent-kafka-python.html) * [Go Client API](https://docs.confluent.io/platform/current/clients/api-docs/confluent-kafka-go.html) * [.NET Client API](https://docs.confluent.io/platform/current/clients/api-docs/confluent-kafka-dotnet.html) * [Provider Integration API](pi-api.md) * [Prerequisites](pi-api.md#prerequisites) * [Manage provider integration](pi-api.md#manage-provider-integration) * [List provider integrations](pi-api.md#list-provider-integrations) * [Register a provider integration](pi-api.md#register-a-provider-integration) * [Read a provider integration](pi-api.md#read-a-provider-integration) * [Delete a provider integration](pi-api.md#delete-a-provider-integration) * [Flink REST API](flink/operate-and-deploy/flink-rest-api.md) * [Prerequisites](flink/operate-and-deploy/flink-rest-api.md#prerequisites) * [Rate limits](flink/operate-and-deploy/flink-rest-api.md#rate-limits) * [Private networking endpoints](flink/operate-and-deploy/flink-rest-api.md#private-networking-endpoints) * [Generate a Flink API key](flink/operate-and-deploy/flink-rest-api.md#generate-a-af-api-key) * [Manage statements](flink/operate-and-deploy/flink-rest-api.md#manage-statements) * [Flink SQL statement schema](flink/operate-and-deploy/flink-rest-api.md#flink-sql-statement-schema) * [Submit a statement](flink/operate-and-deploy/flink-rest-api.md#submit-a-statement) * [Get a statement](flink/operate-and-deploy/flink-rest-api.md#get-a-statement) * [List statements](flink/operate-and-deploy/flink-rest-api.md#list-statements) * [Update metadata for a statement](flink/operate-and-deploy/flink-rest-api.md#update-metadata-for-a-statement) * [Delete a statement](flink/operate-and-deploy/flink-rest-api.md#delete-a-statement) * [Manage compute pools](flink/operate-and-deploy/flink-rest-api.md#manage-compute-pools) * [List Flink compute pools](flink/operate-and-deploy/flink-rest-api.md#list-af-compute-pools) * [Create a Flink compute pool](flink/operate-and-deploy/flink-rest-api.md#create-a-af-compute-pool) * [Read a Flink compute pool](flink/operate-and-deploy/flink-rest-api.md#read-a-af-compute-pool) * [Update a Flink compute 
pool](flink/operate-and-deploy/flink-rest-api.md#update-a-af-compute-pool) * [Delete a Flink compute pool](flink/operate-and-deploy/flink-rest-api.md#delete-a-af-compute-pool) * [List Flink regions](flink/operate-and-deploy/flink-rest-api.md#list-af-regions) * [Manage Flink artifacts](flink/operate-and-deploy/flink-rest-api.md#manage-af-artifacts) * [List Flink artifacts](flink/operate-and-deploy/flink-rest-api.md#list-af-artifacts) * [Create a Flink artifact](flink/operate-and-deploy/flink-rest-api.md#create-a-af-artifact) * [Read an artifact](flink/operate-and-deploy/flink-rest-api.md#read-an-artifact) * [Update an artifact](flink/operate-and-deploy/flink-rest-api.md#update-an-artifact) * [Delete an artifact](flink/operate-and-deploy/flink-rest-api.md#delete-an-artifact) * [Manage UDF logging](flink/operate-and-deploy/flink-rest-api.md#manage-udf-logging) * [Enable logging](flink/operate-and-deploy/flink-rest-api.md#enable-logging) * [List UDF logs](flink/operate-and-deploy/flink-rest-api.md#list-udf-logs) * [Disable a UDF log](flink/operate-and-deploy/flink-rest-api.md#disable-a-udf-log) * [View log details](flink/operate-and-deploy/flink-rest-api.md#view-log-details) * [Update the logging level for a UDF log](flink/operate-and-deploy/flink-rest-api.md#update-the-logging-level-for-a-udf-log) * [Manage connections](flink/operate-and-deploy/flink-rest-api.md#manage-connections) * [Create a connection](flink/operate-and-deploy/flink-rest-api.md#create-a-connection) * [Delete a connection](flink/operate-and-deploy/flink-rest-api.md#delete-a-connection) * [List connections](flink/operate-and-deploy/flink-rest-api.md#list-connections) * [Related content](flink/operate-and-deploy/flink-rest-api.md#related-content) * [Metrics API](https://api.telemetry.confluent.cloud/docs) * [Stream Catalog REST API Usage](stream-governance/stream-catalog-rest-apis.md) * [Catalog API usage examples](stream-governance/stream-catalog-rest-apis.md#catalog-api-usage-examples) * [What’s in this guide](stream-governance/stream-catalog-rest-apis.md#what-s-in-this-guide) * [What this guide doesn’t cover](stream-governance/stream-catalog-rest-apis.md#what-this-guide-doesn-t-cover) * [Catalog API usage limitations and best practices](stream-governance/stream-catalog-rest-apis.md#catalog-api-usage-limitations-and-best-practices) * [Character limits on tag and business metadata definitions](stream-governance/stream-catalog-rest-apis.md#character-limits-on-tag-and-business-metadata-definitions) * [Rate limits on searches](stream-governance/stream-catalog-rest-apis.md#rate-limits-on-searches) * [Limits on topic listings](stream-governance/stream-catalog-rest-apis.md#limits-on-topic-listings) * [Global sorting of search results](stream-governance/stream-catalog-rest-apis.md#global-sorting-of-search-results) * [Business metadata support on Unified Stream Manager entities](stream-governance/stream-catalog-rest-apis.md#business-metadata-support-on-usm-entities) * [Setup and suggestions](stream-governance/stream-catalog-rest-apis.md#setup-and-suggestions) * [OAuth for Confluent Cloud Stream Catalog REST API](stream-governance/stream-catalog-rest-apis.md#oauth-for-ccloud-sg-catalog-rest-api) * [Entity types](stream-governance/stream-catalog-rest-apis.md#entity-types) * [How entities are identified](stream-governance/stream-catalog-rest-apis.md#how-entities-are-identified) * [Qualified name definitions](stream-governance/stream-catalog-rest-apis.md#qualified-name-definitions) * [Examples of qualified 
names](stream-governance/stream-catalog-rest-apis.md#examples-of-qualified-names) * [Entity APIs](stream-governance/stream-catalog-rest-apis.md#entity-apis) * [Searching](stream-governance/stream-catalog-rest-apis.md#searching) * [Tags API examples](stream-governance/stream-catalog-rest-apis.md#tags-api-examples) * [Create a tag](stream-governance/stream-catalog-rest-apis.md#create-a-tag) * [Create a generic tag applicable to a topic or any entity](stream-governance/stream-catalog-rest-apis.md#create-a-generic-tag-applicable-to-a-topic-or-any-entity) * [List all tags (with definitions)](stream-governance/stream-catalog-rest-apis.md#list-all-tags-with-definitions) * [Get tag definition](stream-governance/stream-catalog-rest-apis.md#get-tag-definition) * [Search fields by name](stream-governance/stream-catalog-rest-apis.md#search-fields-by-name) * [Search fields by tag](stream-governance/stream-catalog-rest-apis.md#search-fields-by-tag) * [Search schema record by name](stream-governance/stream-catalog-rest-apis.md#search-schema-record-by-name) * [Search schema by tag](stream-governance/stream-catalog-rest-apis.md#search-schema-by-tag) * [Tag a field in Avro](stream-governance/stream-catalog-rest-apis.md#tag-a-field-in-avro) * [Get the tag attributes from a field](stream-governance/stream-catalog-rest-apis.md#get-the-tag-attributes-from-a-field) * [Tag a schema version](stream-governance/stream-catalog-rest-apis.md#tag-a-schema-version) * [Get schemas with a given a subject name prefix](stream-governance/stream-catalog-rest-apis.md#get-schemas-with-a-given-a-subject-name-prefix) * [Delete a tag](stream-governance/stream-catalog-rest-apis.md#delete-a-tag) * [Topics](stream-governance/stream-catalog-rest-apis.md#topics) * [List all topics](stream-governance/stream-catalog-rest-apis.md#list-all-topics) * [Search for topics by name](stream-governance/stream-catalog-rest-apis.md#search-for-topics-by-name) * [Search for topics by tag](stream-governance/stream-catalog-rest-apis.md#search-for-topics-by-tag) * [Tag a topic](stream-governance/stream-catalog-rest-apis.md#tag-a-topic) * [Add a topic owner and email](stream-governance/stream-catalog-rest-apis.md#add-a-topic-owner-and-email) * [Create a tag for a topic](stream-governance/stream-catalog-rest-apis.md#create-a-tag-for-a-topic) * [Connectors](stream-governance/stream-catalog-rest-apis.md#connectors) * [List all connectors](stream-governance/stream-catalog-rest-apis.md#list-all-connectors) * [Search for connectors by name](stream-governance/stream-catalog-rest-apis.md#search-for-connectors-by-name) * [Tag a connector](stream-governance/stream-catalog-rest-apis.md#tag-a-connector) * [Business metadata API examples](stream-governance/stream-catalog-rest-apis.md#business-metadata-api-examples) * [Create a schema](stream-governance/stream-catalog-rest-apis.md#create-a-schema) * [Create your first business metadata definition](stream-governance/stream-catalog-rest-apis.md#create-your-first-business-metadata-definition) * [Create a business metadata definition for a topic](stream-governance/stream-catalog-rest-apis.md#create-a-business-metadata-definition-for-a-topic) * [Get all the business metadata definitions created so far](stream-governance/stream-catalog-rest-apis.md#get-all-the-business-metadata-definitions-created-so-far) * [Get a specific business metadata definition by its name](stream-governance/stream-catalog-rest-apis.md#get-a-specific-business-metadata-definition-by-its-name) * [Update business metadata 
definitions](stream-governance/stream-catalog-rest-apis.md#update-business-metadata-definitions) * [Add business metadata to a schema-related entity](stream-governance/stream-catalog-rest-apis.md#add-business-metadata-to-a-schema-related-entity) * [Add business metadata to a topic](stream-governance/stream-catalog-rest-apis.md#add-business-metadata-to-a-topic) * [Add business metadata to a connector](stream-governance/stream-catalog-rest-apis.md#add-business-metadata-to-a-connector) * [Get business metadata associated with an instance of an entity](stream-governance/stream-catalog-rest-apis.md#get-business-metadata-associated-with-an-instance-of-an-entity) * [Update business metadata associated with entity](stream-governance/stream-catalog-rest-apis.md#update-business-metadata-associated-with-entity) * [Search for business metadata associated with entity](stream-governance/stream-catalog-rest-apis.md#search-for-business-metadata-associated-with-entity) * [Remove business metadata associated with an entity](stream-governance/stream-catalog-rest-apis.md#remove-business-metadata-associated-with-an-entity) * [Delete business metadata definitions](stream-governance/stream-catalog-rest-apis.md#delete-business-metadata-definitions) * [Related content](stream-governance/stream-catalog-rest-apis.md#related-content) * [GraphQL API](stream-governance/graphql.md) * [Overview](stream-governance/graphql.md#overview) * [What is it?](stream-governance/graphql.md#what-is-it) * [Why is it important?](stream-governance/graphql.md#why-is-it-important) * [When to use REST API and when to use GraphQL API](stream-governance/graphql.md#when-to-use-rest-api-and-when-to-use-graphql-api) * [Getting started](stream-governance/graphql.md#getting-started) * [GraphQL endpoint](stream-governance/graphql.md#graphql-endpoint) * [GraphQL schema](stream-governance/graphql.md#graphql-schema) * [Entity queries](stream-governance/graphql.md#entity-queries) * [Fetch list of entities](stream-governance/graphql.md#fetch-list-of-entities) * [Fetch nested entities using relationships](stream-governance/graphql.md#fetch-nested-entities-using-relationships) * [Filtering using the “where” argument](stream-governance/graphql.md#filtering-using-the-where-argument) * [Sort Using the “order_by” Argument](stream-governance/graphql.md#sort-using-the-order-by-argument) * [Pagination with the “limit” and “offset” Arguments](stream-governance/graphql.md#pagination-with-the-limit-and-offset-arguments) * [Filtering by tag with the “tags” argument](stream-governance/graphql.md#filtering-by-tag-with-the-tags-argument) * [Including deleted objects with the “deleted” argument](stream-governance/graphql.md#including-deleted-objects-with-the-deleted-argument) * [GraphQL API usage limitations and best practices](stream-governance/graphql.md#graphql-api-usage-limitations-and-best-practices) * [Global sorting of search results](stream-governance/graphql.md#global-sorting-of-search-results) * [API limits](stream-governance/graphql.md#api-limits) * [Query limits](stream-governance/graphql.md#query-limits) * [Time limits](stream-governance/graphql.md#time-limits) * [Rate limits](stream-governance/graphql.md#rate-limits) * [API reference](stream-governance/graphql.md#api-reference) * [Related content](stream-governance/graphql.md#related-content) * [Service Quotas API](quotas/quotas.md) * [Get an API key and secret](quotas/quotas.md#get-an-api-key-and-secret) * [Service Quotas API endpoints](quotas/quotas.md#quotas-api-endpoints) * [Query a Service Quotas 
endpoint](quotas/quotas.md#query-a-service-quotas-endpoint) * [Paged responses](quotas/quotas.md#paged-responses) * [RBAC Model](quotas/quotas.md#rbac-model) * [Example requests](quotas/quotas.md#example-requests) * [Filtering](quotas/quotas.md#filtering) * [Query for scopes](quotas/quotas.md#query-for-scopes) * [Organization quotas](quotas/quotas.md#organization-quotas) * [Max BYOK keys per organization](quotas/quotas.md#max-byok-keys-per-organization) * [Max API keys scoped for resource management for an organization](quotas/quotas.md#max-api-keys-scoped-for-resource-management-for-an-organization) * [Max environments for an organization](quotas/quotas.md#max-environments-for-an-organization) * [Max Kafka clusters for an organization](quotas/quotas.md#max-ak-clusters-for-an-organization) * [Max service accounts for an organization](quotas/quotas.md#max-service-accounts-for-an-organization) * [Max user accounts for an organization](quotas/quotas.md#max-user-accounts-for-an-organization) * [Max pending invitations for an organization](quotas/quotas.md#max-pending-invitations-for-an-organization) * [Max audit log consumer API keys per organization](quotas/quotas.md#max-audit-log-consumer-api-keys-per-organization) * [Max Kafka cluster provisioning requests per day](quotas/quotas.md#max-ak-cluster-provisioning-requests-per-day) * [Service account quotas](quotas/quotas.md#service-account-quotas) * [Maximum API keys scoped for resource management per service account](quotas/quotas.md#maximum-api-keys-scoped-for-resource-management-per-service-account) * [Max Kafka API keys per service account](quotas/quotas.md#max-ak-api-keys-per-service-account) * [User account quotas](quotas/quotas.md#user-account-quotas) * [Maximum API keys scoped for resource management per user](quotas/quotas.md#maximum-api-keys-scoped-for-resource-management-per-user) * [Max Cluster API keys per user](quotas/quotas.md#max-cluster-api-keys-per-user) * [Environment quotas](quotas/quotas.md#environment-quotas) * [Get your environment ID](quotas/quotas.md#get-your-environment-id) * [Max clusters for an environment](quotas/quotas.md#max-clusters-for-an-environment) * [Max cluster CKUs for an environment](quotas/quotas.md#max-cluster-ckus-for-an-environment) * [Max pending clusters for an environment](quotas/quotas.md#max-pending-clusters-for-an-environment) * [Max ksqlDB clusters for an environment](quotas/quotas.md#max-ksqldb-clusters-for-an-environment) * [Kafka cluster quotas](quotas/quotas.md#ak-cluster-quotas) * [Get the Kafka cluster ID](quotas/quotas.md#get-the-ak-cluster-id) * [Max API keys per Kafka cluster](quotas/quotas.md#max-api-keys-per-ak-cluster) * [Max private links per Kafka cluster](quotas/quotas.md#max-private-links-per-ak-cluster) * [Max peering connections per Kafka cluster](quotas/quotas.md#max-peering-connections-per-ak-cluster) * [Max CKUs per Kafka cluster](quotas/quotas.md#max-ckus-per-ak-cluster) * [Network quotas](quotas/quotas.md#network-quotas) * [Max peering connections per network](quotas/quotas.md#max-peering-connections-per-network) * [Max private link connections per network](quotas/quotas.md#max-private-link-connections-per-network) * [Related content](quotas/quotas.md#related-content) ## Can I access logs for Confluent Cloud services? 
Internal service logs for Confluent Cloud managed services (such as Kafka brokers, Schema Registry, and other infrastructure components) are not directly accessible to customers, but there are several tools and approaches to help you debug and monitor your streaming applications. **General monitoring and debugging tools:** The [Confluent Cloud Metrics API](monitoring/metrics-api.md#metrics-api) provides actionable operational metrics about your Confluent Cloud deployment. The Confluent Cloud Console shows cluster activity and usage relative to your cluster’s capacity. The Cloud Console also includes [topic management](topics/overview.md#cloud-topics-manage) and [consumer lag monitoring](monitoring/monitor-lag.md#cloud-monitoring-lag). [Build Streaming Applications](client-apps/index.md#ccloud-best-practices) details best practices for configuring, monitoring, and debugging Kafka clients. **Component-specific logging options:** For some Confluent Cloud services, specific logging and monitoring capabilities are available: - **Audit logs**: [Confluent Cloud audit logs](monitoring/audit-logging/cloud-audit-log-concepts.md#cloud-audit-logs) track administrative and data plane activities within your organization. - **Connector events**: [View connector events](connectors/logging-cloud-connectors.md#ccloud-connector-logging) to monitor and troubleshoot your connectors. - **Flink user-defined functions**: [Enable logging in Flink UDFs](flink/how-to-guides/enable-udf-logging.md#flink-sql-enable-udf-logging) for custom application debugging. - **ksqlDB processing logs**: Monitor ksqlDB application health using [ksqlDB processing logs](ksqldb/monitoring-ksqldb.md#cloud-ksql-monitor). For comprehensive monitoring guidance, see [Confluent Cloud Metrics](monitoring/metrics-api.md#metrics-api). ## Kafka * [PR-20633](https://github.com/apache/kafka/pull/20633) - KAFKA-19748: Fix metrics leak in Kafka Streams (#20633) * [PR-20618](https://github.com/apache/kafka/pull/20618) - KAFKA-19690 Add epoch check before verification guard check to prevent unexpected fatal error (#20618) * [PR-20583](https://github.com/apache/kafka/pull/20583) - [MINOR] Cleaning ignored streams test (#20583) * [PR-20604](https://github.com/apache/kafka/pull/20604) - KAFKA-19719 --no-initial-controllers should not assume kraft.version=1 (#20604) * [PR-19961](https://github.com/apache/kafka/pull/19961) - KAFKA-19390: Call safeForceUnmap() in AbstractIndex.resize() on Linux to prevent stale mmap of index files (#19961) * [PR-20591](https://github.com/apache/kafka/pull/20591) - KAFKA-19732, KAFKA-19716: Clear out coordinator snapshots periodically while loading (#20591) * [PR-20581](https://github.com/apache/kafka/pull/20581) - KAFKA-19546: Rebalance should be triggered by subscription change during group protocol downgrade (#20581) * [PR-20519](https://github.com/apache/kafka/pull/20519) - KAFKA-19695: Fix bug in redundant offset calculation.
(#20516) (#20519) * [PR-20512](https://github.com/apache/kafka/pull/20512) - KAFKA-19679: Fix NoSuchElementException in oldest open iterator metric (#20512) * [PR-20470](https://github.com/apache/kafka/pull/20470) - KAFKA-19668: processValue() must be declared as value-changing operation (#20470) * [13f70256](https://github.com/apache/kafka/commit/13f70256db3c994c590e5d262a7cc50b9e973204) - Bump version to 4.1.0 * [70dd1ca2](https://github.com/apache/kafka/commit/70dd1ca2cab81f78c68782659db1d8453b1de5d6) - Revert “Bump version to 4.1.0” * [PR-20405](https://github.com/apache/kafka/pull/20405) - KAFKA-19642 Replace dynamicPerBrokerConfigs with dynamicDefaultConfigs (#20405) * [PR-1777](https://github.com/confluentinc/kafka/pull/1777) - KSECURITY-2558: Bump jetty to version 12.0.25 in 4.1 * [PR-20070](https://github.com/apache/kafka/pull/20070) - KAFKA-19429: Deflake streams_smoke_test, again (#20070) * [PR-20398](https://github.com/apache/kafka/pull/20398) - Revert “KAFKA-13722: remove usage of old ProcessorContext (#18292)” (#20398) * [PR-1765](https://github.com/confluentinc/kafka/pull/1765) - DPA-1801 Add run_tags to worker-ami and aws-packer * [PR-1746](https://github.com/confluentinc/kafka/pull/1746) - Change ci_tools import path * [23b64404](https://github.com/apache/kafka/commit/23b64404ae7ba98d89a2d456991abaf2f32af35f) - Bump version to 4.1.0 * [6340f437](https://github.com/apache/kafka/commit/6340f437cd2d15be4180febb9505437266080002) - Revert “Bump version to 4.1.0” * [de16dd10](https://github.com/apache/kafka/commit/de16dd103af93bb68a329987ff19469941f85cbc) - KAFKA-19581: Temporary fix for Streams system tests * [PR-20269](https://github.com/apache/kafka/pull/20269) - KAFKA-19576 Fix typo in state-change log filename after rotate (#20269) * [PR-20274](https://github.com/apache/kafka/pull/20274) - KAFKA-19529: State updater sensor names should be unique (#20262) (#20274) * [PR-1708](https://github.com/confluentinc/kafka/pull/1708) - DPA-1675: In case of infra failure in ccs-kafka tag that as infra failure in testbreak * [PR-20165](https://github.com/apache/kafka/pull/20165) - KAFKA-19501 Update OpenJDK base image from buster to bullseye (#20165) * [e14d849c](https://github.com/apache/kafka/commit/e14d849cbf8836cc9e4a592342baf19a1fbd93c9) - Bump version to 4.1.0 * [PR-20200](https://github.com/apache/kafka/pull/20200) - KAFKA-19522: avoid electing fenced lastKnownLeader (#20200) * [PR-20196](https://github.com/apache/kafka/pull/20196) - KAFKA-19520 Bump Commons-Lang for CVE-2025-48924 (#20196) * [PR-20040](https://github.com/apache/kafka/pull/20040) - KAFKA-19427 Allow the coordinator to grow its buffer dynamically (#20040) * [PR-20166](https://github.com/apache/kafka/pull/20166) - KAFKA-19504: Remove unused metrics reporter initialization in KafkaAdminClient (#20166) * [PR-20151](https://github.com/apache/kafka/pull/20151) - KAFKA-19495: Update config for native image (v4.1.0) (#20151) * [610f0765](https://github.com/apache/kafka/commit/610f076542e1ac177c4b97ea7d6ca1335f9a3065) - Bump version to 4.1.0 * [PR-1684](https://github.com/confluentinc/kafka/pull/1684) - DPA-1489 migrate from vagrant to terraform * [PR-1693](https://github.com/confluentinc/kafka/pull/1693) - Revert “Temporarily disable artifact publishing for the 4.1 branch.” * [57e81f20](https://github.com/apache/kafka/commit/57e81f201055b58f94febf0509bfc8acba632854) - Bump version to 4.1.0 * [PR-20071](https://github.com/apache/kafka/pull/20071) - KAFKA-19184: Add documentation for upgrading the kraft version (#20071) * 
[PR-20116](https://github.com/apache/kafka/pull/20116) - KAFKA-19444: Add back JoinGroup v0 & v1 (#20116) * [PR-19964](https://github.com/apache/kafka/pull/19964) - KAFKA-19397: Ensure consistent metadata usage in produce request and response (#19964) * [PR-19971](https://github.com/apache/kafka/pull/19971) - KAFKA-19042 Move ProducerSendWhileDeletionTest to client-integration-tests module (#19971) * [PR-20100](https://github.com/apache/kafka/pull/20100) - KAFKA-19453: Ignore group not found in share group record replay (#20100) * [PR-20025](https://github.com/apache/kafka/pull/20025) - KAFKA-19152: Add top-level documentation for OAuth flows (#20025) * [PR-20029](https://github.com/apache/kafka/pull/20029) - KAFKA-19379: Basic upgrade guide for KIP-1071 EA (#20029) * [PR-20062](https://github.com/apache/kafka/pull/20062) - KAFKA-19445: Fix coordinator runtime metrics sharing sensors (#20062) * [PR-19704](https://github.com/apache/kafka/pull/19704) - KAFKA-19246; OffsetFetch API does not return group level errors correctly with version 1 (#19704) * [PR-19985](https://github.com/apache/kafka/pull/19985) - KAFKA-19414: Remove 2PC public APIs from 4.1 until release (KIP-939) (#19985) * [PR-1672](https://github.com/confluentinc/kafka/pull/1672) - DPA-1593 exclude newly added files to fix build * [PR-1663](https://github.com/confluentinc/kafka/pull/1663) - DPA-1593 add cloudwatch metrics to view cpu, memory and disk usage * [PR-20022](https://github.com/apache/kafka/pull/20022) - KAFKA-19398: (De)Register oldest-iterator-open-since-ms metric dynamically (#20022) * [PR-20033](https://github.com/apache/kafka/pull/20033) - KAFKA-19383: Handle the deleted topics when applying ClearElrRecord (#20033) * [PR-19745](https://github.com/apache/kafka/pull/19745) - KAFKA-19294: Fix BrokerLifecycleManager RPC timeouts (#19745) * [PR-19974](https://github.com/apache/kafka/pull/19974) - KAFKA-19411: Fix deleteAcls bug which allows more deletions than max records per user op (#19974) * [PR-19972](https://github.com/apache/kafka/pull/19972) - KAFKA-19407 Fix potential IllegalStateException when appending to timeIndex (#19972) * [PR-1659](https://github.com/confluentinc/kafka/pull/1659) - Reapply “KAFKA-18296 Remove deprecated KafkaBasedLog constructor (#18 * [PR-20019](https://github.com/apache/kafka/pull/20019) - KAFKA-19429: Deflake streams_smoke_test (#20019) * [PR-19999](https://github.com/apache/kafka/pull/19999) - KAFKA-19421: Deflake streams_broker_down_resilience_test (#19999) * [PR-20004](https://github.com/apache/kafka/pull/20004) - KAFKA-19422: Deflake streams_application_upgrade_test (#20004) * [PR-20005](https://github.com/apache/kafka/pull/20005) - KAFKA-19423: Deflake streams_broker_bounce_test (#20005) * [PR-19983](https://github.com/apache/kafka/pull/19983) - KAFKA-19356: Prevent new consumer fetch assigned partitions not in explicit subscription (#19983) * [PR-19917](https://github.com/apache/kafka/pull/19917) - KAFKA-19297: Refactor AsyncKafkaConsumer’s use of Java Streams APIs in critical sections (#19917) * [PR-19981](https://github.com/apache/kafka/pull/19981) - KAFKA-19413: Extended AuthorizerIntegrationTest to cover StreamsGroupDescribe (#19981) * [PR-19978](https://github.com/apache/kafka/pull/19978) - KAFKA-19412: Extended AuthorizerIntegrationTest to cover StreamsGroupHeartbeat (#19978) * [PR-19976](https://github.com/apache/kafka/pull/19976) - KAFKA-19367: Follow up bug fix (#19976) * [PR-19800](https://github.com/apache/kafka/pull/19800) - KAFKA-14145; Faster KRaft HWM replication 
(#19800) * [PR-1655](https://github.com/confluentinc/kafka/pull/1655) - Add back deprecated constructors in KafkaBasedLog * [PR-19938](https://github.com/apache/kafka/pull/19938) - KAFKA-19153: Add OAuth integration tests (#19938) * [PR-19910](https://github.com/apache/kafka/pull/19910) - KAFKA-19367: Fix InitProducerId with TV2 double-increments epoch if ongoing transaction is aborted (#19910) * [PR-19814](https://github.com/apache/kafka/pull/19814) - KAFKA-18117; KAFKA-18729: Use assigned topic IDs to avoid full metadata requests on broker-side regex (#19814) * [PR-19904](https://github.com/apache/kafka/pull/19904) - KAFKA-18961: Time-based refresh for server-side RE2J regex (#19904) * [PR-19939](https://github.com/apache/kafka/pull/19939) - KAFKA-19359: force bump commons-beanutils for CVE-2025-48734 (#19939) * [b311ac7d](https://github.com/apache/kafka/commit/b311ac7dd5bce649fd5bd83a948f95c8c468a9aa) - Temporarily disable artifact publishing for the 4.1 branch. * [PR-19607](https://github.com/apache/kafka/pull/19607) - KAFKA-19221 Propagate IOException on LogSegment#close (#19607) * [PR-19928](https://github.com/apache/kafka/pull/19928) - KAFKA-19389: Fix memory consumption for completed share fetch requests (#19928) * [PR-19895](https://github.com/apache/kafka/pull/19895) - KAFKA-19244: Add support for kafka-streams-groups.sh options (delete offsets) [4/N] (#19895) * [PR-19908](https://github.com/apache/kafka/pull/19908) - KAFKA-19376: Throw an error message if any unsupported feature is used with KIP-1071 (#19908) * [PR-19936](https://github.com/apache/kafka/pull/19936) - KAFKA-19392 Fix metadata.log.segment.ms not being applied (#19936) * [PR-19919](https://github.com/apache/kafka/pull/19919) - KAFKA-19382:Upgrade junit from 5.10 to 5.13 (#19919) * [PR-19929](https://github.com/apache/kafka/pull/19929) - KAFKA-18486 Remove becomeLeaderOrFollower from readFromLogWithOffsetOutOfRange and other related methods. 
(#19929) * [PR-19931](https://github.com/apache/kafka/pull/19931) - KAFKA-19283: Update transaction exception handling documentation (#19931) * [PR-19832](https://github.com/apache/kafka/pull/19832) - KAFKA-19271: allow intercepting internal method call (#19832) * [PR-19918](https://github.com/apache/kafka/pull/19918) - KAFKA-19386: Correcting ExpirationReaper thread names from Purgatory (#19918) * [PR-19817](https://github.com/apache/kafka/pull/19817) - KAFKA-19334 MetadataShell execution unintentionally deletes lock file (#19817) * [PR-19922](https://github.com/apache/kafka/pull/19922) - KAFKA-18486 Remove ReplicaManager#becomeLeaderOrFollower from testReplicaAlterLogDirs (#19922) * [PR-19827](https://github.com/apache/kafka/pull/19827) - KAFKA-19042 Move PlaintextConsumerSubscriptionTest to client-integration-tests module (#19827) * [PR-19883](https://github.com/apache/kafka/pull/19883) - KAFKA-18486 Update testExceptionWhenUnverifiedTransactionHasMultipleProducerIds (#19883) * [PR-19890](https://github.com/apache/kafka/pull/19890) - KAFKA-18486 Update activeProducerState wih KRaft mechanism in ReplicaManagerTest (#19890) * [PR-19879](https://github.com/apache/kafka/pull/19879) - KAFKA-14895 [1/N] Move AddPartitionsToTxnManager files to java (#19879) * [PR-19915](https://github.com/apache/kafka/pull/19915) - KAFKA-19295: Remove AsyncKafkaConsumer event ID generation (#19915) * [PR-19902](https://github.com/apache/kafka/pull/19902) - KAFKA-18202: Add rejection for non-zero sequences in TV2 (KIP-890) (#19902) * [PR-15913](https://github.com/apache/kafka/pull/15913) - KAFKA-19309 : Add transaction client template code in kafka examples (#15913) * [PR-19900](https://github.com/apache/kafka/pull/19900) - KAFKA-19369: Add group.share.assignors config and integration test (#19900) * [PR-19815](https://github.com/apache/kafka/pull/19815) - KAFKA-19290: Exploit mapKey optimisation in protocol requests and responses (wip) (#19815) * [PR-19889](https://github.com/apache/kafka/pull/19889) - KAFKA-18913: Start state updater in task manager (#19889) * [PR-19907](https://github.com/apache/kafka/pull/19907) - KAFKA-19370: Create JMH benchmark for share group assignor (#19907) * [PR-19773](https://github.com/apache/kafka/pull/19773) - KAFKA-19042 Move PlaintextConsumerAssignTest to clients-integration-tests module (#19773) * [PR-18739](https://github.com/apache/kafka/pull/18739) - KAFKA-16505: Add source raw key and value (#18739) * [PR-19903](https://github.com/apache/kafka/pull/19903) - KAFKA-19373 Fix protocol name comparison (#19903) * [PR-18325](https://github.com/apache/kafka/pull/18325) - KAFKA-19248: Multiversioning in Kafka Connect - Plugin Loading Isolation Tests (#18325) * [PR-19844](https://github.com/apache/kafka/pull/19844) - KAFKA-18042: Reject the produce request with lower producer epoch early (KIP-890) (#19844) * [PR-19901](https://github.com/apache/kafka/pull/19901) - KAFKA-19372: StreamsGroup not subscribed to a topic when empty (#19901) * [PR-19722](https://github.com/apache/kafka/pull/19722) - KAFKA-19044: Handle tasks that are not present in the current topology (#19722) * [PR-19856](https://github.com/apache/kafka/pull/19856) - KAFKA-17747: [7/N] Add consumer group integration test for rack aware assignment (#19856) * [PR-19898](https://github.com/apache/kafka/pull/19898) - KAFKA-19347 Deduplicate ACLs when creating (#19898) * [PR-19872](https://github.com/apache/kafka/pull/19872) - KAFKA-19328: SharePartitionManagerTest testMultipleConcurrentShareFetches doAnswer chaining needs 
verification (#19872) * [PR-19758](https://github.com/apache/kafka/pull/19758) - KAFKA-19244: Add support for kafka-streams-groups.sh options (delete all groups) [2/N] (#19758) * [PR-19754](https://github.com/apache/kafka/pull/19754) - KAFKA-18573: Add support for OAuth jwt-bearer grant type (#19754) * [PR-19656](https://github.com/apache/kafka/pull/19656) - KAFKA-19250 : txnProducer.abortTransaction() API should not return abortable exception (#19656) * [PR-19522](https://github.com/apache/kafka/pull/19522) - KAFKA-19176: Update Transactional producer to translate retriable into abortable exceptions (#19522) * [PR-19796](https://github.com/apache/kafka/pull/19796) - KAFKA-17747: [6/N] Replace subscription metadata with metadata hash in share group (#19796) * [PR-19802](https://github.com/apache/kafka/pull/19802) - KAFKA-17747: [5/N] Replace subscription metadata with metadata hash in stream group (#19802) * [PR-19861](https://github.com/apache/kafka/pull/19861) - KAFKA-19338: Error on read/write of uninitialized share part. (#19861) * [PR-19878](https://github.com/apache/kafka/pull/19878) - KAFKA-19358: Updated share_consumer_test.py tests to use set_group_offset_reset_strategy (#19878) * [PR-19849](https://github.com/apache/kafka/pull/19849) - KAFKA-19349 Move CreateTopicsRequestWithPolicyTest to clients-integration-tests (#19849) * [PR-19877](https://github.com/apache/kafka/pull/19877) - KAFKA-18904: [4/N] Add ListClientMetricsResources metric if request is v0 ListConfigResources (#19877) * [PR-19811](https://github.com/apache/kafka/pull/19811) - KAFKA-19320: Added share_consume_bench_test.py system tests (#19811) * [PR-19831](https://github.com/apache/kafka/pull/19831) - KAFKA-16894: share.version becomes stable feature for preview (#19831) * [PR-19836](https://github.com/apache/kafka/pull/19836) - KAFKA-19321: Added share_consumer_performance.py and related system tests (#19836) * [PR-19866](https://github.com/apache/kafka/pull/19866) - KAFKA-19355 Remove interBrokerListenerName from ClusterControlManager (#19866) * [PR-19327](https://github.com/apache/kafka/pull/19327) - KAFKA-19053 Remove FetchResponse#of which is not used in production … (#19327) * [PR-19728](https://github.com/apache/kafka/pull/19728) - KAFKA-19284 Add documentation to clarify the behavior of null values for all partitionsToOffsetAndMetadata methods. 
(#19728) * [PR-19864](https://github.com/apache/kafka/pull/19864) - KAFKA-19311 Document commitAsync behavioral differences between Classic and Async Consumer (#19864) * [PR-19685](https://github.com/apache/kafka/pull/19685) - KAFKA-19042 Move GroupAuthorizerIntegrationTest to clients-integration-tests module (#19685) * [PR-19651](https://github.com/apache/kafka/pull/19651) - KAFKA-19042 Move BaseConsumerTest, SaslPlainPlaintextConsumerTest to client-integration-tests module (#19651) * [PR-19846](https://github.com/apache/kafka/pull/19846) - KAFKA-19346: Move LogReadResult to server module (#19846) * [PR-19855](https://github.com/apache/kafka/pull/19855) - KAFKA-19351: AsyncConsumer#commitAsync should copy the input offsets (#19855) * [PR-19810](https://github.com/apache/kafka/pull/19810) - KAFKA-19042 move ConsumerWithLegacyMessageFormatIntegrationTest to clients-integration-tests module (#19810) * [PR-19808](https://github.com/apache/kafka/pull/19808) - KAFKA-18904: kafka-configs.sh return resource doesn’t exist message [3/N] (#19808) * [PR-19714](https://github.com/apache/kafka/pull/19714) - KAFKA-19082:[4/4] Complete Txn Client Side Changes (KIP-939) (#19714) * [PR-19404](https://github.com/apache/kafka/pull/19404) - KAFKA-6629: parameterise SegmentedCacheFunctionTest for session key schemas (#19404) * [PR-19843](https://github.com/apache/kafka/pull/19843) - KAFKA-19337: Write state writes snapshot for higher state epoch. (#19843) * [PR-19741](https://github.com/apache/kafka/pull/19741) - KAFKA-19056 Rewrite EndToEndClusterIdTest in Java and move it to the server module (#19741) * [PR-19774](https://github.com/apache/kafka/pull/19774) - KAFKA-19316: added share_group_command_test.py system tests (#19774) * [PR-19838](https://github.com/apache/kafka/pull/19838) - KAFKA-19344: Replace desc.assignablePartitions with spec.isPartitionAssignable. 
(#19838) * [PR-19840](https://github.com/apache/kafka/pull/19840) - KAFKA-19347 Don’t update timeline data structures in createAcls (#19840) * [PR-19826](https://github.com/apache/kafka/pull/19826) - KAFKA-19342: Authorization tests for alter share-group offsets (#19826) * [PR-19818](https://github.com/apache/kafka/pull/19818) - KAFKA-19335: Membership managers send negative epoch in JOINING (#19818) * [PR-19778](https://github.com/apache/kafka/pull/19778) - KAFKA-19285: Added more tests in SharePartitionManagerTest (#19778) * [PR-19823](https://github.com/apache/kafka/pull/19823) - KAFKA-19310: (MINOR) Missing mocks for DelayedShareFetchTest tests related to Memory Records slicing (#19823) * [PR-19761](https://github.com/apache/kafka/pull/19761) - KAFKA-17747: [4/N] Replace subscription metadata with metadata hash in consumer group (#19761) * [PR-19835](https://github.com/apache/kafka/pull/19835) - KAFKA-19336 Upgrade Jackson to 2.19.0 (#19835) * [PR-19744](https://github.com/apache/kafka/pull/19744) - KAFKA-19154; Offset Fetch API should return INVALID_OFFSET if requested topic id does not match persisted one (#19744) * [PR-19812](https://github.com/apache/kafka/pull/19812) - KAFKA-19330 Change MockSerializer/Deserializer to use String serializer instead of byte[] (#19812) * [PR-19790](https://github.com/apache/kafka/pull/19790) - KAFKA-18687: Setting the subscriptionMetadata during conversion to consumer group (#19790) * [PR-19786](https://github.com/apache/kafka/pull/19786) - KAFKA-19268 Missing mocks for SharePartitionManagerTest tests (#19786) * [PR-19798](https://github.com/apache/kafka/pull/19798) - KAFKA-19322 Remove the DelayedOperation constructor that accepts an external lock (#19798) * [PR-19779](https://github.com/apache/kafka/pull/19779) - KAFKA-19300 AsyncConsumer#unsubscribe always timeout due to GroupAuthorizationException (#19779) * [PR-19093](https://github.com/apache/kafka/pull/19093) - KAFKA-18424: Consider splitting PlaintextAdminIntegrationTest#testConsumerGroups (#19093) * [PR-19371](https://github.com/apache/kafka/pull/19371) - KAFKA-19080 The constraint on segment.ms is not enforced at topic level (#19371) * [PR-19681](https://github.com/apache/kafka/pull/19681) - KAFKA-19034 [1/N] Rewrite RemoteTopicCrudTest by ClusterTest and move it to storage module (#19681) * [PR-19759](https://github.com/apache/kafka/pull/19759) - KAFKA-19312 Avoiding concurrent execution of onComplete and tryComplete (#19759) * [PR-19767](https://github.com/apache/kafka/pull/19767) - KAFKA-19313 Replace LogOffsetMetadata#UNIFIED_LOG_UNKNOWN_OFFSET by UnifiedLog.UNKNOWN_OFFSET (#19767) * [PR-19747](https://github.com/apache/kafka/pull/19747) - KAFKA-18345; Wait the entire election timeout on election loss (#19747) * [PR-19687](https://github.com/apache/kafka/pull/19687) - KAFKA-19260 Move LoggingController to server module (#19687) * [PR-18929](https://github.com/apache/kafka/pull/18929) - KAFKA-16717 [2/N]: Add AdminClient.alterShareGroupOffsets (#18929) * [PR-19729](https://github.com/apache/kafka/pull/19729) - KAFKA-19069 DumpLogSegments does not dump the LEADER_CHANGE record (#19729) * [PR-19781](https://github.com/apache/kafka/pull/19781) - KAFKA-19204: Add timestamp to share state metadata init maps [1/N] (#19781) * [PR-19582](https://github.com/apache/kafka/pull/19582) - KAFKA-19042 Move PlaintextConsumerPollTest to client-integration-tests module (#19582) * [PR-19763](https://github.com/apache/kafka/pull/19763) - KAFKA-19314 Remove unnecessary code of closing snapshotWriter (#19763) 
* [PR-19743](https://github.com/apache/kafka/pull/19743) - KAFKA-18904: Add Admin#listConfigResources [2/N] (#19743) * [PR-19757](https://github.com/apache/kafka/pull/19757) - KAFKA-19291: Increase the timeout of remote storage share fetch requests in purgatory (#19757) * [PR-18951](https://github.com/apache/kafka/pull/18951) - KAFKA-4650: Add unit tests for GraphNode class (#18951) * [PR-19749](https://github.com/apache/kafka/pull/19749) - KAFKA-19287 document all group coordinator metrics (#19749) * [PR-19731](https://github.com/apache/kafka/pull/19731) - KAFKA-18783 : Extend InvalidConfigurationException related exceptions (#19731) * [PR-19658](https://github.com/apache/kafka/pull/19658) - KAFKA-18345; Prevent livelocked elections (#19658) * [PR-1627](https://github.com/confluentinc/kafka/pull/1627) - Trigger cp-jar-build to verify CP packaging in after_pipeline job * [PR-19755](https://github.com/apache/kafka/pull/19755) - KAFKA-19302 Move ReplicaState and Replica to server module (#19755) * [PR-19389](https://github.com/apache/kafka/pull/19389) - KAFKA-19042 Move PlaintextConsumerCommitTest to client-integration-tests module (#19389) * [PR-19611](https://github.com/apache/kafka/pull/19611) - KAFKA-17747: [3/N] Get rid of TopicMetadata in SubscribedTopicDescriberImpl (#19611) * [PR-19700](https://github.com/apache/kafka/pull/19700) - KAFKA-19202: Enable KIP-1071 in streams_eos_test (#19700) * [PR-19717](https://github.com/apache/kafka/pull/19717) - KAFKA-19280: Fix NoSuchElementException in UnifiedLog (#19717) * [PR-19691](https://github.com/apache/kafka/pull/19691) - KAFKA-19256: Only send IQ metadata on assignment changes (#19691) * [PR-19708](https://github.com/apache/kafka/pull/19708) - KAFKA-19226: Added test_console_share_consumer.py (#19708) * [PR-19683](https://github.com/apache/kafka/pull/19683) - KAFKA-19141; Persist topic id in OffsetCommit record (#19683) * [PR-19697](https://github.com/apache/kafka/pull/19697) - KAFKA-19271: Add internal ConsumerWrapper (#19697) * [PR-1625](https://github.com/confluentinc/kafka/pull/1625) - Increase timeout for Connect tests * [PR-19734](https://github.com/apache/kafka/pull/19734) - KAFKA-19217: Fix ShareConsumerTest.testComplexConsumer flakiness. (#19734) * [PR-19507](https://github.com/apache/kafka/pull/19507) - KAFKA-19171: Kafka Streams crashes with UnsupportedOperationException (#19507) * [PR-19709](https://github.com/apache/kafka/pull/19709) - KAFKA-19267 the min version used by ListOffsetsRequest should be 1 rather than 0 (#19709) * [PR-19580](https://github.com/apache/kafka/pull/19580) - KAFKA-19208: KStream-GlobalKTable join should not drop left-null-key record (#19580) * [PR-19493](https://github.com/apache/kafka/pull/19493) - KAFKA-18904: [1/N] Change ListClientMetricsResources API to ListConfigResources (#19493) * [PR-19713](https://github.com/apache/kafka/pull/19713) - KAFKA-19274; Group Coordinator Shards are not unloaded when \_\_consumer_offsets topic is deleted (#19713) * [PR-19701](https://github.com/apache/kafka/pull/19701) - KAFKA-19231-1: Handle fetch request when share session cache is full (#19701) * [PR-19721](https://github.com/apache/kafka/pull/19721) - KAFKA-19281: Add share enable flag to periodic jobs. (#19721) * [PR-19523](https://github.com/apache/kafka/pull/19523) - KAFKA-17747: [2/N] Add compute topic and group hash (#19523) * [PR-19698](https://github.com/apache/kafka/pull/19698) - KAFKA-19269 Unexpected error .. 
should not happen when the delete.topic.enable is false (#19698) * [PR-19718](https://github.com/apache/kafka/pull/19718) - KAFKA-19270: Remove Optional from ClusterInstance#controllerListenerName() return type (#19718) * [PR-19539](https://github.com/apache/kafka/pull/19539) - KAFKA-19082:[3/4] Add prepare txn method (KIP-939) (#19539) * [PR-19586](https://github.com/apache/kafka/pull/19586) - KAFKA-18666: Controller-side monitoring for broker shutdown and startup (#19586) * [PR-19635](https://github.com/apache/kafka/pull/19635) - KAFKA-19234: broker should return UNAUTHORIZATION error for non-existing topic in produce request (#19635) * [PR-19702](https://github.com/apache/kafka/pull/19702) - KAFKA-19273 Ensure the delete policy is configured when the tiered storage is enabled (#19702) * [PR-19553](https://github.com/apache/kafka/pull/19553) - KAFKA-19091 Fix race condition in DelayedFutureTest (#19553) * [PR-19666](https://github.com/apache/kafka/pull/19666) - KAFKA-19116, KAFKA-19258: Handling share group member change events (#19666) * [PR-19569](https://github.com/apache/kafka/pull/19569) - KAFKA-19206 ConsumerNetworkThread.cleanup() throws NullPointerException if initializeResources() previously failed (#19569) * [PR-19712](https://github.com/apache/kafka/pull/19712) - KAFKA-19275 client-state and thread-state metrics are always “Unavailable” (#19712) * [PR-19630](https://github.com/apache/kafka/pull/19630) - KAFKA-19145 Move LeaderEndPoint to Server module (#19630) * [PR-19622](https://github.com/apache/kafka/pull/19622) - KAFKA-18847: Refactor OAuth layer to improve reusability 1/N (#19622) * [PR-19677](https://github.com/apache/kafka/pull/19677) - KAFKA-18688: Fix uniform homogeneous assignor stability (#19677) * [PR-19659](https://github.com/apache/kafka/pull/19659) - KAFKA-19253: Improve metadata handling for share version using feature listeners (1/N) (#19659) * [PR-19559](https://github.com/apache/kafka/pull/19559) - KAFKA-19201: Handle deletion of user topics part of share partitions. (#19559) * [PR-19515](https://github.com/apache/kafka/pull/19515) - KAFKA-14691; Add TopicId to OffsetFetch API (#19515) * [PR-19705](https://github.com/apache/kafka/pull/19705) - KAFKA-19245: Updated default locks config for share group (#19705) * [PR-19496](https://github.com/apache/kafka/pull/19496) - KAFKA-19163: Avoid deleting groups with pending transactional offsets (#19496) * [PR-1554](https://github.com/confluentinc/kafka/pull/1554) - Chore: update repo by service bot * [PR-19644](https://github.com/apache/kafka/pull/19644) - KAFKA-18905; Disable idempotent producer to remove test flakiness (#19644) * [PR-19631](https://github.com/apache/kafka/pull/19631) - KAFKA-19242: Fix commit bugs caused by race condition during rebalancing. 
(#19631) * [PR-19497](https://github.com/apache/kafka/pull/19497) - KAFKA-19160;KAFKA-19164; Improve performance of fetching stable offsets (#19497) * [PR-19633](https://github.com/apache/kafka/pull/19633) - KAFKA-18695 Remove quorum=kraft and kip932 from all integration tests (#19633) * [PR-19673](https://github.com/apache/kafka/pull/19673) - KAFKA-19264 Remove fallback for thread pool sizes in RemoteLogManagerConfig (#19673) * [PR-19346](https://github.com/apache/kafka/pull/19346) - KAFKA-19068 Eliminate the duplicate type check in creating ControlRecord (#19346) * [PR-19543](https://github.com/apache/kafka/pull/19543) - KAFKA-19109 Don’t print null in kafka-metadata-quorum describe status (#19543) * [PR-19650](https://github.com/apache/kafka/pull/19650) - KAFKA-19220 Add tests to ensure the internal configs don’t return by public APIs by default (#19650) * [PR-1623](https://github.com/confluentinc/kafka/pull/1623) - KBROKER-295: Ignore failing quota_test * [PR-1622](https://github.com/confluentinc/kafka/pull/1622) - KBROKER-295: Ignore failing quota_test * [PR-19508](https://github.com/apache/kafka/pull/19508) - KAFKA-17897: Deprecate Admin.listConsumerGroups [2/N] (#19508) * [PR-19657](https://github.com/apache/kafka/pull/19657) - KAFKA-19209: Clarify index.interval.bytes impact on offset and time index (#19657) * [PR-18391](https://github.com/apache/kafka/pull/18391) - KAFKA-18115; Fix for loading big files while performing load tests (#18391) * [PR-19608](https://github.com/apache/kafka/pull/19608) - KAFKA-19182 Move SchedulerTest to server module (#19608) * [PR-19568](https://github.com/apache/kafka/pull/19568) - KAFKA-19087 Move TransactionState to transaction-coordinator module (#19568) * [PR-19581](https://github.com/apache/kafka/pull/19581) - KAFKA-18855 Slice API for MemoryRecords (#19581) * [PR-19590](https://github.com/apache/kafka/pull/19590) - KAFKA-19212: Correct the unclean leader election metric calculation (#19590) * [PR-19609](https://github.com/apache/kafka/pull/19609) - KAFKA-19214: Clean up use of Optionals in RequestManagers.entries() (#19609) * [PR-19640](https://github.com/apache/kafka/pull/19640) - KAFKA-19241: Updated tests in ShareFetchAcknowledgeRequestTest to reuse the socket for subsequent requests (#19640) * [PR-19598](https://github.com/apache/kafka/pull/19598) - KAFKA-19215: Handle share partition fetch lock cleanly using tokens (#19598) * [PR-19625](https://github.com/apache/kafka/pull/19625) - KAFKA-19202: Enable KIP-1071 in streams_standby_replica_test.py (#19625) * [PR-19602](https://github.com/apache/kafka/pull/19602) - KAFKA-19218: Add missing leader epoch to share group state summary response (#19602) * [PR-19574](https://github.com/apache/kafka/pull/19574) - KAFKA-19207 Move ForwardingManagerMetrics and ForwardingManagerMetricsTest to server module (#19574) * [PR-19528](https://github.com/apache/kafka/pull/19528) - KAFKA-19170 Move MetricsDuringTopicCreationDeletionTest to client-integration-tests module (#19528) * [PR-19612](https://github.com/apache/kafka/pull/19612) - KAFKA-19227: Piggybacked share fetch acknowledgements performance issue (#19612) * [PR-19639](https://github.com/apache/kafka/pull/19639) - KAFKA-19216: Eliminate flakiness in kafka.server.share.SharePartitionTest (#19639) * [PR-19592](https://github.com/apache/kafka/pull/19592) - KAFKA-19133: Support fetching for multiple remote fetch topic partitions in a single share fetch request (#19592) * [PR-19641](https://github.com/apache/kafka/pull/19641) - KAFKA-19240 Move 
MetadataVersionIntegrationTest to clients-integration-tests module (#19641) * [PR-19619](https://github.com/apache/kafka/pull/19619) - KAFKA-19232: Handle Share session limit reached exception in clients. (#19619) * [PR-19629](https://github.com/apache/kafka/pull/19629) - KAFKA-19131: Adjust remote storage reader thread maximum pool size to avoid illegal argument (#19629) * [PR-19393](https://github.com/apache/kafka/pull/19393) - KAFKA-19060 Documented null edge cases in the Clients API JavaDoc (#19393) * [PR-19578](https://github.com/apache/kafka/pull/19578) - KAFKA-19205: inconsistent result of beginningOffsets/endoffset between classic and async consumer with 0 timeout (#19578) * [PR-19571](https://github.com/apache/kafka/pull/19571) - KAFKA-18267 Add unit tests for CloseOptions (#19571) * [PR-19603](https://github.com/apache/kafka/pull/19603) - KAFKA-19204: Allow persister retry of initializing topics. (#19603) * [PR-1621](https://github.com/confluentinc/kafka/pull/1621) - Dexcom fix master * [PR-1620](https://github.com/confluentinc/kafka/pull/1620) - Dexcom fix 4.0 * [PR-19475](https://github.com/apache/kafka/pull/19475) - KAFKA-19146 Merge OffsetAndEpoch from raft to server-common (#19475) * [PR-19606](https://github.com/apache/kafka/pull/19606) - KAFKA-16894 Correct definition of ShareVersion (#19606) * [PR-19355](https://github.com/apache/kafka/pull/19355) - KAFKA-19073 add transactional ID pattern filter to ListTransactions (#19355) * [PR-19430](https://github.com/apache/kafka/pull/19430) - KAFKA-17541:[1/2] Improve handling of delivery count (#19430) * [PR-19329](https://github.com/apache/kafka/pull/19329) - KAFKA-19015: Remove share session from cache on share consumer connection drop (#19329) * [PR-19540](https://github.com/apache/kafka/pull/19540) - KAFKA-19169: Enhance AuthorizerIntegrationTest for share group APIs (#19540) * [PR-19604](https://github.com/apache/kafka/pull/19604) - KAFKA-19202: Enable KIP-1071 in streams_relational_smoke_test (#19604) * [PR-19587](https://github.com/apache/kafka/pull/19587) - KAFKA-16718-4/n: ShareGroupCommand changes for DeleteShareGroupOffsets admin call (#19587) * [PR-19601](https://github.com/apache/kafka/pull/19601) - KAFKA-19210: resolved the flakiness in testShareGroupHeartbeatInitializeOnPartitionUpdate (#19601) * [PR-19542](https://github.com/apache/kafka/pull/19542) - KAFKA-16894: Exploit share feature [3/N] (#19542) * [PR-19594](https://github.com/apache/kafka/pull/19594) - KAFKA-19202: Enable KIP-1071 in streams_broker_down_resilience_test (#19594) * [PR-19509](https://github.com/apache/kafka/pull/19509) - KAFKA-19173: Add Feature for “streams” group (#19509) * [PR-19191](https://github.com/apache/kafka/pull/19191) - KAFKA-18760: Deprecate Optional and return String from public Endpoint#listener (#19191) * [PR-19519](https://github.com/apache/kafka/pull/19519) - KAFKA-19139 Plugin#wrapInstance should use LinkedHashMap instead of Map (#19519) * [PR-19588](https://github.com/apache/kafka/pull/19588) - KAFKA-19135 Migrate initial IQ support for KIP-1071 from feature branch to trunk (#19588) * [PR-15968](https://github.com/apache/kafka/pull/15968) - KAFKA-10551: Add topic id support to produce request and response (#15968) * [PR-19470](https://github.com/apache/kafka/pull/19470) - KAFKA-19082: [2/4] Add preparedTxnState class to Kafka Producer (KIP-939) (#19470) * [PR-19584](https://github.com/apache/kafka/pull/19584) - KAFKA-19202: Enable KIP-1071 in streams_broker_bounce_test.py (#19584) * 
[PR-19593](https://github.com/apache/kafka/pull/19593) - KAFKA-19181-2: Increased offsets.commit.timeout.ms value as a temporary solution for the system test test_broker_failure failure (#19593) * [PR-19560](https://github.com/apache/kafka/pull/19560) - KAFKA-19202: Enable KIP-1071 in streams_smoke_test.py (#19560) * [PR-19555](https://github.com/apache/kafka/pull/19555) - KAFKA-19195: Only send the right group ID subset to each GC shard (#19555) * [PR-19535](https://github.com/apache/kafka/pull/19535) - KAFKA-19183 Replace Pool with ConcurrentHashMap (#19535) * [PR-19529](https://github.com/apache/kafka/pull/19529) - KAFKA-19178 Replace Vector by ArrayList for PluginClassLoader#getResources (#19529) * [PR-19478](https://github.com/apache/kafka/pull/19478) - KAFKA-16718-3/n: Added the ShareGroupStatePartitionMetadata record during deletion of share group offsets (#19478) * [PR-19520](https://github.com/apache/kafka/pull/19520) - KAFKA-19042 Move PlaintextConsumerFetchTest to client-integration-tests module (#19520) * [PR-19532](https://github.com/apache/kafka/pull/19532) - KAFKA-19131: Adjust remote storage reader thread maximum pool size to avoid illegal argument (#19532) * [PR-19504](https://github.com/apache/kafka/pull/19504) - KAFKA-17747: [1/N] Add MetadataHash field to Consumer/Share/StreamGroupMetadataValue (#19504) * [PR-19544](https://github.com/apache/kafka/pull/19544) - KAFKA-19190: Handle shutdown application correctly (#19544) * [PR-19552](https://github.com/apache/kafka/pull/19552) - KAFKA-19198: Resolve NPE when topic assigned in share group is deleted (#19552) * [PR-19548](https://github.com/apache/kafka/pull/19548) - KAFKA-19195: Only send the right group ID subset to each GC shard (#19548) * [PR-19450](https://github.com/apache/kafka/pull/19450) - KAFKA-19128: Kafka Streams should not get offsets when close dirty (#19450) * [PR-19545](https://github.com/apache/kafka/pull/19545) - KAFKA-19192; Old bootstrap checkpoint files cause problems for updated servers (#19545) * [PR-17988](https://github.com/apache/kafka/pull/17988) - KAFKA-18988: Connect Multiversion Support (Updates to status and metrics) (#17988) * [PR-19429](https://github.com/apache/kafka/pull/19429) - KAFKA-19082: [1/4] Add client config for enable2PC and overloaded initProducerId (KIP-939) (#19429) * [PR-19536](https://github.com/apache/kafka/pull/19536) - KAFKA-18889: Make records in ShareFetchResponse non-nullable (#19536) * [PR-19457](https://github.com/apache/kafka/pull/19457) - KAFKA-19110: Add missing unit test for Streams-consumer integration (#19457) * [PR-19440](https://github.com/apache/kafka/pull/19440) - KAFKA-15767 Refactor TransactionManager to avoid use of ThreadLocal (#19440) * [PR-19453](https://github.com/apache/kafka/pull/19453) - KAFKA-19124: Follow up on code improvements (#19453) * [PR-19443](https://github.com/apache/kafka/pull/19443) - KAFKA-18170: Add scheduled job to snapshot cold share partitions. 
(#19443) * [PR-19505](https://github.com/apache/kafka/pull/19505) - KAFKA-19156: Streamlined share group configs, with usage in ShareSessionCache (#19505) * [PR-19541](https://github.com/apache/kafka/pull/19541) - KAFKA-19181: removed assertions in test_share_multiple_partitions as a result of change in assignor algorithm (#19541) * [PR-19461](https://github.com/apache/kafka/pull/19461) - KAFKA-14690; Add TopicId to OffsetCommit API (#19461) * [PR-19416](https://github.com/apache/kafka/pull/19416) - KAFKA-16538; Enable upgrading kraft version for existing clusters (#19416) * [PR-18673](https://github.com/apache/kafka/pull/18673) - KAFKA-18572: Update Kafka Streams metric documentation (#18673) * [PR-19500](https://github.com/apache/kafka/pull/19500) - KAFKA-19159: Removed time based evictions for share sessions (#19500) * [PR-19378](https://github.com/apache/kafka/pull/19378) - KAFKA-19057: Stabilize KIP-932 RPCs for AK 4.1 (#19378) * [PR-19518](https://github.com/apache/kafka/pull/19518) - KAFKA-19166: Fix RC tag in release script (#19518) * [PR-19525](https://github.com/apache/kafka/pull/19525) - KAFKA-19179: remove the dot from thread_dump_url (#19525) * [PR-19437](https://github.com/apache/kafka/pull/19437) - KAFKA-19019: Add support for remote storage fetch for share groups (#19437) * [PR-17099](https://github.com/apache/kafka/pull/17099) - KAFKA-8830 make Record Headers available in onAcknowledgement (#17099) * [PR-19526](https://github.com/apache/kafka/pull/19526) - KAFKA-19180 Fix the hanging testPendingTaskSize (#19526) * [PR-19302](https://github.com/apache/kafka/pull/19302) - KAFKA-14487: Move LogManager static methods/fields to storage module (#19302) * [PR-19487](https://github.com/apache/kafka/pull/19487) - KAFKA-18854 remove DynamicConfig inner class (#19487) * [PR-19286](https://github.com/apache/kafka/pull/19286) - KAFKA-18891: Add KIP-877 support to RemoteLogMetadataManager and RemoteStorageManager (#19286) * [PR-19462](https://github.com/apache/kafka/pull/19462) - KAFKA-17184: Fix the error thrown while accessing the RemoteIndexCache (#19462) * [PR-19477](https://github.com/apache/kafka/pull/19477) - KAFKA-17897 Deprecate Admin.listConsumerGroups (#19477) * [PR-18926](https://github.com/apache/kafka/pull/18926) - KAFKA-18332 fix ClassDataAbstractionCoupling problem in KafkaRaftClientTest(1/2) (#18926) * [PR-19465](https://github.com/apache/kafka/pull/19465) - KAFKA-19136 Move metadata-related configs from KRaftConfigs to MetadataLogConfig (#19465) * [PR-19503](https://github.com/apache/kafka/pull/19503) - KAFKA-19157: added group.share.max.share.sessions config (#19503) * [PR-1614](https://github.com/confluentinc/kafka/pull/1614) - CONFLUENT: Fix tools-log4j files in the scripts * [PR-1613](https://github.com/confluentinc/kafka/pull/1613) - CONFLUENT: Fix tools-log4j files names in the scripts * [PR-19474](https://github.com/apache/kafka/pull/19474) - KAFKA-14523: Move kafka.log.remote classes to storage (#19474) * [PR-19491](https://github.com/apache/kafka/pull/19491) - KAFKA-19162: Topology metadata contains non-deterministically ordered topic configs (#19491) * [PR-19394](https://github.com/apache/kafka/pull/19394) - KAFKA-19054: StreamThread exception handling with SHUTDOWN_APPLICATION may trigger a tight loop with MANY logs (#19394) * [PR-19454](https://github.com/apache/kafka/pull/19454) - KAFKA-19130: Do not add fenced brokers to BrokerRegistrationTracker on startup (#19454) * [PR-19460](https://github.com/apache/kafka/pull/19460) - KAFKA-19002 Rewrite 
ListOffsetsIntegrationTest and move it to clients-integration-test (#19460) * [PR-19492](https://github.com/apache/kafka/pull/19492) - KAFKA-19158: Add SHARE_SESSION_LIMIT_REACHED error code (#19492) * [PR-19488](https://github.com/apache/kafka/pull/19488) - KAFKA-19147: Start authorizer before group coordinator to ensure coordinator authorizes regex topics (#19488) * [PR-19298](https://github.com/apache/kafka/pull/19298) - KAFKA-19042 Move PlaintextConsumerCallbackTest to client-integration-tests module (#19298) * [PR-19472](https://github.com/apache/kafka/pull/19472) - KAFKA-13610: Deprecate log.cleaner.enable configuration (#19472) * [PR-19050](https://github.com/apache/kafka/pull/19050) - KAFKA-18888: Add KIP-877 support to Authorizer (#19050) * [PR-19420](https://github.com/apache/kafka/pull/19420) - KAFKA-18983 Ensure all README.md(s) are mentioned by the root README.md (#19420) * [PR-19433](https://github.com/apache/kafka/pull/19433) - KAFKA-18288: Add support kafka-streams-groups.sh --describe (#19433) * [PR-19464](https://github.com/apache/kafka/pull/19464) - KAFKA-19137 Use StandardCharsets.UTF_8 instead of StandardCharsets.UTF_8.name() (#19464) * [PR-19364](https://github.com/apache/kafka/pull/19364) - KAFKA-15370: ACL changes to support 2PC (KIP-939) (#19364) * [PR-19417](https://github.com/apache/kafka/pull/19417) - KAFKA-18900: Implement share.acknowledgement.mode to choose acknowledgement mode (#19417) * [PR-19463](https://github.com/apache/kafka/pull/19463) - KAFKA-18629: Account for existing deleting topics in share group delete. (#19463) * [PR-19319](https://github.com/apache/kafka/pull/19319) - KAFKA-19042 Move ProducerCompressionTest, ProducerFailureHandlingTest, and ProducerIdExpirationTest to client-integration-tests module (#19319) * [PR-19391](https://github.com/apache/kafka/pull/19391) - KAFKA-14523: Decouple RemoteLogManager and Partition (#19391) * [PR-19469](https://github.com/apache/kafka/pull/19469) - KAFKA-18172 Move RemoteIndexCacheTest to the storage module (#19469) * [PR-19426](https://github.com/apache/kafka/pull/19426) - KAFKA-19119 Move ApiVersionManager/SimpleApiVersionManager to server (#19426) * [PR-19439](https://github.com/apache/kafka/pull/19439) - KAFKA-19121 Move AddPartitionsToTxnConfig and TransactionStateManagerConfig out of KafkaConfig (#19439) * [PR-19424](https://github.com/apache/kafka/pull/19424) - KAFKA-19113: Migrate DelegationTokenManager to server module (#19424) * [PR-19347](https://github.com/apache/kafka/pull/19347) - KAFKA-19027 Replace ConsumerGroupCommandTestUtils#generator by ClusterTestDefaults (#19347) * [PR-19431](https://github.com/apache/kafka/pull/19431) - KAFKA-19115: Utilize initialized topics info to verify delete share group offsets (#19431) * [PR-19419](https://github.com/apache/kafka/pull/19419) - KAFKA-15371 MetadataShell is stuck when bootstrapping (#19419) * [PR-19345](https://github.com/apache/kafka/pull/19345) - KAFKA-19071: Fix doc for remote.storage.enable (#19345) * [PR-19374](https://github.com/apache/kafka/pull/19374) - KAFKA-19030 Remove metricNamePrefix from RequestChannel (#19374) * [PR-19387](https://github.com/apache/kafka/pull/19387) - KAFKA-14485: Move LogCleaner to storage module (#19387) * [PR-19293](https://github.com/apache/kafka/pull/19293) - KAFKA-16894: Define feature to enable share groups (#19293) * [PR-19436](https://github.com/apache/kafka/pull/19436) - KAFKA-19127: Integration test for altering and describing streams group configs (#19436) * 
[PR-19441](https://github.com/apache/kafka/pull/19441) - KAFKA-19103 Remove OffsetConfig (#19441) * [PR-19438](https://github.com/apache/kafka/pull/19438) - KAFKA-19118: Enable KIP-1071 in StandbyTaskCreationIntegrationTest (#19438) * [PR-19423](https://github.com/apache/kafka/pull/19423) - KAFKA-18286: Implement support for streams groups in kafka-groups.sh (#19423) * [PR-19289](https://github.com/apache/kafka/pull/19289) - KAFKA-19042 Move TransactionsWithMaxInFlightOneTest to client-integration-tests module (#19289) * [PR-19410](https://github.com/apache/kafka/pull/19410) - KAFKA-19101 Remove ControllerMutationQuotaManager#throttleTimeMs unused parameter (#19410) * [PR-19354](https://github.com/apache/kafka/pull/19354) - KAFKA-18782: Extend ApplicationRecoverableException related exceptions (#19354) * [PR-19363](https://github.com/apache/kafka/pull/19363) - KAFKA-18629: Utilize share group partition metadata for delete group. (#19363) * [PR-19421](https://github.com/apache/kafka/pull/19421) - KAFKA-19124: Use consumer background event queue for Streams events (#19421) * [PR-19425](https://github.com/apache/kafka/pull/19425) - KAFKA-19118: Enable KIP-1071 in InternalTopicIntegrationTest (#19425) * [PR-19432](https://github.com/apache/kafka/pull/19432) - KAFKA-18170: Add create and write timestamp fields in share snapshot [1/N] (#19432) * [PR-19167](https://github.com/apache/kafka/pull/19167) - KAFKA-18935: Ensure brokers do not return null records in FetchResponse (#19167) * [PR-19261](https://github.com/apache/kafka/pull/19261) - KAFKA-16729: Support isolation level for share consumer (#19261) * [PR-19188](https://github.com/apache/kafka/pull/19188) - KAFKA-18962: Fix onBatchRestored call in GlobalStateManagerImpl (#19188) * [83f6a1d7](https://github.com/apache/kafka/commit/83f6a1d7e6dfce4a78e1192a8fecf523b39ddaab) - KAFKA-18991; Missing change for cherry-pick * [PR-19223](https://github.com/apache/kafka/pull/19223) - KAFKA-18991: FetcherThread should match leader epochs between fetch request and fetch state (#19223) * [PR-19422](https://github.com/apache/kafka/pull/19422) - KAFKA-18287: Add support for kafka-streams-groups.sh --list (#19422) * [PR-18852](https://github.com/apache/kafka/pull/18852) - KAFKA-18723; Better handle invalid records during replication (#18852) * [PR-19377](https://github.com/apache/kafka/pull/19377) - KAFKA-19037: Integrate consumer-side code with Streams (#19377) * [PR-1611](https://github.com/confluentinc/kafka/pull/1611) - Fix build failure (#1582) * [PR-19390](https://github.com/apache/kafka/pull/19390) - KAFKA-19090: Move DelayedFuture and DelayedFuturePurgatory to server module (#19390) * [PR-19213](https://github.com/apache/kafka/pull/19213) - KAFKA-18984: Reset interval.ms By Using kafka-client-metrics.sh (#19213) * [PR-18976](https://github.com/apache/kafka/pull/18976) - KAFKA-16718-2/n: KafkaAdminClient and GroupCoordinator implementation for DeleteShareGroupOffsets RPC (#18976) * [PR-19384](https://github.com/apache/kafka/pull/19384) - KAFKA-19093 Change the “Handler on Broker” to “Handler on Controller” for controller server (#19384) * [PR-19296](https://github.com/apache/kafka/pull/19296) - KAFKA-19047: Allow quickly re-registering brokers that are in controlled shutdown (#19296) * [PR-19413](https://github.com/apache/kafka/pull/19413) - KAFKA-19099 Remove GroupSyncKey, GroupJoinKey, and MemberKey (#19413) * [PR-19068](https://github.com/apache/kafka/pull/19068) - KAFKA-18892: Add KIP-877 support for ClientQuotaCallback (#19068) * 
[PR-19406](https://github.com/apache/kafka/pull/19406) - KAFKA-19100: Use ProcessRole instead of String in AclApis (#19406) * [PR-19398](https://github.com/apache/kafka/pull/19398) - KAFKA-19098 Remove lastOffset from PartitionResponse (#19398) * [PR-19359](https://github.com/apache/kafka/pull/19359) - KAFKA-19077: Propagate shutdownRequested field (#19359) * [PR-19219](https://github.com/apache/kafka/pull/19219) - KAFKA-19001: Use streams group-level configurations in heartbeat (#19219) * [PR-19369](https://github.com/apache/kafka/pull/19369) - KAFKA-19084: Port KAFKA-16224, KAFKA-16764 for ShareConsumers (#19369) * [PR-19392](https://github.com/apache/kafka/pull/19392) - KAFKA-19076 replace String by Supplier for UnifiedLog#maybeHandleIOException (#19392) * [PR-17614](https://github.com/apache/kafka/pull/17614) - KAFKA-16758: Extend Consumer#close with an option to leave the group or not (#17614) * [PR-19303](https://github.com/apache/kafka/pull/19303) - KAFKA-16407: Fix foreign key INNER join on change of FK from/to a null value (#19303) * [PR-19242](https://github.com/apache/kafka/pull/19242) - KAFKA-19013 Reformat PR body to 72 characters (#19242) * [PR-19288](https://github.com/apache/kafka/pull/19288) - KAFKA-19042 Move TransactionsExpirationTest to client-integration-tests module (#19288) * [PR-19357](https://github.com/apache/kafka/pull/19357) - KAFKA-19074 Remove the cached responseData from ShareFetchResponse (#19357) * [PR-19285](https://github.com/apache/kafka/pull/19285) - KAFKA-14523: Move DelayedRemoteListOffsets to the storage module (#19285) * [PR-19323](https://github.com/apache/kafka/pull/19323) - KAFKA-13747: refactor TopologyTest to test different store type parametrized (#19323) * [PR-19370](https://github.com/apache/kafka/pull/19370) - KAFKA-19085: SharePartitionManagerTest testMultipleConcurrentShareFetches throws silent exception and works incorrectly (#19370) * [PR-19218](https://github.com/apache/kafka/pull/19218) - KAFKA-7952: use in memory stores for KTable test (#19218) * [PR-19328](https://github.com/apache/kafka/pull/19328) - KAFKA-18761: [2/N] List share group offsets with state and auth (#19328) * [PR-19005](https://github.com/apache/kafka/pull/19005) - KAFKA-18713: Fix FK Left-Join result race condition (#19005) * [PR-19269](https://github.com/apache/kafka/pull/19269) - KAFKA-18067: Add a flag to disable producer reset during active task creator shutting down (#19269) * [PR-19320](https://github.com/apache/kafka/pull/19320) - KAFKA-19055 Cleanup the 0.10.x information from clients module (#19320) * [PR-19348](https://github.com/apache/kafka/pull/19348) - KAFKA-19075: Included other share group dynamic configs in extractShareGroupConfigMap method in ShareGroupConfig (#19348) * [PR-19333](https://github.com/apache/kafka/pull/19333) - KAFKA-19064: Handle exceptions from deferred events in coordinator (#19333) * [PR-19339](https://github.com/apache/kafka/pull/19339) - KAFKA-18827: Incorporate initializing topics in share group heartbeat [4/N] (#19339) * [PR-19111](https://github.com/apache/kafka/pull/19111) - KAFKA-18923: resource leak in RSM fetchIndex inputStream (#19111) * [PR-19317](https://github.com/apache/kafka/pull/19317) - KAFKA-18949 add consumer protocol to testDeleteRecordsAfterCorruptRecords (#19317) * [PR-19324](https://github.com/apache/kafka/pull/19324) - KAFKA-19058 Running the streams/streams-scala module tests produces a streams-scala.log (#19324) * [PR-19276](https://github.com/apache/kafka/pull/19276) - KAFKA-19003: Add 
forceTerminateTransaction command to CLI tools (#19276) * [PR-19226](https://github.com/apache/kafka/pull/19226) - KAFKA-19004 Move DelayedDeleteRecords to server-common module (#19226) * [PR-18953](https://github.com/apache/kafka/pull/18953) - KAFKA-18826: Add global thread metrics (#18953) * [PR-19343](https://github.com/apache/kafka/pull/19343) - KAFKA-19016: Updated the retention behaviour of share groups to retain them forever (#19343) * [PR-19344](https://github.com/apache/kafka/pull/19344) - KAFKA-19072: Add system test for ELR (#19344) * [PR-19331](https://github.com/apache/kafka/pull/19331) - KAFKA-15931: Cancel RemoteLogReader gracefully (#19331) * [PR-19338](https://github.com/apache/kafka/pull/19338) - KAFKA-18796-2: Corrected the check for acquisition lock timeout in Sh… (#19338) * [PR-19335](https://github.com/apache/kafka/pull/19335) - KAFKA-19062: Port changes from KAFKA-18645 to share-consumers (#19335) * [PR-19334](https://github.com/apache/kafka/pull/19334) - KAFKA-19018,KAFKA-19063: Implement maxRecords and acquisition lock timeout in share fetch request and response resp. (#19334) * [PR-18383](https://github.com/apache/kafka/pull/18383) - KAFKA-18613: Unit tests for usage of incorrect RPCs (#18383) * [PR-19189](https://github.com/apache/kafka/pull/19189) - KAFKA-18613: Improve test coverage for missing topics (#19189) * [PR-18510](https://github.com/apache/kafka/pull/18510) - KAFKA-18409: ShareGroupStateMessageFormatter should use CoordinatorRecordMessageFormatter (#18510) * [PR-19274](https://github.com/apache/kafka/pull/19274) - KAFKA-18959 increase the num_workers from 9 to 14 (#19274) * [PR-19283](https://github.com/apache/kafka/pull/19283) - KAFKA-19042 Move ConsumerTopicCreationTest to client-integration-tests module (#19283) * [PR-18297](https://github.com/apache/kafka/pull/18297) - KAFKA-16260: Deprecate window.size.ms and window.inner.class.serde in StreamsConfig (#18297) * [PR-19114](https://github.com/apache/kafka/pull/19114) - KAFKA-18613: Add StreamsGroupHeartbeat handler in the group coordinator (#19114) * [PR-19270](https://github.com/apache/kafka/pull/19270) - KAFKA-19032 Remove TestInfoUtils.TestWithParameterizedQuorumAndGroupProtocolNames (#19270) * [PR-19268](https://github.com/apache/kafka/pull/19268) - KAFKA-19005 improve the documentation of DescribeTopicsOptions#partitionSizeLimitPerResponse (#19268) * [PR-19282](https://github.com/apache/kafka/pull/19282) - KAFKA-19036 Rewrite LogAppendTimeTest and move it to storage module (#19282) * [PR-19299](https://github.com/apache/kafka/pull/19299) - KAFKA-19049 Remove the @ExtendWith(ClusterTestExtensions.class) from code base (#19299) * [PR-19076](https://github.com/apache/kafka/pull/19076) - KAFKA-17830 Cover unit tests for TBRLMM init failure scenarios (#19076) * [PR-19216](https://github.com/apache/kafka/pull/19216) - KAFKA-14486 Move LogCleanerManager to storage module (#19216) * [PR-19026](https://github.com/apache/kafka/pull/19026) - KAFKA-18827: Initialize share group state group coordinator impl. [3/N] (#19026) * [PR-18695](https://github.com/apache/kafka/pull/18695) - KAFKA-18616; Refactor Tools’s ApiMessageFormatter (#18695) * [PR-19192](https://github.com/apache/kafka/pull/19192) - KAFKA-18899: Improve handling of timeouts for commitAsync() in ShareConsumer. 
(#19192) * [PR-19154](https://github.com/apache/kafka/pull/19154) - KAFKA-18914 Migrate ConsumerRebootstrapTest to use new test infra (#19154) * [PR-19233](https://github.com/apache/kafka/pull/19233) - KAFKA-18736: Add pollOnClose() and maximumTimeToWait() (#19233) * [PR-19230](https://github.com/apache/kafka/pull/19230) - KAFKA-18736: Handle errors in the Streams group heartbeat request manager (#19230) * [PR-18711](https://github.com/apache/kafka/pull/18711) - KAFKA-18576 Convert ConfigType to Enum (#18711) * [PR-19247](https://github.com/apache/kafka/pull/19247) - KAFKA-18796: Added more information to error message when assertion fails for acquisition lock timeout (#19247) * [PR-19046](https://github.com/apache/kafka/pull/19046) - KAFKA-18276 Migrate ProducerRebootstrapTest to new test infra (#19046) * [PR-19207](https://github.com/apache/kafka/pull/19207) - KAFKA-18980 OffsetMetadataManager#cleanupExpiredOffsets should record the number of records rather than topic partitions (#19207) * [PR-19227](https://github.com/apache/kafka/pull/19227) - KAFKA-18999 Remove BrokerMetadata (#19227) * [PR-19256](https://github.com/apache/kafka/pull/19256) - KAFKA-17806 remove this-escape suppress warnings in AclCommand (#19256) * [PR-19255](https://github.com/apache/kafka/pull/19255) - KAFKA-18329; [3/3] Delete old group coordinator (KIP-848) (#19255) * [PR-19064](https://github.com/apache/kafka/pull/19064) - KAFKA-18893: Add KIP-877 support to ReplicaSelector (#19064) * [PR-19251](https://github.com/apache/kafka/pull/19251) - KAFKA-18329; [2/3] Delete old group coordinator (KIP-848) (#19251) * [PR-19254](https://github.com/apache/kafka/pull/19254) - KAFKA-19017: Changed consumer.config to command-config in verifiable_share_consumer.py (#19254) * [PR-19246](https://github.com/apache/kafka/pull/19246) - KAFKA-15599 Move MetadataLogConfig to raft module (#19246) * [PR-19180](https://github.com/apache/kafka/pull/19180) - KAFKA-18954: Add ELR election rate metric (#19180) * [PR-19197](https://github.com/apache/kafka/pull/19197) - KAFKA-15931: Cancel RemoteLogReader gracefully (#19197) * [PR-19243](https://github.com/apache/kafka/pull/19243) - KAFKA-18329; [1/3] Delete old group coordinator (KIP-848) (#19243) * [PR-19174](https://github.com/apache/kafka/pull/19174) - KAFKA-18946 Move BrokerReconfigurable and DynamicProducerStateManagerConfig to server module (#19174) * [PR-18842](https://github.com/apache/kafka/pull/18842) - KAFKA-806 Index may not always observe log.index.interval.bytes (#18842) * [PR-19214](https://github.com/apache/kafka/pull/19214) - KAFKA-18989 Optimize FileRecord#searchForOffsetWithSize (#19214) * [PR-19183](https://github.com/apache/kafka/pull/19183) - KAFKA-18819 StreamsGroupHeartbeat API and StreamsGroupDescribe API check topic describe (#19183) * [PR-19217](https://github.com/apache/kafka/pull/19217) - KAFKA-18975 Move clients-integration-test out of core module (#19217) * [PR-19193](https://github.com/apache/kafka/pull/19193) - KAFKA-18953: [1/N] Add broker side handling for 2 PC (KIP-939) (#19193) * [PR-18949](https://github.com/apache/kafka/pull/18949) - KAFKA-17431: Support invalid static configs for KRaft so long as dynamic configs are valid (#18949) * [PR-19202](https://github.com/apache/kafka/pull/19202) - KAFKA-18969 Rewrite ShareConsumerTest#setup and move to clients-integration-tests module (#19202) * [PR-19165](https://github.com/apache/kafka/pull/19165) - KAFKA-18955: Fix infinite loop and standardize options in MetadataSchemaCheckerTool (#19165) * 
[PR-18966](https://github.com/apache/kafka/pull/18966) - KAFKA-18808 add test to ensure the name= is not equal to default quota (#18966) * [PR-18463](https://github.com/apache/kafka/pull/18463) - KAFKA-17171 Add test cases for STATIC_BROKER_CONFIG in kraft mode (#18463) * [PR-18801](https://github.com/apache/kafka/pull/18801) - KAFKA-17565 Move MetadataCache interface to metadata module (#18801) * [PR-19181](https://github.com/apache/kafka/pull/19181) - KAFKA-18736: Do not send fields if not needed (#19181) * [PR-19215](https://github.com/apache/kafka/pull/19215) - KAFKA-18990 Avoid redundant MetricName creation in BaseQuotaTest#produceUntilThrottled (#19215) * [PR-19027](https://github.com/apache/kafka/pull/19027) - KAFKA-18859 honor the error message of UnregisterBrokerResponse (#19027) * [PR-19212](https://github.com/apache/kafka/pull/19212) - KAFKA-18993 Remove confusing notable change section from upgrade.html (#19212) * [PR-19129](https://github.com/apache/kafka/pull/19129) - KAFKA-18703 Remove unused class PayloadKeyType (#19129) * [PR-19187](https://github.com/apache/kafka/pull/19187) - KAFKA-18915 Rewrite AdminClientRebootstrapTest to cover the current scenario (#19187) * [PR-19147](https://github.com/apache/kafka/pull/19147) - KAFKA-18924 Running the storage module tests produces a storage/storage.log file (#19147) * [PR-19136](https://github.com/apache/kafka/pull/19136) - KAFKA-18781: Extend RefreshRetriableException related exceptions (#19136) * [PR-18994](https://github.com/apache/kafka/pull/18994) - KAFKA-18843: MirrorMaker2 unique workerId (#18994) * [PR-17264](https://github.com/apache/kafka/pull/17264) - KAFKA-17516 Synonyms for client metrics configs (#17264) * [PR-19134](https://github.com/apache/kafka/pull/19134) - KAFKA-18927 Remove LATEST_0_11, LATEST_1_0, LATEST_1_1, LATEST_2_0 (#19134) * [PR-19164](https://github.com/apache/kafka/pull/19164) - KAFKA-18943: Kafka Streams incorrectly commits TX during task revocation (#19164) * [PR-19205](https://github.com/apache/kafka/pull/19205) - KAFKA-18979; Report correct kraft.version in ApiVersions (#19205) * [PR-19176](https://github.com/apache/kafka/pull/19176) - KAFKA-18651: Add Streams-specific broker configurations (#19176) * [PR-19040](https://github.com/apache/kafka/pull/19040) - KAFKA-18858 Refactor FeatureControlManager to avoid using uninitialized MV (#19040) * [PR-19030](https://github.com/apache/kafka/pull/19030) - KAFKA-14484: Move UnifiedLog to storage module (#19030) * [PR-18662](https://github.com/apache/kafka/pull/18662) - KAFKA-18617 Allow use of ClusterInstance inside BeforeEach (#18662) * [PR-18018](https://github.com/apache/kafka/pull/18018) - KAFKA-18142 Switch to com.gradleup.shadow (#18018) * [PR-19169](https://github.com/apache/kafka/pull/19169) - KAFKA-18947 Remove unused raftManager in metadataShell (#19169) * [PR-18998](https://github.com/apache/kafka/pull/18998) - KAFKA-18837: Ensure controller quorum timeouts and backoffs are at least 0 (#18998) * [PR-19119](https://github.com/apache/kafka/pull/19119) - KAFKA-18422 Adjust Kafka client upgrade path section (#19119) * [PR-19168](https://github.com/apache/kafka/pull/19168) - KAFKA-18942: Add reviewers to PR body with committer-tools (#19168) * [PR-19148](https://github.com/apache/kafka/pull/19148) - KAFKA-18932: Removed usage of partition max bytes from share fetch requests (#19148) * [PR-19145](https://github.com/apache/kafka/pull/19145) - KAFKA-18936: Fix share fetch when records are larger than max bytes (#19145) * 
[PR-18091](https://github.com/apache/kafka/pull/18091) - KAFKA-18074: Add kafka client compatibility matrix (#18091) * [PR-18258](https://github.com/apache/kafka/pull/18258) - KAFKA-18195: Fix Kafka Streams broker compatibility matrix (#18258) * [PR-19171](https://github.com/apache/kafka/pull/19171) - KAFKA-17808: Fix id typo for connector-dlq-adminclient (#19171) * [PR-19144](https://github.com/apache/kafka/pull/19144) - KAFKA-18933 Add client integration tests module (#19144) * [PR-19155](https://github.com/apache/kafka/pull/19155) - KAFKA-18925: Add streams groups support to Admin.listGroups (#19155) * [PR-19142](https://github.com/apache/kafka/pull/19142) - KAFKA-18901: [1/N] Improved homogeneous SimpleAssignor (#19142) * [PR-19162](https://github.com/apache/kafka/pull/19162) - KAFKA-18941: Do not test 3.3 in upgrade_tests.py (#19162) * [PR-19121](https://github.com/apache/kafka/pull/19121) - KAFKA-18736: Decide when a heartbeat should be sent (#19121) * [PR-19173](https://github.com/apache/kafka/pull/19173) - KAFKA-18931: added a share group session timeout task when group coordinator is loaded (#19173) * [PR-19099](https://github.com/apache/kafka/pull/19099) - KAFKA-18637: Fix max connections per ip and override reconfigurations (#19099) * [PR-17767](https://github.com/apache/kafka/pull/17767) - KAFKA-17856 Move ConfigCommandTest and ConfigCommandIntegrationTest to tool module (#17767) * [PR-18802](https://github.com/apache/kafka/pull/18802) - KAFKA-18706 Move AclPublisher to metadata module (#18802) * [PR-19166](https://github.com/apache/kafka/pull/19166) - KAFKA-18944 Remove unused setters from ClusterConfig (#19166) * [PR-19081](https://github.com/apache/kafka/pull/19081) - KAFKA-18909 Move DynamicThreadPool to server module (#19081) * [PR-19062](https://github.com/apache/kafka/pull/19062) - KAFKA-18700 Migrate SnapshotPath, Entry, OffsetAndEpoch, LogFetchInfo, and LogAppendInfo to record classes (#19062) * [PR-19156](https://github.com/apache/kafka/pull/19156) - KAFKA-18940: fix electionWasClean (#19156) * [PR-19127](https://github.com/apache/kafka/pull/19127) - KAFKA-18920: The kcontrollers must set kraft.version in ApiVersionsResponse (#19127) * [PR-19116](https://github.com/apache/kafka/pull/19116) - KAFKA-18285: Add describeStreamsGroup to Admin API (#19116) * [PR-18684](https://github.com/apache/kafka/pull/18684) - KAFKA-18461 Add Objects.requireNotNull to Snapshot (#18684) * [PR-18299](https://github.com/apache/kafka/pull/18299) - KAFKA-17607: Add CI step to verify LICENSE-binary (#18299) * [PR-19137](https://github.com/apache/kafka/pull/19137) - KAFKA-18929: Log a warning when time based segment delete is blocked by a future timestamp (#19137) * [PR-15241](https://github.com/apache/kafka/pull/15241) - KAFKA-15931: Reopen TransactionIndex if channel is closed (#15241) * [PR-19138](https://github.com/apache/kafka/pull/19138) - KAFKA-18046; High CPU usage when using Log4j2 (#19138) * [PR-19094](https://github.com/apache/kafka/pull/19094) - KAFKA-18915: Migrate AdminClientRebootstrapTest to use new test infra (#19094) * [PR-19113](https://github.com/apache/kafka/pull/19113) - KAFKA-18900: Experimental share consumer acknowledge mode config (#19113) * [PR-19131](https://github.com/apache/kafka/pull/19131) - KAFKA-18648: Make records in FetchResponse nullable again (#19131) * [PR-19120](https://github.com/apache/kafka/pull/19120) - KAFKA-18887: Implement Streams Admin APIs (#19120) * [PR-19130](https://github.com/apache/kafka/pull/19130) - KAFKA-18811: Added command configs to 
admin client as well in VerifiableShareConsumer (#19130) * [PR-19112](https://github.com/apache/kafka/pull/19112) - KAFKA-18910 Remove kafka.utils.json (#19112) * [4a500418](https://github.com/apache/kafka/commit/4a500418c63a063198c5f6ce256bfef9ffd74e3a) - Revert “KAFKA-18246 Fix ConnectRestApiTest.test_rest_api by adding multiversioning configs (#18191)” * [d86cb597](https://github.com/apache/kafka/commit/d86cb597902d32ce83f27d65b60df6700cb7a61d) - Revert “KAFKA-18887: Implement Streams Admin APIs (#19049)” * [PR-19049](https://github.com/apache/kafka/pull/19049) - KAFKA-18887: Implement Streams Admin APIs (#19049) * [PR-19104](https://github.com/apache/kafka/pull/19104) - KAFKA-18919 Clarify that KafkaPrincipalBuilder classes must also implement KafkaPrincipalSerde (#19104) * [PR-19054](https://github.com/apache/kafka/pull/19054) - KAFKA-18882 Remove BaseKey, TxnKey, and UnknownKey (#19054) * [PR-19083](https://github.com/apache/kafka/pull/19083) - KAFKA-18817: ShareGroupHeartbeat and ShareGroupDescribe API must check topic describe (#19083) * [PR-18983](https://github.com/apache/kafka/pull/18983) - KAFKA-14121: AlterPartitionReassignments API should allow callers to specify the option of preserving the replication factor (#18983) * [PR-18918](https://github.com/apache/kafka/pull/18918) - KAFKA-18804 Remove slf4j warning when using tool script (#18918) * [PR-9766](https://github.com/apache/kafka/pull/9766) - KAFKA-10864 Convert end txn marker schema to use auto-generated protocol (#9766) * [PR-19087](https://github.com/apache/kafka/pull/19087) - KAFKA-18886 add behavior change of CreateTopicPolicy and AlterConfigPolicy to zk2kraft (#19087) * [PR-19097](https://github.com/apache/kafka/pull/19097) - KAFKA-18422 add link of KIP-1124 to “rolling upgrade” section (#19097) * [PR-19089](https://github.com/apache/kafka/pull/19089) - KAFKA-18917: TransformValues throws NPE (#19089) * [PR-19065](https://github.com/apache/kafka/pull/19065) - KAFKA-18876 4.0 documentation improvement (#19065) * [PR-19086](https://github.com/apache/kafka/pull/19086) - Fix typos in multiple files (#19086) * [PR-19091](https://github.com/apache/kafka/pull/19091) - KAFKA-18918: Correcting releasing of locks on exception (#19091) * [PR-19088](https://github.com/apache/kafka/pull/19088) - KAFKA-18916; Resolved regular expressions must update the group by topics data structure (#19088) * [PR-19075](https://github.com/apache/kafka/pull/19075) - KAFKA-18867 add tests to describe topic configs with empty name (#19075) * [PR-18449](https://github.com/apache/kafka/pull/18449) - KAFKA-18500 Build PRs at HEAD commit (#18449) * [PR-19059](https://github.com/apache/kafka/pull/19059) - KAFKA-18878: Added share session cache and delayed share fetch metrics (KIP-1103) (#19059) * [PR-18997](https://github.com/apache/kafka/pull/18997) - KAFKA-18844: Stale features information in QuorumController#registerBroker (#18997) * [PR-19036](https://github.com/apache/kafka/pull/19036) - KAFKA-18864:remove the Evolving tag from stable public interfaces (#19036) * [PR-19055](https://github.com/apache/kafka/pull/19055) - KAFKA-18817:[1/N] ShareGroupHeartbeat and ShareGroupDescribe API must check topic describe (#19055) * [PR-18981](https://github.com/apache/kafka/pull/18981) - KAFKA-18613: Auto-creation of internal topics in streams group heartbeat (#18981) * [PR-19056](https://github.com/apache/kafka/pull/19056) - KAFKA-18881 Document the ConsumerRecord as non-thread safe (#19056) * [PR-18752](https://github.com/apache/kafka/pull/18752) - 
KAFKA-18168: Adding checkpointing for GlobalKTable during restoration and closing (#18752) * [PR-19070](https://github.com/apache/kafka/pull/19070) - KAFKA-18907 Add suitable error message when the appended value is too large (#19070) * [PR-19067](https://github.com/apache/kafka/pull/19067) - KAFKA-18908 Document that the size of appended value can’t be larger than Short.MAX_VALUE (#19067) * [PR-19047](https://github.com/apache/kafka/pull/19047) - KAFKA-18880 Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException (#19047) * [PR-19063](https://github.com/apache/kafka/pull/19063) - KAFKA-17039 KIP-919 supports for unregisterBroker (#19063) * [PR-17771](https://github.com/apache/kafka/pull/17771) - KAFKA-17981 add Integration test for ConfigCommand to add config key=[val1,val2] (#17771) * [PR-19045](https://github.com/apache/kafka/pull/19045) - KAFKA-18734: Implemented share partition metrics (KIP-1103) (#19045) * [PR-19048](https://github.com/apache/kafka/pull/19048) - KAFKA-18860 Remove Missing Features section (#19048) * [PR-18349](https://github.com/apache/kafka/pull/18349) - KAFKA-18371 TopicBasedRemoteLogMetadataManagerConfig exposes sensitive configuration data in logs (#18349) * [PR-19020](https://github.com/apache/kafka/pull/19020) - KAFKA-18780: Extend RetriableException related exceptions (#19020) * [PR-19037](https://github.com/apache/kafka/pull/19037) - KAFKA-18869 add remote storage threads to “Updating Thread Configs” section (#19037) * [PR-17743](https://github.com/apache/kafka/pull/17743) - KAFKA-18863: Connect Multiversion Support (Versioned Connector Creation and related changes) (#17743) * [PR-19042](https://github.com/apache/kafka/pull/19042) - KAFKA-18813: ConsumerGroupHeartbeat API and ConsumerGroupDescribe API… (#19042) * [PR-18989](https://github.com/apache/kafka/pull/18989) - KAFKA-18813: ConsumerGroupHeartbeat API and ConsumerGroupDescribe API must check topic describe (#18989) * [PR-18979](https://github.com/apache/kafka/pull/18979) - KAFKA-18614, KAFKA-18613: Add streams group request plumbing (#18979) * [PR-18864](https://github.com/apache/kafka/pull/18864) - KAFKA-18757: Create full-function SimpleAssignor to match KIP-932 description (#18864) * [PR-18988](https://github.com/apache/kafka/pull/18988) - KAFKA-18839: Drop EAGER rebalancing support in Kafka Streams (#18988) * [PR-18985](https://github.com/apache/kafka/pull/18985) - KAFKA-18792 Add workflow to check PR format (#18985) * [PR-19010](https://github.com/apache/kafka/pull/19010) - KAFKA-17351: Improved handling of compacted topics in share partition (2/N) (#19010) * [PR-19021](https://github.com/apache/kafka/pull/19021) - KAFKA-17836 Move RackAwareTest to server module (#19021) * [PR-18803](https://github.com/apache/kafka/pull/18803) - KAFKA-18712 Move Endpoint to server module (#18803) * [PR-18387](https://github.com/apache/kafka/pull/18387) - KAFKA-18281: Kafka is improperly validating non-advertised listeners for routable controller addresses (#18387) * [PR-18900](https://github.com/apache/kafka/pull/18900) - KAFKA-17937 Cleanup AbstractFetcherThreadTest (#18900) * [PR-18898](https://github.com/apache/kafka/pull/18898) - KIP-966 part 1 release doc (#18898) * [PR-18770](https://github.com/apache/kafka/pull/18770) - KAFKA-18748 Run new tests separately in PRs (#18770) * [PR-18804](https://github.com/apache/kafka/pull/18804) - KAFKA-18522: Slice records for share fetch (#18804) * [PR-18233](https://github.com/apache/kafka/pull/18233) - KAFKA-18023: Enforcing Explicit Naming for Kafka Streams 
Internal Topics (#18233) * [PR-18939](https://github.com/apache/kafka/pull/18939) - KAFKA-18779: Validate responses from broker in client for ShareFetch and ShareAcknowledge RPCs. (#18939) * [PR-18992](https://github.com/apache/kafka/pull/18992) - KAFKA-18827: Initialize share group state persister impl [2/N]. (#18992) * [PR-18880](https://github.com/apache/kafka/pull/18880) - KAFKA-15583 doc update for the “strict min ISR” rule (#18880) * [PR-18928](https://github.com/apache/kafka/pull/18928) - KAFKA-18629: ShareGroupDeleteState admin client impl. (#18928) * [PR-18978](https://github.com/apache/kafka/pull/18978) - KAFKA-17351: Update tests and acquire API to allow discard batches from compacted topics (1/N) (#18978) * [PR-18968](https://github.com/apache/kafka/pull/18968) - KAFKA-18827: Initialize share state, share coordinator impl. [1/N] (#18968) * [PR-19000](https://github.com/apache/kafka/pull/19000) - Revert “KAFKA-16803: Change fork, update ShadowJavaPlugin to 8.1.7 (#16295)” (#19000) * [PR-18897](https://github.com/apache/kafka/pull/18897) - KAFKA-18795 Remove Records#downConvert (#18897) * [PR-18996](https://github.com/apache/kafka/pull/18996) - KAFKA-18813: [3/N] Client support for TopicAuthException in DescribeConsumerGroup path (#18996) * [PR-18959](https://github.com/apache/kafka/pull/18959) - KAFKA-18733: Implemented fetch ratio and partition acquire time metrics (3/N) (#18959) * [PR-18986](https://github.com/apache/kafka/pull/18986) - KAFKA-18813: [2/N] Client support for TopicAuthException in HB path (#18986) * [PR-18844](https://github.com/apache/kafka/pull/18844) - KAFKA-18737 KafkaDockerWrapper setup functions fails due to storage format command (#18844) * [PR-18848](https://github.com/apache/kafka/pull/18848) - KAFKA-18629: Delete share group state RPC group coordinator impl. [3/N] (#18848) * [PR-18982](https://github.com/apache/kafka/pull/18982) - KAFKA-18829: Added check before converting to IMPLICIT mode (#18964) (Cherry-pick) (#18982) * [PR-18969](https://github.com/apache/kafka/pull/18969) - KAFKA-18831 Migrating to log4j2 introduce behavior changes of adjusting level dynamically (#18969) * [PR-18737](https://github.com/apache/kafka/pull/18737) - KAFKA-18641: AsyncKafkaConsumer could lose records with auto offset commit (#18737) * [PR-18962](https://github.com/apache/kafka/pull/18962) - KAFKA-18828: Update share group metrics per new init and call mechanism. (#18962) * [PR-18891](https://github.com/apache/kafka/pull/18891) - KAFKA-16918 TestUtils#assertFutureThrows should use future.get with timeout (#18891) * [PR-18965](https://github.com/apache/kafka/pull/18965) - MINOR: Remove redundant quorum parameter from `*AdminIntegrationTest` classes (#18965) * [PR-18967](https://github.com/apache/kafka/pull/18967) - KAFKA-18791 Set default commit to PR title and description [2/n] (#18967) * [PR-18964](https://github.com/apache/kafka/pull/18964) - KAFKA-18829: Added check before converting to IMPLICIT mode (#18964) * [PR-18955](https://github.com/apache/kafka/pull/18955) - KAFKA-18791 Enable new asf.yaml parser [1/n] (#18955) * [PR-18845](https://github.com/apache/kafka/pull/18845) - KAFKA-18601: Assume a baseline of 3.3 for server protocol versions (#18845) * [PR-18944](https://github.com/apache/kafka/pull/18944) - KAFKA-18198: Added check to prevent acknowledgements on initial ShareFetchRequest. 
(#18944) * [PR-18946](https://github.com/apache/kafka/pull/18946) - KAFKA-18799 Remove AdminUtils (#18946) * [PR-18757](https://github.com/apache/kafka/pull/18757) - KAFKA-18667 Add replication system test case for combined broker + controller failure (#18757) * [PR-18872](https://github.com/apache/kafka/pull/18872) - KAFKA-18773 Migrate the log4j1 config to log4j 2 for native image and README (#18872) * [PR-18004](https://github.com/apache/kafka/pull/18004) - KAFKA-18089: Upgrade Caffeine lib to 3.1.8 (#18004) * [PR-18850](https://github.com/apache/kafka/pull/18850) - KAFKA-18767: Add client side config check for shareConsumer (#18850) * [PR-18460](https://github.com/apache/kafka/pull/18460) - KAFKA-14484: Decouple UnifiedLog and RemoteLogManager (#18460) * [PR-18927](https://github.com/apache/kafka/pull/18927) - KAFKA-16718 [1/n]: Added DeleteShareGroupOffsets request and response schema (#18927) * [PR-18870](https://github.com/apache/kafka/pull/18870) - KAFKA-18736: Add Streams group heartbeat request manager (1/N) (#18870) * [PR-18914](https://github.com/apache/kafka/pull/18914) - KAFKA-18798 The replica placement policy used by ReassignPartitionsCommand is not aligned with kraft controller (#18914) * [PR-18888](https://github.com/apache/kafka/pull/18888) - KAFKA-18787: RemoteIndexCache fails to delete invalid files on init (#18888) * [PR-18934](https://github.com/apache/kafka/pull/18934) - KAFKA-18807; Fix thread idle ratio metric (#18934) * [PR-18871](https://github.com/apache/kafka/pull/18871) - KAFKA-18684: Add base exception classes (#18871) * [PR-18924](https://github.com/apache/kafka/pull/18924) - KAFKA-18733: Updating share group record acks metric (2/N) (#18924) * [PR-18907](https://github.com/apache/kafka/pull/18907) - KAFKA-18801 Remove ClusterGenerator and revise ClusterTemplate javadoc (#18907) * [PR-18809](https://github.com/apache/kafka/pull/18809) - KAFKA-18730: Add replaying streams group state from offset topic (#18809) * [PR-18889](https://github.com/apache/kafka/pull/18889) - KAFKA-18784 Fix ConsumerWithLegacyMessageFormatIntegrationTest (#18889) * [PR-18920](https://github.com/apache/kafka/pull/18920) - KAFKA-18805: add synchronized block for Consumer Heartbeat close (#18920) * [PR-18908](https://github.com/apache/kafka/pull/18908) - KAFKA-18755 Align timeout in kafka-share-groups.sh (#18908) * [PR-18922](https://github.com/apache/kafka/pull/18922) - KAFKA-18809 Set min in sync replicas for `__share_group_state`. 
(#18922) * [PR-18916](https://github.com/apache/kafka/pull/18916) - KAFKA-18803 The acls would appear at the wrong level of the metadata shell “tree” (#18916) * [PR-18906](https://github.com/apache/kafka/pull/18906) - KAFKA-18790 Fix testCustomQuotaCallback (#18906) * [PR-18894](https://github.com/apache/kafka/pull/18894) - KAFKA-18761: Complete listing of share group offsets [1/N] (#18894) * [PR-18819](https://github.com/apache/kafka/pull/18819) - KAFKA-16717 [1/2]: Add AdminClient.alterShareGroupOffsets (#18819) * [PR-18899](https://github.com/apache/kafka/pull/18899) - KAFKA-18772 Define share group config defaults for Docker (#18899) * [PR-18826](https://github.com/apache/kafka/pull/18826) - KAFKA-18733: Updating share group metrics (1/N) (#18826) * [PR-18680](https://github.com/apache/kafka/pull/18680) - KAFKA-18634: Fix ELR metadata version issues (#18680) * [PR-18795](https://github.com/apache/kafka/pull/18795) - KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#18795) * [PR-18834](https://github.com/apache/kafka/pull/18834) - KAFKA-16720: Support multiple groups in DescribeShareGroupOffsets RPC (#18834) * [PR-18810](https://github.com/apache/kafka/pull/18810) - KAFKA-18654[2/2]: Transaction V2 retry add partitions on the server side when handling produce request. (#18810) * [PR-18756](https://github.com/apache/kafka/pull/18756) - KAFKA-17298: Update upgrade notes for 4.0 KIP-848 (#18756) * [PR-18807](https://github.com/apache/kafka/pull/18807) - KAFKA-18728 Move ListOffsetsPartitionStatus to server module (#18807) * [PR-18851](https://github.com/apache/kafka/pull/18851) - KAFKA-18769: Improve leadership changes handling in ShareConsumeRequestManager. (#18851) * [PR-18869](https://github.com/apache/kafka/pull/18869) - KAFKA-18777 add PartitionsWithLateTransactionsCount to BrokerMetricNamesTest (#18869) * [PR-18729](https://github.com/apache/kafka/pull/18729) - KAFKA-18323: Add StreamsGroup class (#18729) * [PR-18275](https://github.com/apache/kafka/pull/18275) - KAFKA-15443: Upgrade RocksDB to 9.7.3 (#18275) * [PR-18451](https://github.com/apache/kafka/pull/18451) - KAFKA-18035: TransactionsTest testBumpTransactionalEpochWithTV2Disabled failed on trunk (#18451) * [PR-17804](https://github.com/apache/kafka/pull/17804) - KAFKA-15995: Adding KIP-877 support to Connect (#17804) * [PR-18829](https://github.com/apache/kafka/pull/18829) - KAFKA-18756: Enabled share group configs for queues related system tests (#18829) * [PR-18858](https://github.com/apache/kafka/pull/18858) - Fix bug in json naming (#18858) * [PR-18833](https://github.com/apache/kafka/pull/18833) - KAFKA-18758: NullPointerException in shutdown following InvalidConfigurationException (#18833) * [PR-18855](https://github.com/apache/kafka/pull/18855) - KAFKA-18764: Throttle on share state RPCs auth failure. 
(#18855) * [PR-18039](https://github.com/apache/kafka/pull/18039) - KAFKA-14484: Move UnifiedLog static methods to storage (#18039) * [PR-18394](https://github.com/apache/kafka/pull/18394) - KAFKA-18396: Migrate log4j1 configuration to log4j2 in KafkaDockerWrapper (#18394) * [PR-18853](https://github.com/apache/kafka/pull/18853) - KAFKA-18770 close the RM created by testDelayedShareFetchPurgatoryOperationExpiration (#18853) * [PR-18820](https://github.com/apache/kafka/pull/18820) - KAFKA-18366 Remove KafkaConfig.interBrokerProtocolVersion (#18820) * [PR-18812](https://github.com/apache/kafka/pull/18812) - KAFKA-18658 add import control for examples module (#18812) * [PR-18821](https://github.com/apache/kafka/pull/18821) - KAFKA-18743 Remove leader.imbalance.per.broker.percentage as it is not supported by Kraft (#18821) * [PR-1578](https://github.com/confluentinc/kafka/pull/1578) - CCS CP release test regex updates * [PR-18196](https://github.com/apache/kafka/pull/18196) - KAFKA-18225 ClientQuotaCallback#updateClusterMetadata is unsupported by kraft (#18196) * [PR-1582](https://github.com/confluentinc/kafka/pull/1582) - Fix build failure * [PR-18846](https://github.com/apache/kafka/pull/18846) - KAFKA-18763: changed the assertion statement for acknowledgements to include only successful acks (#18846) * [PR-18824](https://github.com/apache/kafka/pull/18824) - KAFKA-18745: Handle network related errors in persister. (#18824) * [PR-18252](https://github.com/apache/kafka/pull/18252) - KAFKA-17833: Convert DescribeAuthorizedOperationsTest to use KRaft (#18252) * [PR-1577](https://github.com/confluentinc/kafka/pull/1577) - CCS CP release test regex updates * [PR-18381](https://github.com/apache/kafka/pull/18381) - KAFKA-18275 Restarting broker in testing should use the same port (#18381) * [PR-18818](https://github.com/apache/kafka/pull/18818) - KAFKA-18741 document the removal of inter.broker.protocol.version (#18818) * [PR-18496](https://github.com/apache/kafka/pull/18496) - KAFKA-18483 Disable Log4jController and Loggers if Log4j Core absent (#18496) * [PR-18672](https://github.com/apache/kafka/pull/18672) - KAFKA-18618: Improve leader change handling of acknowledgements [1/N] (#18672) * [PR-18566](https://github.com/apache/kafka/pull/18566) - KAFKA-18360 Remove zookeeper configurations (#18566) * [PR-18641](https://github.com/apache/kafka/pull/18641) - KAFKA-18530 Remove ZooKeeperInternals (#18641) * [PR-18583](https://github.com/apache/kafka/pull/18583) - KAFKA-18499 Clean up zookeeper from LogConfig (#18583) * [PR-18771](https://github.com/apache/kafka/pull/18771) - KAFKA-18689: Improve metric calculation to avoid NoSuchElementException (#18771) * [PR-18189](https://github.com/apache/kafka/pull/18189) - KAFKA-18206: EmbeddedKafkaCluster must set features (#18189) * [PR-18765](https://github.com/apache/kafka/pull/18765) - KAFKA-17379: Fix unexpected state transition from ERROR to PENDING_SHUTDOWN (#18765) * [PR-18696](https://github.com/apache/kafka/pull/18696) - KAFKA-18494-3: solution for the bug relating to gaps in the share partition cachedStates post initialization (#18696) * [PR-18748](https://github.com/apache/kafka/pull/18748) - KAFKA-18629: Add persister impl and tests for DeleteShareGroupState RPC. 
[2/N] (#18748) * [PR-18671](https://github.com/apache/kafka/pull/18671) - [KAFKA-16720] AdminClient Support for ListShareGroupOffsets (2/2) (#18671) * [PR-18702](https://github.com/apache/kafka/pull/18702) - KAFKA-18645: New consumer should align close timeout handling with classic consumer (#18702) * [PR-18791](https://github.com/apache/kafka/pull/18791) - KAFKA-18722: Remove the unreferenced methods in TBRLMM and ConsumerManager (#18791) * [PR-18782](https://github.com/apache/kafka/pull/18782) - KAFKA-18694: Migrate suitable classes to records in coordinator-common module (#18782) * [PR-18784](https://github.com/apache/kafka/pull/18784) - KAFKA-18705: Move ConfigRepository to metadata module (#18784) * [PR-18783](https://github.com/apache/kafka/pull/18783) - KAFKA-18698: Migrate suitable classes to records in server and server-common modules (#18783) * [PR-18781](https://github.com/apache/kafka/pull/18781) - KAFKA-18675 Add tests for valid and invalid broker addresses (#18781) * [PR-18304](https://github.com/apache/kafka/pull/18304) - KAFKA-16524; Metrics for KIP-853 (#18304) * [PR-18277](https://github.com/apache/kafka/pull/18277) - KAFKA-18635: reenable the unclean shutdown detection (#18277) * [PR-18708](https://github.com/apache/kafka/pull/18708) - KAFKA-18649: complete ClearElrRecord handling (#18708) * [PR-18148](https://github.com/apache/kafka/pull/18148) - KAFKA-16540: Clear ELRs when min.insync.replicas is changed. (#18148) * [PR-17952](https://github.com/apache/kafka/pull/17952) - KAFKA-16540: enforce min.insync.replicas config invariants for ELR (#17952) * [PR-15622](https://github.com/apache/kafka/pull/15622) - KAFKA-16446: Improve controller event duration logging (#15622) * [PR-18028](https://github.com/apache/kafka/pull/18028) - KAFKA-18131: Improve logs for voters (#18028) * [PR-18222](https://github.com/apache/kafka/pull/18222) - KAFKA-18305: validate controller.listener.names is not in inter.broker.listener.name for kcontrollers (#18222) * [PR-18777](https://github.com/apache/kafka/pull/18777) - KAFKA-18690: Keep leader metadata for RE2J-assigned partitions (#18777) * [PR-18551](https://github.com/apache/kafka/pull/18551) - KAFKA-18538: Add Streams membership manager (#18551) * [PR-18165](https://github.com/apache/kafka/pull/18165) - KAFKA-18230: Handle not controller or not leader error in admin client (#18165) * [PR-18700](https://github.com/apache/kafka/pull/18700) - KAFKA-18644: improve generic type names for internal FK-join classes (#18700) * [PR-18790](https://github.com/apache/kafka/pull/18790) - KAFKA-18693 Remove PasswordEncoder (#18790) * [PR-18720](https://github.com/apache/kafka/pull/18720) - KAFKA-18654 [1/2]: Transaction Version 2 performance regression due to early return (#18720) * [PR-18592](https://github.com/apache/kafka/pull/18592) - KAFKA-18545: Remove Zookeeper logic from LogManager (#18592) * [PR-18676](https://github.com/apache/kafka/pull/18676) - KAFKA-18325: Add TargetAssignmentBuilder (#18676) * [PR-18786](https://github.com/apache/kafka/pull/18786) - KAFKA-18672; CoordinatorRecordSerde must validate value version (4.0) (#18786) * [PR-18717](https://github.com/apache/kafka/pull/18717) - KAFKA-18655: Implement the consumer group size counter with scheduled task (#18717) * [PR-18764](https://github.com/apache/kafka/pull/18764) - KAFKA-18685: Cleanup DynamicLogConfig constructor (#18764) * [PR-18785](https://github.com/apache/kafka/pull/18785) - KAFKA-18676; Update Benchmark system tests (#18785) * 
[PR-18330](https://github.com/apache/kafka/pull/18330) - KAFKA-17631 Convert SaslApiVersionsRequestTest to kraft (#18330) * [PR-18749](https://github.com/apache/kafka/pull/18749) - KAFKA-18672; CoordinatorRecordSerde must validate value version (#18749) * [PR-18768](https://github.com/apache/kafka/pull/18768) - KAFKA-18678 Update TestVerifiableProducer system test (#18768) * [PR-18652](https://github.com/apache/kafka/pull/18652) - KAFKA-17125: Streams Sticky Task Assignor (#18652) * [PR-18751](https://github.com/apache/kafka/pull/18751) - KAFKA-18674 Document the incompatible changes in parsing --bootstrap-server (#18751) * [PR-18727](https://github.com/apache/kafka/pull/18727) - KAFKA-18659: librdkafka compressed produce fails unless api versions returns produce v0 (#18727) * [PR-18759](https://github.com/apache/kafka/pull/18759) - KAFKA-18683: Handle slicing of file records for updated start position (#18759) * [fc3dca4e](https://github.com/apache/kafka/commit/fc3dca4ed08a6acdcb5b1d5a4ed5b8a7095d318b) - Revert “KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#17700)” * [7920fadb](https://github.com/apache/kafka/commit/7920fadbb586a9430ce1a45936d6bbd1555baa2d) - Revert “KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#17700)” * [PR-18758](https://github.com/apache/kafka/pull/18758) - KAFKA-18660: Transactions Version 2 doesn’t handle epoch overflow correctly (#18730) (#18758) * [PR-18750](https://github.com/apache/kafka/pull/18750) - KAFKA-18320; Ensure that assignors are at the right place (#18750) * [PR-1541](https://github.com/confluentinc/kafka/pull/1541) - Merge trunk * [PR-17700](https://github.com/apache/kafka/pull/17700) - KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#17700) * [PR-18766](https://github.com/apache/kafka/pull/18766) - KAFKA-18146; tests/kafkatest/tests/core/upgrade_test.py needs to be re-added as KRaft (#18766) * [PR-18763](https://github.com/apache/kafka/pull/18763) - KAFKA-18677; Update ConsoleConsumerTest system test (#18763) * [PR-17511](https://github.com/apache/kafka/pull/17511) - KAFKA-15995: Initial API + make Producer/Consumer plugins Monitorable (#17511) * [PR-18722](https://github.com/apache/kafka/pull/18722) - KAFKA-18644: improve generic type names for KStreamImpl and KTableImpl (#18722) * [PR-18754](https://github.com/apache/kafka/pull/18754) - KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18754) * [PR-18730](https://github.com/apache/kafka/pull/18730) - KAFKA-18660: Transactions Version 2 doesn’t handle epoch overflow correctly (#18730) * [PR-1556](https://github.com/confluentinc/kafka/pull/1556) - MINOR: Disable publish artifacts for 4.0 * [PR-18548](https://github.com/apache/kafka/pull/18548) - KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18548) * [PR-18731](https://github.com/apache/kafka/pull/18731) - KAFKA-18570: Update documentation to add remainingLogsToRecover, remainingSegmentsToRecover and LogDirectoryOffline metrics (#18731) * [PR-18669](https://github.com/apache/kafka/pull/18669) - KAFKA-18621: Add StreamsCoordinatorRecordHelpers (#18669) * [PR-18681](https://github.com/apache/kafka/pull/18681) - KAFKA-18636 Fix how we handle Gradle exits in CI (#18681) * [PR-18590](https://github.com/apache/kafka/pull/18590) - KAFKA-18569: New consumer close may wait on unneeded FindCoordinator (#18590) * 
[PR-18698](https://github.com/apache/kafka/pull/18698) - KAFKA-13722: remove internal usage of old ProcessorContext (#18698) * [PR-18314](https://github.com/apache/kafka/pull/18314) - KAFKA-16339: Add Kafka Streams migrating guide from transform to process (#18314) * [PR-18732](https://github.com/apache/kafka/pull/18732) - KAFKA-18498: Update lock ownership from main thread (#18732) * [PR-18478](https://github.com/apache/kafka/pull/18478) - KAFKA-18383 Remove reserved.broker.max.id and broker.id.generation.enable (#18478) * [PR-18733](https://github.com/apache/kafka/pull/18733) - KAFKA-18662: Return CONCURRENT_TRANSACTIONS on produce request in TV2 (#18733) * [PR-18718](https://github.com/apache/kafka/pull/18718) - KAFKA-18632: Multibroker test improvements. (#18718) * [PR-18725](https://github.com/apache/kafka/pull/18725) - KAFKA-18653: Fix mocks and potential thread leak issues causing silent RejectedExecutionException in share group broker tests (#18725) * [PR-18726](https://github.com/apache/kafka/pull/18726) - KAFKA-18646: Null records in fetch response breaks librdkafka (#18726) * [PR-18668](https://github.com/apache/kafka/pull/18668) - KAFKA-18619: New consumer topic metadata events should set requireMetadata flag (#18668) * [PR-18728](https://github.com/apache/kafka/pull/18728) - KAFKA-18488: Improve KafkaShareConsumerTest (#18728) * [PR-18716](https://github.com/apache/kafka/pull/18716) - KAFKA-18648: Add back support for metadata version 0-3 (#18716) * [PR-18555](https://github.com/apache/kafka/pull/18555) - KAFKA-18528: MultipleListenersWithSameSecurityProtocolBaseTest and GssapiAuthenticationTest should run for async consumer (#18555) * [PR-18651](https://github.com/apache/kafka/pull/18651) - KAFKA-17951: Share partition rotate strategy (#18651) * [PR-18712](https://github.com/apache/kafka/pull/18712) - KAFKA-18629: Delete share group state impl [1/N] (#18712) * [PR-18570](https://github.com/apache/kafka/pull/18570) - KAFKA-17162: join() started thread in DefaultTaskManagerTest (#18570) * [PR-18602](https://github.com/apache/kafka/pull/18602) - KAFKA-17587 Refactor test infrastructure (#18602) * [PR-18693](https://github.com/apache/kafka/pull/18693) - KAFKA-18631 Remove ZkConfigs (#18693) * [PR-18699](https://github.com/apache/kafka/pull/18699) - KAFKA-18642: Increased the timeouts in share_consumer_test.py system tests (#18699) * [PR-18632](https://github.com/apache/kafka/pull/18632) - KAFKA-18555 Avoid casting MetadataCache to KRaftMetadataCache (#18632) * [PR-18547](https://github.com/apache/kafka/pull/18547) - KAFKA-18533 Remove KafkaConfig zookeeper related logic (#18547) * [PR-18554](https://github.com/apache/kafka/pull/18554) - KAFKA-18529: ConsumerRebootstrapTest should run for async consumer (#18554) * [PR-18292](https://github.com/apache/kafka/pull/18292) - KAFKA-13722: remove usage of old ProcessorContext (#18292) * [PR-18444](https://github.com/apache/kafka/pull/18444) - KAFKA-17894: Implemented broker topic metrics for Share Group 1/N (KIP-1103) (#18444) * [PR-18687](https://github.com/apache/kafka/pull/18687) - KAFKA-18630: Clean ReplicaManagerBuilder (#18687) * [PR-18477](https://github.com/apache/kafka/pull/18477) - KAFKA-18474: Remove zkBroker listener (#18477) * [PR-18688](https://github.com/apache/kafka/pull/18688) - KAFKA-18616; Refactor DumpLogSegments’s MessageParsers (#18688) * [PR-15574](https://github.com/apache/kafka/pull/15574) - KAFKA-16372 Fix producer doc discrepancy with the exception behavior (#15574) *
[PR-18618](https://github.com/apache/kafka/pull/18618) - KAFKA-18590 Cleanup DelegationTokenManager (#18618) * [PR-18593](https://github.com/apache/kafka/pull/18593) - KAFKA-18559 Cleanup FinalizedFeatures (#18593) * [PR-18627](https://github.com/apache/kafka/pull/18627) - KAFKA-18597 Fix max-buffer-utilization-percent is always 0 (#18627) * [PR-18686](https://github.com/apache/kafka/pull/18686) - KAFKA-18620: Remove UnifiedLog#legacyFetchOffsetsBefore (#18686) * [PR-18621](https://github.com/apache/kafka/pull/18621) - KAFKA-18592 Cleanup ReplicaManager (#18621) * [PR-18476](https://github.com/apache/kafka/pull/18476) - KAFKA-18324: Add CurrentAssignmentBuilder (#18476) * [PR-12042](https://github.com/apache/kafka/pull/12042) - KAFKA-13810: Document behavior of KafkaProducer.flush() w.r.t callbacks (#12042) * [PR-18667](https://github.com/apache/kafka/pull/18667) - KAFKA-18484 [2/2]; Handle exceptions during coordinator unload (#18667) * [PR-18601](https://github.com/apache/kafka/pull/18601) - KAFKA-18488: Additional protocol tests for share consumption (#18601) * [PR-18666](https://github.com/apache/kafka/pull/18666) - KAFKA-18486; [1/2] Update LocalLeaderEndPointTest (#18666) * [d2024436](https://github.com/apache/kafka/commit/d2024436218343a127385e0149a692caf432b772) - KAFKA-18575: Transaction Version 2 doesn’t correctly handle race condition with completing and new transaction (#18604) * [PR-18532](https://github.com/apache/kafka/pull/18532) - KAFKA-18517: Enable ConsumerBounceTest to run for new async consumer (#18532) * [PR-18614](https://github.com/apache/kafka/pull/18614) - KAFKA-18519: Remove Json.scala, cleanup AclEntry.scala (#18614) * [PR-18630](https://github.com/apache/kafka/pull/18630) - KAFKA-18599: Remove Optional wrapping for forwardingManager in ApiVersionManager (#18630) * [PR-18389](https://github.com/apache/kafka/pull/18389) - KAFKA-18229: Move configs out of “kraft” directory (#18389) * [PR-18661](https://github.com/apache/kafka/pull/18661) - KAFKA-18484 [1/N]; Handle exceptions from deferred events in coordinator (#18661) * [PR-18649](https://github.com/apache/kafka/pull/18649) - KAFKA-18392: Ensure client sets member ID for share group (#18649) * [PR-18527](https://github.com/apache/kafka/pull/18527) - KAFKA-18518: Add processor to handle rebalance events (#18527) * [PR-18607](https://github.com/apache/kafka/pull/18607) - KAFKA-17402: DefaultStateUpdater should transition task atomically (#18607) * [PR-18539](https://github.com/apache/kafka/pull/18539) - KAFKA-18454 Publish build scans to develocity.apache.org (#18539) * [PR-18512](https://github.com/apache/kafka/pull/18512) - KAFKA-18302; Update CoordinatorRecord (#18512) * [PR-18316](https://github.com/apache/kafka/pull/18316) - KAFKA-15370: Support Participation in 2PC (KIP-939) (2/N) (#18316) * [PR-18587](https://github.com/apache/kafka/pull/18587) - KAFKA-8862: Improve Producer error message for failed metadata update (#18587) * [PR-18581](https://github.com/apache/kafka/pull/18581) - KAFKA-17561: add processId tag to thread-state metric (#18581) * [PR-18629](https://github.com/apache/kafka/pull/18629) - KAFKA-18598: Remove ControllerMetadataMetrics ZK-related Metrics (#18629) * [PR-18611](https://github.com/apache/kafka/pull/18611) - KAFKA-18585 Fix fail test ValuesTest#shouldConvertDateValues (#18611) * [PR-18647](https://github.com/apache/kafka/pull/18647) - KAFKA-18487; Remove ReplicaManager#stopReplicas (#18647) * [PR-18635](https://github.com/apache/kafka/pull/18635) - KAFKA-18583; Fix
getPartitionReplicaEndpoints for KRaft (#18635) * [PR-18442](https://github.com/apache/kafka/pull/18442) - KAFKA-18311: Internal Topic Manager (5/5) (#18442) * [PR-18636](https://github.com/apache/kafka/pull/18636) - KAFKA-18604; Update transaction coordinator (#18636) * [PR-18497](https://github.com/apache/kafka/pull/18497) - KAFKA-14552: Assume a baseline of 3.0 for server protocol versions (#18497) * [PR-18346](https://github.com/apache/kafka/pull/18346) - KAFKA-18363: Remove ZooKeeper mentions in broker configs (#18346) * [PR-18631](https://github.com/apache/kafka/pull/18631) - KAFKA-18595: Remove AuthorizerUtils#sessionToRequestContext (#18631) * [PR-18626](https://github.com/apache/kafka/pull/18626) - KAFKA-18594: Cleanup BrokerLifecycleManager (#18626) * [PR-18174](https://github.com/apache/kafka/pull/18174) - KAFKA-18232: Add share group state topic prune metrics. (#18174) * [PR-18567](https://github.com/apache/kafka/pull/18567) - KAFKA-18553: Update javadoc and comments of ConfigType (#18567) * [PR-18571](https://github.com/apache/kafka/pull/18571) - [KAFKA-16720] AdminClient Support for ListShareGroupOffsets (1/n) (#18571) * [PR-18624](https://github.com/apache/kafka/pull/18624) - KAFKA-18588 Remove TopicKey.scala (#18624) * [PR-18628](https://github.com/apache/kafka/pull/18628) - KAFKA-18578: Remove UpdateMetadataRequest from MetadataCacheTest (#18628) * [PR-18625](https://github.com/apache/kafka/pull/18625) - KAFKA-18593 Remove ZkCachedControllerId In MetadataCache (#18625) * [PR-17390](https://github.com/apache/kafka/pull/17390) - KAFKA-17668: Clean-up LogCleaner#maxOverCleanerThreads and LogCleanerManager#maintainUncleanablePartitions (#17390) * [PR-18616](https://github.com/apache/kafka/pull/18616) - KAFKA-18429 Remove ZkFinalizedFeatureCache and StateChangeFailedException (#18616) * [PR-18619](https://github.com/apache/kafka/pull/18619) - KAFKA-18589 Remove unused interBrokerProtocolVersion from GroupMetadataManager (#18619) * [PR-18598](https://github.com/apache/kafka/pull/18598) - KAFKA-18516 Remove RackAwareMode (#18598) * [PR-18608](https://github.com/apache/kafka/pull/18608) - KAFKA-18492 Cleanup RequestHandlerHelper (#18608) * [PR-18613](https://github.com/apache/kafka/pull/18613) - KAFKA-18427: Remove ZooKeeperClient (#18613) * [PR-18591](https://github.com/apache/kafka/pull/18591) - KAFKA-18540: Remove UpdateMetadataRequest from KafkaApisTest (#18591) * [PR-18594](https://github.com/apache/kafka/pull/18594) - KAFKA-18532: Clean Partition.scala zookeeper logic (#18594) * [PR-18605](https://github.com/apache/kafka/pull/18605) - KAFKA-18423: Remove ZkData and related unused references (#18605) * [PR-18586](https://github.com/apache/kafka/pull/18586) - KAFKA-18565 Cleanup SaslSetup (#18586) * [PR-18606](https://github.com/apache/kafka/pull/18606) - KAFKA-18430 Remove ZkNodeChangeNotificationListener (#18606) * [PR-18492](https://github.com/apache/kafka/pull/18492) - KAFKA-18480 Fix fail e2e test_offset_truncate (#18492) * [PR-18012](https://github.com/apache/kafka/pull/18012) - KAFKA-806: Index may not always observe log.index.interval.bytes (#18012) * [PR-18595](https://github.com/apache/kafka/pull/18595) - KAFKA-18515 Remove DelegationTokenManagerZk (#18595) * [PR-18579](https://github.com/apache/kafka/pull/18579) - Remove casts to KRaftMetadataCache (#18579) * [PR-18577](https://github.com/apache/kafka/pull/18577) - Convert BrokerEndPoint to record (#18577) * [PR-18240](https://github.com/apache/kafka/pull/18240) - KAFKA-17642: PreVote response handling and
ProspectiveState (#18240) * [PR-18585](https://github.com/apache/kafka/pull/18585) - KAFKA-18413: Remove AdminZkClient (#18585) * [PR-18406](https://github.com/apache/kafka/pull/18406) - KAFKA-18318: Add logs for online/offline migration indication (#18406) * [PR-18224](https://github.com/apache/kafka/pull/18224) - KAFKA-18150; Downgrade group on classic leave of last consumer member (#18224) * [PR-18209](https://github.com/apache/kafka/pull/18209) - Infrastructure for system tests for the new share consumer client (#18209) * [PR-18553](https://github.com/apache/kafka/pull/18553) - KAFKA-18373: Remove ZkMetadataCache (#18553) * [PR-18582](https://github.com/apache/kafka/pull/18582) - KAFKA-18557 streamline codebase with testConfig() (#18582) * [PR-18573](https://github.com/apache/kafka/pull/18573) - KAFKA-18431: Remove KafkaController (#18573) * [PR-18574](https://github.com/apache/kafka/pull/18574) - KAFKA-18407: Remove ZkAdminManager, DelayedCreatePartitions, CreatePartitionsMetadata, ZkConfigRepository, DelayedDeleteTopics (#18574) * [PR-18568](https://github.com/apache/kafka/pull/18568) - KAFKA-18556: Remove JaasModule#zkDigestModule, JaasTestUtils#zkSections (#18568) * [PR-18534](https://github.com/apache/kafka/pull/18534) - KAFKA-14485: Move LogCleaner exceptions to storage module (#18534) * [PR-18565](https://github.com/apache/kafka/pull/18565) - KAFKA-18546: Use mocks instead of a real DNS lookup to the outside (#18565) * [PR-18140](https://github.com/apache/kafka/pull/18140) - KAFKA-16368: Add a new constraint for segment.bytes to min 1MB for KIP-1030 (#18140) * [PR-18106](https://github.com/apache/kafka/pull/18106) - KAFKA-16368: Update defaults for LOG_MESSAGE_TIMESTAMP_AFTER_MAX_MS_DEFAULT and NUM_RECOVERY_THREADS_PER_DATA_DIR_CONFIG (#18106) * [PR-18374](https://github.com/apache/kafka/pull/18374) - KAFKA-7776: Tests for ISO8601 in Connect value parsing (#18374) * [PR-18562](https://github.com/apache/kafka/pull/18562) - KAFKA-18558: Added check before adding previously subscribed partitions (#18562) * [PR-18535](https://github.com/apache/kafka/pull/18535) - KAFKA-18521 Cleanup NodeApiVersions zkMigrationEnabled field (#18535) * [PR-18552](https://github.com/apache/kafka/pull/18552) - KAFKA-18542 Cleanup AlterPartitionManager (#18552) * [PR-18561](https://github.com/apache/kafka/pull/18561) - KAFKA-18406 Remove ZkBrokerEpochManager.scala (#18561) * [PR-18508](https://github.com/apache/kafka/pull/18508) - KAFKA-18405 Remove ZooKeeper logic from DynamicBrokerConfig (#18508) * [PR-18080](https://github.com/apache/kafka/pull/18080) - KAFKA-16368: Update default linger.ms to 5ms for KIP-1030 (#18080) * [PR-18524](https://github.com/apache/kafka/pull/18524) - KAFKA-18514: Refactor share module code to server and server-common (#18524) * [PR-18414](https://github.com/apache/kafka/pull/18414) - KAFKA-18331: Make process.roles and node.id required configs (#18414) * [PR-18559](https://github.com/apache/kafka/pull/18559) - KAFKA-18552: Remove unnecessary version check in `testHandleOffsetFetch*` (#18559) * [PR-18483](https://github.com/apache/kafka/pull/18483) - KAFKA-18472: Remove MetadataSupport (#18483) * [PR-18342](https://github.com/apache/kafka/pull/18342) - KAFKA-18026: KIP-1112, clean up graph node grace period resolution (#18342) * [PR-18491](https://github.com/apache/kafka/pull/18491) - KAFKA-18479: Remove keepPartitionMetadataFile in UnifiedLog and LogMan… (#18491) * [PR-18365](https://github.com/apache/kafka/pull/18365) - KAFKA-18364 add document to show the changes of 
metrics and configs after removing zookeeper (#18365) * [PR-18550](https://github.com/apache/kafka/pull/18550) - KAFKA-18539 Remove optional managers in KafkaApis (#18550) * [PR-18563](https://github.com/apache/kafka/pull/18563) - Use version.py get_version to get version (#18563) * [PR-18459](https://github.com/apache/kafka/pull/18459) - KAFKA-18452: Implemented batch size in acquired records (#18459) * [PR-18448](https://github.com/apache/kafka/pull/18448) - KAFKA-18401: Transaction version 2 does not support commit transaction without records (#18448) * [PR-18490](https://github.com/apache/kafka/pull/18490) - KAFKA-18479: RocksDBTimeOrderedKeyValueBuffer not initialized correctly (#18490) * [PR-18536](https://github.com/apache/kafka/pull/18536) - KAFKA-18514 Remove server dependency on share coordinator (#18536) * [PR-18521](https://github.com/apache/kafka/pull/18521) - KAFKA-18513: Validate share state topic records produced in tests. (#18521) * [PR-18542](https://github.com/apache/kafka/pull/18542) - KAFKA-18399 Remove ZooKeeper from KafkaApis (12/N): clean up ZKMetadataCache, KafkaController and raftSupport (#18542) * [PR-18386](https://github.com/apache/kafka/pull/18386) - KAFKA-18346 Fix e2e TestKRaftUpgrade for v3.3.2 (#18386) * [PR-18530](https://github.com/apache/kafka/pull/18530) - KAFKA-18520: Remove ZooKeeper logic from JaasUtils (#18530) * [PR-18540](https://github.com/apache/kafka/pull/18540) - KAFKA-18399 Remove ZooKeeper from KafkaApis (11/N): CREATE_ACLS and DELETE_ACLS (#18540) * [PR-18432](https://github.com/apache/kafka/pull/18432) - KAFKA-18399 Remove ZooKeeper from KafkaApis (10/N): ALTER_CONFIG and INCREMENTAL_ALTER_CONFIG (#18432) * [PR-18544](https://github.com/apache/kafka/pull/18544) - Revert “KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18050)” (#18544) * [PR-18487](https://github.com/apache/kafka/pull/18487) - KAFKA-18476: KafkaStreams should swallow TransactionAbortedException (#18487) * [PR-18195](https://github.com/apache/kafka/pull/18195) - KAFKA-18026: KIP-1112, clean up StatefulProcessorNode (#18195) * [PR-18518](https://github.com/apache/kafka/pull/18518) - KAFKA-18502 Remove kafka.controller.Election (#18518) * [PR-18281](https://github.com/apache/kafka/pull/18281) - KAFKA-18330: Update documentation to remove controller deployment limitations (#18281) * [PR-18465](https://github.com/apache/kafka/pull/18465) - KAFKA-18399 Remove ZooKeeper from KafkaApis (9/N): ALTER_CLIENT_QUOTAS and ALLOCATE_PRODUCER_IDS (#18465) * [PR-18511](https://github.com/apache/kafka/pull/18511) - KAFKA-18493: Fix configure :streams:integration-tests project error (#18511) * [PR-18453](https://github.com/apache/kafka/pull/18453) - KAFKA-18399 Remove ZooKeeper from KafkaApis (8/N): ELECT_LEADERS, ALTER_PARTITION, UPDATE_FEATURES (#18453) * [PR-18525](https://github.com/apache/kafka/pull/18525) - Rename the variable to reflect its purpose (#18525) * [PR-18403](https://github.com/apache/kafka/pull/18403) - KAFKA-18211: Override class loaders for class graph scanning in connect.
(#18403) * [PR-18500](https://github.com/apache/kafka/pull/18500) - Add DescribeShareGroupOffsets API [KIP-932] (#18500) * [PR-17669](https://github.com/apache/kafka/pull/17669) - KAFKA-17915: Convert Kafka Client system tests to use KRaft (#17669) * [PR-17901](https://github.com/apache/kafka/pull/17901) - KAFKA-18064: SASL mechanisms should throw exception on wrap/unwrap (#17901) * [PR-18507](https://github.com/apache/kafka/pull/18507) - KAFKA-18491 Remove zkClient & maybeUpdateMetadataCache from ReplicaManager (#18507) * [PR-18337](https://github.com/apache/kafka/pull/18337) - KAFKA-18274 Failed to restart controller in testing due to closed socket channel [2/2] (#18337) * [PR-18475](https://github.com/apache/kafka/pull/18475) - KAFKA-18469;KAFKA-18036: AsyncConsumer should request metadata update if ListOffsetRequest encounters a retriable error (#18475) * [PR-17728](https://github.com/apache/kafka/pull/17728) - KAFKA-17973: Relax Restriction for Voters Set Change (#17728) * [PR-18050](https://github.com/apache/kafka/pull/18050) - KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18050) * [PR-17870](https://github.com/apache/kafka/pull/17870) - KAFKA-18404: Remove partitionMaxBytes usage from DelayedShareFetch (#17870) * [PR-18504](https://github.com/apache/kafka/pull/18504) - KAFKA-18485; Update log4j2.yaml (#18504) * [PR-18433](https://github.com/apache/kafka/pull/18433) - KAFKA-18399 Remove ZooKeeper from KafkaApis (7/N): CREATE_TOPICS, DELETE_TOPICS, CREATE_PARTITIONS (#18433) * [PR-18320](https://github.com/apache/kafka/pull/18320) - KAFKA-18341: Remove KafkaConfig GroupType config check and warn log (#18320) * [PR-18480](https://github.com/apache/kafka/pull/18480) - KAFKA-18457; Update DumpLogSegments to use coordinator record json converters (#18480) * [PR-18447](https://github.com/apache/kafka/pull/18447) - KAFKA-18399 Remove ZooKeeper from KafkaApis (6/N): handleCreateTokenRequest, handleRenewTokenRequestZk, handleExpireTokenRequestZk (#18447) * [PR-18464](https://github.com/apache/kafka/pull/18464) - KAFKA-18399 Remove ZooKeeper from KafkaApis (5/N): ALTER_PARTITION_REASSIGNMENTS, LIST_PARTITION_REASSIGNMENTS (#18464) * [PR-18461](https://github.com/apache/kafka/pull/18461) - KAFKA-18399 Remove ZooKeeper from KafkaApis (4/N): OFFSET_COMMIT and OFFSET_FETCH (#18461) * [PR-18456](https://github.com/apache/kafka/pull/18456) - KAFKA-18399 Remove ZooKeeper from KafkaApis (3/N): USER_SCRAM_CREDENTIALS (#18456) * [PR-18472](https://github.com/apache/kafka/pull/18472) - KAFKA-18466 Remove log4j-1.2-api from runtime scope while keeping it in distribution package (#18472) * [PR-18404](https://github.com/apache/kafka/pull/18404) - KAFKA-18400: Don’t use YYYY when formatting/parsing dates in Java client (#18404) * [PR-18437](https://github.com/apache/kafka/pull/18437) - KAFKA-18446 Remove MetadataCacheControllerNodeProvider (#18437) * [PR-18468](https://github.com/apache/kafka/pull/18468) - KAFKA-18465: Remove MetadataVersions older than 3.0-IV1 (#18468) * [PR-18467](https://github.com/apache/kafka/pull/18467) - KAFKA-18464: Empty Abort Transaction can fence producer incorrectly with Transactions V2 (#18467) * [PR-18471](https://github.com/apache/kafka/pull/18471) - KAFKA-8116: Update Kafka Streams archetype for Java 11 (#18471) * [PR-17510](https://github.com/apache/kafka/pull/17510) - KAFKA-17792: Efficiently parse decimals with large exponents in Connect Values (#17510) * [PR-18679](https://github.com/apache/kafka/pull/18679) - KAFKA-18632: 
Added few share consumer multibroker tests. (#18679) * [82ccf75a](https://github.com/apache/kafka/commit/82ccf75ae091bffb94cbb3fd173240c48627db17) - KAFKA-18575: Transaction Version 2 doesn’t correctly handle race condition with completing and new transaction(#18604) * [94a1bfb1](https://github.com/apache/kafka/commit/94a1bfb1281f06263976b1ba8bba8c5ac5d7f2ce) - KAFKA-18575: Transaction Version 2 doesn’t correctly handle race condition with completing and new transaction(#18604) * [PR-18340](https://github.com/apache/kafka/pull/18340) - KAFKA-18339: Fix parseRequestHeader error handling (#18340) * [PR-18643](https://github.com/apache/kafka/pull/18643) - Revert “KAFKA-18404: Remove partitionMaxBytes usage from DelayedShareFetch (#17870)” (#18643) * [21c4539d](https://github.com/apache/kafka/commit/21c4539dfe1134e60a7d8680d9ea19ae48f569a3) - Revert “KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18050)” * [PR-18150](https://github.com/apache/kafka/pull/18150) - KAFKA-18026: KIP-1112 migrate KTableSuppressProcessorSupplier (#18150) * [0186534a](https://github.com/apache/kafka/commit/0186534a992a123a7f53dd32860c6ba5787dbb18) - Revert “KAFKA-17411: Create local state Standbys on start (#16922)” and “KAFKA-17978: Fix invalid topology on Task assignment (#17778)” * [PR-18378](https://github.com/apache/kafka/pull/18378) - KAFKA-18340: Change Dockerfile to use log4j2 yaml instead log4j properties (#18378) * [PR-18397](https://github.com/apache/kafka/pull/18397) - KAFKA-18311: Enforcing copartitioned topics (4/N) (#18397) * [PR-17454](https://github.com/apache/kafka/pull/17454) - KAFKA-17671: Create better documentation for transactions (#17454) * [PR-18455](https://github.com/apache/kafka/pull/18455) - KAFKA-18308; Update CoordinatorSerde (#18455) * [PR-18435](https://github.com/apache/kafka/pull/18435) - KAFKA-18440: Convert AuthorizationException to fatal error in AdminClient (#18435) * [PR-18458](https://github.com/apache/kafka/pull/18458) - KAFKA-18304; Introduce json converter generator (#18458) * [PR-18422](https://github.com/apache/kafka/pull/18422) - KAFKA-18399 Remove ZooKeeper from KafkaApis (2/N): CONTROLLED_SHUTDOWN and ENVELOPE (#18422) * [PR-18146](https://github.com/apache/kafka/pull/18146) - KAFKA-18073: Prevent dropped records from failed retriable exceptions (#18146) * [PR-18321](https://github.com/apache/kafka/pull/18321) - KAFKA-13093: Log compaction should write new segments with record version v2 (KIP-724) (#18321) * [PR-18100](https://github.com/apache/kafka/pull/18100) - KAFKA-18180: Move OffsetResultHolder to storage module (#18100) * [PR-17527](https://github.com/apache/kafka/pull/17527) - KAFKA-17455: fix stuck producer when throttling or retrying (#17527) * [PR-18367](https://github.com/apache/kafka/pull/18367) - KAFKA-17915: Convert remaining Kafka Client system tests to use KRaft (#18367) * [PR-18247](https://github.com/apache/kafka/pull/18247) - KAFKA-18277 Convert network_degrade_test to Kraft mode (#18247) * [PR-18175](https://github.com/apache/kafka/pull/18175) - KAFKA-17986 Fix ConsumerRebootstrapTest and ProducerRebootstrapTest (#18175) * [PR-18445](https://github.com/apache/kafka/pull/18445) - KAFKA-18445 Remove LazyDownConversionRecords and LazyDownConversionRecordsSend (#18445) * [PR-18417](https://github.com/apache/kafka/pull/18417) - KAFKA-18399 Remove ZooKeeper from KafkaApis (1/N): LEADER_AND_ISR, STOP_REPLICA, UPDATE_METADATA (#18417) * [PR-18382](https://github.com/apache/kafka/pull/18382) - KAFKA-17730 
ReplicaFetcherThreadBenchmark is broken (#18382) * [PR-18423](https://github.com/apache/kafka/pull/18423) - KAFKA-18437: Correct version of ShareUpdateRecord value (#18423) * [PR-18457](https://github.com/apache/kafka/pull/18457) - KAFKA-18397: Added null check before sending background event from ShareConsumeRequestManager. (#18419) (#18457) * [PR-18462](https://github.com/apache/kafka/pull/18462) - KAFKA-18449: Add share group state configs to reconfig-server.properties (#18440) (#18462) * [PR-18395](https://github.com/apache/kafka/pull/18395) - KAFKA-18311: Configuring repartition topics (3/N) (#18395) * [PR-18446](https://github.com/apache/kafka/pull/18446) - KAFKA-18453: Add StreamsTopology class to group coordinator (#18446) * [PR-18450](https://github.com/apache/kafka/pull/18450) - KAFKA-18435 Remove zookeeper dependencies in build.gradle (#18450) * [PR-18452](https://github.com/apache/kafka/pull/18452) - KAFKA-18111: Add Kafka Logo to README (#18452) * [PR-18438](https://github.com/apache/kafka/pull/18438) - KAFKA-18432 Remove unused code from AutoTopicCreationManager (#18438) * [PR-18436](https://github.com/apache/kafka/pull/18436) - KAFKA-18434: enrich the authorization error message of connecting to controller (#18436) * [PR-18441](https://github.com/apache/kafka/pull/18441) - KAFKA-18426 Remove FinalizedFeatureChangeListener (#18441) * [PR-18276](https://github.com/apache/kafka/pull/18276) - KAFKA-18321: Add StreamsGroupMember, MemberState and Assignment classes (#18276) * [PR-18443](https://github.com/apache/kafka/pull/18443) - KAFKA-18425 Remove OffsetTrackingListener (#18443) * [PR-18439](https://github.com/apache/kafka/pull/18439) - KAFKA-18433: Add BatchSize to ShareFetch request (1/N) (#18439) * [PR-18428](https://github.com/apache/kafka/pull/18428) - Backport some GHA changes from trunk (#18428) * [PR-18415](https://github.com/apache/kafka/pull/18415) - KAFKA-18428: Measure share consumers performance (#18415) * [PR-18419](https://github.com/apache/kafka/pull/18419) - KAFKA-18397: Added null check before sending background event from ShareConsumeRequestManager. 
(#18419) * [PR-18440](https://github.com/apache/kafka/pull/18440) - KAFKA-18449: Add share group state configs to reconfig-server.properties (#18440) * [PR-18296](https://github.com/apache/kafka/pull/18296) - KAFKA-18173 Remove duplicate assertFutureError (#18296) * [PR-18094](https://github.com/apache/kafka/pull/18094) - KAFKA-15599: Move SegmentPosition/TimingWheelExpirationService to raft module (#18094) * [PR-18329](https://github.com/apache/kafka/pull/18329) - KAFKA-18353 Remove zk config control.plane.listener.name (#18329) * [PR-18429](https://github.com/apache/kafka/pull/18429) - KAFKA-18443 Remove ZkFourLetterWords (#18429) * [PR-18431](https://github.com/apache/kafka/pull/18431) - KAFKA-18417 Remove controlled.shutdown.max.retries and controlled.shutdown.retry.backoff.ms (#18431) * [PR-18287](https://github.com/apache/kafka/pull/18287) - KAFKA-18326: fix merge iterator with cache tombstones (#18287) * [PR-18413](https://github.com/apache/kafka/pull/18413) - KAFKA-18411 Remove ZkProducerIdManager (#18413) * [PR-18421](https://github.com/apache/kafka/pull/18421) - KAFKA-18408 tweak the ‘tag’ field for BrokerHeartbeatRequest.json, BrokerRegistrationChangeRecord.json and RegisterBrokerRecord.json (#18421) * [PR-18401](https://github.com/apache/kafka/pull/18401) - KAFKA-18414 Remove KRaftRegistrationResult (#18401) * [PR-17671](https://github.com/apache/kafka/pull/17671) - KAFKA-17921 Support SASL_PLAINTEXT protocol with java.security.auth.login.config (#17671) * [PR-18411](https://github.com/apache/kafka/pull/18411) - KAFKA-18436: Revert Multiversioning Changes from 4.0 release. (#18411) * [PR-18364](https://github.com/apache/kafka/pull/18364) - KAFKA-18384 Remove ZkAlterPartitionManager (#18364) * [PR-17946](https://github.com/apache/kafka/pull/17946) - KAFKA-10790: Add deadlock detection to producer#flush (#17946) * [PR-18399](https://github.com/apache/kafka/pull/18399) - KAFKA-18412: Remove EmbeddedZookeeper (#18399) * [PR-18352](https://github.com/apache/kafka/pull/18352) - KAFKA-18368 Remove TestUtils#MockZkConnect and remove zkConnect from TestUtils#createBrokerConfig (#18352) * [PR-18396](https://github.com/apache/kafka/pull/18396) - KAFKA-18303; Update ShareCoordinator to use new record format (#18396) * [PR-18370](https://github.com/apache/kafka/pull/18370) - KAFKA-18388 test-kraft-server-start.sh should use log4j2.yaml (#18370) * [PR-17742](https://github.com/apache/kafka/pull/17742) - KAFKA-18419: KIP-891 Connect Multiversion Support (Transformation and Predicate Changes) (#17742) * [PR-18355](https://github.com/apache/kafka/pull/18355) - KAFKA-18374 Remove EncryptingPasswordEncoder, CipherParamsEncoder, GcmParamsEncoder, IvParamsEncoder, and the unused static variables in PasswordEncoder (#18355) * [PR-18379](https://github.com/apache/kafka/pull/18379) - KAFKA-18311: Configuring changelog topics (2/N) (#18379) * [PR-18318](https://github.com/apache/kafka/pull/18318) - KAFKA-18307: Don’t report on disabled/removed tests (#18318) * [PR-17801](https://github.com/apache/kafka/pull/17801) - KAFKA-17278; Add KRaft RPC compatibility tests (#17801) * [PR-18377](https://github.com/apache/kafka/pull/18377) - KAFKA-17539: Application metrics extension for share consumer (#18377) * [PR-18384](https://github.com/apache/kafka/pull/18384) - KAFKA-17616: Remove KafkaServer (#18384) * [PR-18268](https://github.com/apache/kafka/pull/18268) - KAFKA-18311: Add internal datastructure for configuring topologies (1/N) (#18268) * [PR-18343](https://github.com/apache/kafka/pull/18343) - 
KAFKA-18358: Replace Deprecated $buildDir variable in build.gradle (#18343) * [PR-18353](https://github.com/apache/kafka/pull/18353) - KAFKA-18365 Remove zookeeper.connect in Test (#18353) * [PR-18373](https://github.com/apache/kafka/pull/18373) - Use instanceof pattern to avoid explicit cast (#18373) * [PR-18270](https://github.com/apache/kafka/pull/18270) - KAFKA-18319: Add task assignor interfaces (#18270) * [PR-18259](https://github.com/apache/kafka/pull/18259) - KAFKA-18273: KIP-1099 verbose display share group options (#18259) * [PR-18363](https://github.com/apache/kafka/pull/18363) - KAFKA-18367 Remove ZkConfigManager (#18363) * [PR-18351](https://github.com/apache/kafka/pull/18351) - KAFKA-18347 Add tools-log4j2.yaml to config and remove unused tools-log4j.properties from config (#18351) * [PR-18359](https://github.com/apache/kafka/pull/18359) - KAFKA-18375 Update the LICENSE-binary (#18359) * [PR-18345](https://github.com/apache/kafka/pull/18345) - KAFKA-18026: KIP-1112, configure all StoreBuilder & StoreFactory layers (#18345) * [PR-18232](https://github.com/apache/kafka/pull/18232) - KAFKA-12469: Deprecated and corrected topic metrics for consumer (KIP-1109) (#18232) * [PR-18254](https://github.com/apache/kafka/pull/18254) - KAFKA-17421 Add integration tests for ConsumerRecord#leaderEpoch (#18254) * [PR-18347](https://github.com/apache/kafka/pull/18347) - KAFKA-18361 Remove PasswordEncoderConfigs (#18347) * [PR-18271](https://github.com/apache/kafka/pull/18271) - KAFKA-17615 Remove KafkaServer from tests (#18271) * [PR-18308](https://github.com/apache/kafka/pull/18308) - KAFKA-18280 fix e2e TestSecurityRollingUpgrade.test_rolling_upgrade_sasl_mechanism_phase_one (#18308) * [PR-18327](https://github.com/apache/kafka/pull/18327) - KAFKA-18313 Fix to Kraft or remove tests associated with Zk Broker config in SocketServerTest and ReplicaFetcherThreadTest (#18327) * [PR-18279](https://github.com/apache/kafka/pull/18279) - KAFKA-18316 Fix to Kraft or remove tests associated with Zk Broker config in ConnectionQuotasTest (#18279) * [PR-18185](https://github.com/apache/kafka/pull/18185) - KAFKA-18243 Fix compatibility of Loggers class between log4j and log4j2 (#18185) * [PR-18269](https://github.com/apache/kafka/pull/18269) - KAFKA-18315 Fix to Kraft or remove tests associated with Zk Broker config in DynamicBrokerConfigTest, ReplicaManagerTest, DescribeTopicPartitionsRequestHandlerTest, KafkaConfigTest (#18269) * [PR-18338](https://github.com/apache/kafka/pull/18338) - KAFKA-18354 Use log4j2 APIs to refactor LogCaptureAppender (#18338) * [PR-18309](https://github.com/apache/kafka/pull/18309) - KAFKA-18314 Fix to Kraft or remove tests associated with Zk Broker config in KafkaApisTest (#18309) * [PR-18344](https://github.com/apache/kafka/pull/18344) - KAFKA-18359 Set zkConnect to null in LocalLeaderEndPointTest, HighwatermarkPersistenceTest, IsrExpirationTest, ReplicaManagerQuotasTest, OffsetsForLeaderEpochTest (#18344) * [PR-18101](https://github.com/apache/kafka/pull/18101) - KAFKA-18135: ShareConsumer HB UnsupportedVersion msg mixed with Consumer HB (#18101) * [PR-18283](https://github.com/apache/kafka/pull/18283) - KAFKA-18317 Remove zookeeper.connect from RemoteLogManagerTest (#18283) * [PR-18295](https://github.com/apache/kafka/pull/18295) - KAFKA-18339: Remove raw unversioned direct SASL protocol (KIP-896) (#18295) * [PR-18313](https://github.com/apache/kafka/pull/18313) - KAFKA-18272: Deprecated protocol api usage should be logged at info level (#18313) *
[PR-18282](https://github.com/apache/kafka/pull/18282) - KAFKA-18295 Remove deprecated function Partitioner#onNewBatch (#18282) * [PR-18317](https://github.com/apache/kafka/pull/18317) - KAFKA-18348 Remove the deprecated MockConsumer#setException (#18317) * [PR-18324](https://github.com/apache/kafka/pull/18324) - KAFKA-18352: Add back DeleteGroups v0, it was incorrectly tagged as deprecated (#18324) * [PR-18310](https://github.com/apache/kafka/pull/18310) - KAFKA-18274 Failed to restart controller in testing due to closed socket channel [1/2] (#18310) * [PR-18250](https://github.com/apache/kafka/pull/18250) - KAFKA-18093 Remove deprecated DeleteTopicsResult#values (#18250) * [PR-18312](https://github.com/apache/kafka/pull/18312) - KAFKA-18343: Use java_pids to implement pids (#18312) * [PR-18294](https://github.com/apache/kafka/pull/18294) - KAFKA-18338 add log4j.yaml to test-common-api and remove unused log4j.properties from test-common (#18294) * [PR-18306](https://github.com/apache/kafka/pull/18306) - KAFKA-18342 Use File.exist instead of File.exists to ensure the Vagrantfile works with Ruby 3.2+ (#18306) * [PR-18246](https://github.com/apache/kafka/pull/18246) - KAFKA-18290 Remove deprecated methods of FeatureUpdate (#18246) * [PR-18255](https://github.com/apache/kafka/pull/18255) - KAFKA-18289 Remove deprecated methods of DescribeTopicsResult (#18255) * [PR-18265](https://github.com/apache/kafka/pull/18265) - KAFKA-18291 Remove deprecated methods of ListConsumerGroupOffsetsOptions (#18265) * [PR-18223](https://github.com/apache/kafka/pull/18223) - KAFKA-18278: Correct name and description for run-gradle step (#18223) * [PR-18267](https://github.com/apache/kafka/pull/18267) - KAFKA-17393: Remove log.message.format.version/message.format.version (KIP-724) (#18267) * [PR-18132](https://github.com/apache/kafka/pull/18132) - KAFKA-17705: Add Transactions V2 system tests and mark as production ready (#18132) * [PR-18291](https://github.com/apache/kafka/pull/18291) - KAFKA-18269: Remove deprecated protocol APIs support (KIP-896, KIP-724) (#18291) * [PR-18218](https://github.com/apache/kafka/pull/18218) - KAFKA-18269: Remove deprecated protocol APIs support (KIP-896, KIP-724) (#18218) * [PR-18288](https://github.com/apache/kafka/pull/18288) - KAFKA-18334: Produce v4-v6 should be undeprecated (#18288) * [PR-18262](https://github.com/apache/kafka/pull/18262) - KAFKA-18270: FindCoordinator v0 incorrectly tagged as deprecated (#18262) * [PR-18221](https://github.com/apache/kafka/pull/18221) - KAFKA-18270: SaslHandshake v0 incorrectly tagged as deprecated (#18221) * [PR-18249](https://github.com/apache/kafka/pull/18249) - KAFKA-13722: code cleanup after deprecated StateStore.init() was removed (#18249) * [PR-17687](https://github.com/apache/kafka/pull/17687) - KAFKA-15370: Support Participation in 2PC (KIP-939) (1/N) (#17687) * [PR-18285](https://github.com/apache/kafka/pull/18285) - KAFKA-18312: Added entityType: topicName to SubscribedTopicNames in ShareGroupHeartbeatRequest.json (#18285) * [PR-18261](https://github.com/apache/kafka/pull/18261) - KAFKA-18301; Make coordinator records first class citizen (#18261) * [PR-18204](https://github.com/apache/kafka/pull/18204) - KAFKA-18262 Remove DefaultPartitioner and UniformStickyPartitioner (#18204) * [PR-18257](https://github.com/apache/kafka/pull/18257) - KAFKA-18296 Remove deprecated KafkaBasedLog constructor (#18257) * [PR-18238](https://github.com/apache/kafka/pull/18238) - KAFKA-12829: Remove old Processor and ProcessorSupplier interfaces (#18238) *
[PR-18245](https://github.com/apache/kafka/pull/18245) - KAFKA-18292 Remove deprecated methods of UpdateFeaturesOptions (#18245) * [PR-18154](https://github.com/apache/kafka/pull/18154) - KAFKA-12829: Remove deprecated Topology#addProcessor of old Processor API (#18154) * [PR-18136](https://github.com/apache/kafka/pull/18136) - KAFKA-18207: Serde for handling transaction records (#18136) * [PR-18243](https://github.com/apache/kafka/pull/18243) - KAFKA-13722: Refactor Kafka Streams store interfaces (#18243) * [PR-18241](https://github.com/apache/kafka/pull/18241) - KAFKA-17131: Refactor TimeDefinitions (#18241) * [PR-18228](https://github.com/apache/kafka/pull/18228) - KAFKA-18284: Add group coordinator records for Streams rebalance protocol (#18228) * [PR-18242](https://github.com/apache/kafka/pull/18242) - KAFKA-13722: Refactor SerdeGetter (#18242) * [PR-18176](https://github.com/apache/kafka/pull/18176) - KAFKA-18227: Ensure v2 partitions are not added to last transaction during upgrade (#18176) * [PR-18251](https://github.com/apache/kafka/pull/18251) - Add IT for share consumer with duration-based offset auto reset (#18251) * [PR-18230](https://github.com/apache/kafka/pull/18230) - KAFKA-18283: Add StreamsGroupDescribe RPC definitions (#18230) * [PR-18260](https://github.com/apache/kafka/pull/18260) - KAFKA-18294 Remove deprecated SourceTask#commitRecord (#18260) * [PR-18211](https://github.com/apache/kafka/pull/18211) - KAFKA-18264 Remove NotLeaderForPartitionException (#18211) * [PR-18248](https://github.com/apache/kafka/pull/18248) - KAFKA-18094 Remove deprecated TopicListing(String, Boolean) (#18248) * [PR-18227](https://github.com/apache/kafka/pull/18227) - KAFKA-18282: Add StreamsGroupHeartbeat RPC definitions (#18227) * [PR-18205](https://github.com/apache/kafka/pull/18205) - KAFKA-18026: transition KTable#filter impl to use processor wrapper (#18205) * [PR-18244](https://github.com/apache/kafka/pull/18244) - KAFKA-18293 Remove org.apache.kafka.common.security.oauthbearer.secured.OAuthBearerLoginCallbackHandler and org.apache.kafka.common.security.oauthbearer.secured.OAuthBearerValidatorCallbackHandler (#18244) * [PR-18234](https://github.com/apache/kafka/pull/18234) - KAFKA-17960; PlaintextAdminIntegrationTest.testConsumerGroups fails with CONSUMER group protocol (#18234) * [PR-18144](https://github.com/apache/kafka/pull/18144) - KAFKA-18200; Handle empty batches in coordinator runtime (#18144) * [PR-18180](https://github.com/apache/kafka/pull/18180) - KAFKA-18237: Upgrade system tests from using 3.7.1 to 3.7.2 (#18180) * [PR-18210](https://github.com/apache/kafka/pull/18210) - KAFKA-18259: Documentation for consumer auto.offset.reset contains invalid HTML (#18210) * [PR-18207](https://github.com/apache/kafka/pull/18207) - KAFKA-18263; Group lock must be acquired when reverting static membership rejoin (#18207) * [PR-18190](https://github.com/apache/kafka/pull/18190) - KAFKA-18244: Fix empty SHA on “Pull Request Labeled” workflow (#18190) * [PR-18166](https://github.com/apache/kafka/pull/18166) - KAFKA-18226: Disable CustomQuotaCallbackTest and remove isKRaftTest (#18166)

#### Kafka

* [PR-20633](https://github.com/apache/kafka/pull/20633) - KAFKA-19748: Fix metrics leak in Kafka Streams (#20633) * [PR-20618](https://github.com/apache/kafka/pull/20618) - KAFKA-19690 Add epoch check before verification guard check to prevent unexpected fatal error (#20618) * [PR-20583](https://github.com/apache/kafka/pull/20583) - [MINOR] Cleaning ignored streams test (#20583) *
[PR-20604](https://github.com/apache/kafka/pull/20604) - KAFKA-19719 --no-initial-controllers should not assume kraft.version=1 (#20604) * [PR-19961](https://github.com/apache/kafka/pull/19961) - KAFKA-19390: Call safeForceUnmap() in AbstractIndex.resize() on Linux to prevent stale mmap of index files (#19961) * [PR-20591](https://github.com/apache/kafka/pull/20591) - KAFKA-19732, KAFKA-19716: Clear out coordinator snapshots periodically while loading (#20591) * [PR-20581](https://github.com/apache/kafka/pull/20581) - KAFKA-19546: Rebalance should be triggered by subscription change during group protocol downgrade (#20581) * [PR-20519](https://github.com/apache/kafka/pull/20519) - KAFKA-19695: Fix bug in redundant offset calculation. (#20516) (#20519) * [PR-20512](https://github.com/apache/kafka/pull/20512) - KAFKA-19679: Fix NoSuchElementException in oldest open iterator metric (#20512) * [PR-20470](https://github.com/apache/kafka/pull/20470) - KAFKA-19668: processValue() must be declared as value-changing operation (#20470) * [13f70256](https://github.com/apache/kafka/commit/13f70256db3c994c590e5d262a7cc50b9e973204) - Bump version to 4.1.0 * [70dd1ca2](https://github.com/apache/kafka/commit/70dd1ca2cab81f78c68782659db1d8453b1de5d6) - Revert “Bump version to 4.1.0” * [PR-20405](https://github.com/apache/kafka/pull/20405) - KAFKA-19642 Replace dynamicPerBrokerConfigs with dynamicDefaultConfigs (#20405) * [PR-1777](https://github.com/confluentinc/kafka/pull/1777) - KSECURITY-2558: Bump jetty to version 12.0.25 in 4.1 * [PR-20070](https://github.com/apache/kafka/pull/20070) - KAFKA-19429: Deflake streams_smoke_test, again (#20070) * [PR-20398](https://github.com/apache/kafka/pull/20398) - Revert “KAFKA-13722: remove usage of old ProcessorContext (#18292)” (#20398) * [PR-1765](https://github.com/confluentinc/kafka/pull/1765) - DPA-1801 Add run_tags to worker-ami and aws-packer * [PR-1746](https://github.com/confluentinc/kafka/pull/1746) - Change ci_tools import path * [23b64404](https://github.com/apache/kafka/commit/23b64404ae7ba98d89a2d456991abaf2f32af35f) - Bump version to 4.1.0 * [6340f437](https://github.com/apache/kafka/commit/6340f437cd2d15be4180febb9505437266080002) - Revert “Bump version to 4.1.0” * [de16dd10](https://github.com/apache/kafka/commit/de16dd103af93bb68a329987ff19469941f85cbc) - KAFKA-19581: Temporary fix for Streams system tests * [PR-20269](https://github.com/apache/kafka/pull/20269) - KAFKA-19576 Fix typo in state-change log filename after rotate (#20269) * [PR-20274](https://github.com/apache/kafka/pull/20274) - KAFKA-19529: State updater sensor names should be unique (#20262) (#20274) * [PR-1708](https://github.com/confluentinc/kafka/pull/1708) - DPA-1675: In case of infra failure in ccs-kafka tag that as infra failure in testbreak * [PR-20165](https://github.com/apache/kafka/pull/20165) - KAFKA-19501 Update OpenJDK base image from buster to bullseye (#20165) * [e14d849c](https://github.com/apache/kafka/commit/e14d849cbf8836cc9e4a592342baf19a1fbd93c9) - Bump version to 4.1.0 * [PR-20200](https://github.com/apache/kafka/pull/20200) - KAFKA-19522: avoid electing fenced lastKnownLeader (#20200) * [PR-20196](https://github.com/apache/kafka/pull/20196) - KAFKA-19520 Bump Commons-Lang for CVE-2025-48924 (#20196) * [PR-20040](https://github.com/apache/kafka/pull/20040) - KAFKA-19427 Allow the coordinator to grow its buffer dynamically (#20040) * [PR-20166](https://github.com/apache/kafka/pull/20166) - KAFKA-19504: Remove unused metrics reporter initialization in
KafkaAdminClient (#20166) * [PR-20151](https://github.com/apache/kafka/pull/20151) - KAFKA-19495: Update config for native image (v4.1.0) (#20151) * [610f0765](https://github.com/apache/kafka/commit/610f076542e1ac177c4b97ea7d6ca1335f9a3065) - Bump version to 4.1.0 * [PR-1684](https://github.com/confluentinc/kafka/pull/1684) - DPA-1489 migrate from vagrant to terraform * [PR-1693](https://github.com/confluentinc/kafka/pull/1693) - Revert “Temporarily disable artifact publishing for the 4.1 branch.” * [57e81f20](https://github.com/apache/kafka/commit/57e81f201055b58f94febf0509bfc8acba632854) - Bump version to 4.1.0 * [PR-20071](https://github.com/apache/kafka/pull/20071) - KAFKA-19184: Add documentation for upgrading the kraft version (#20071) * [PR-20116](https://github.com/apache/kafka/pull/20116) - KAFKA-19444: Add back JoinGroup v0 & v1 (#20116) * [PR-19964](https://github.com/apache/kafka/pull/19964) - KAFKA-19397: Ensure consistent metadata usage in produce request and response (#19964) * [PR-19971](https://github.com/apache/kafka/pull/19971) - KAFKA-19042 Move ProducerSendWhileDeletionTest to client-integration-tests module (#19971) * [PR-20100](https://github.com/apache/kafka/pull/20100) - KAFKA-19453: Ignore group not found in share group record replay (#20100) * [PR-20025](https://github.com/apache/kafka/pull/20025) - KAFKA-19152: Add top-level documentation for OAuth flows (#20025) * [PR-20029](https://github.com/apache/kafka/pull/20029) - KAFKA-19379: Basic upgrade guide for KIP-1071 EA (#20029) * [PR-20062](https://github.com/apache/kafka/pull/20062) - KAFKA-19445: Fix coordinator runtime metrics sharing sensors (#20062) * [PR-19704](https://github.com/apache/kafka/pull/19704) - KAFKA-19246; OffsetFetch API does not return group level errors correctly with version 1 (#19704) * [PR-19985](https://github.com/apache/kafka/pull/19985) - KAFKA-19414: Remove 2PC public APIs from 4.1 until release (KIP-939) (#19985) * [PR-1672](https://github.com/confluentinc/kafka/pull/1672) - DPA-1593 exclude newly added files to fix build * [PR-1663](https://github.com/confluentinc/kafka/pull/1663) - DPA-1593 add cloudwatch metrics to view cpu, memory and disk usage * [PR-20022](https://github.com/apache/kafka/pull/20022) - KAFKA-19398: (De)Register oldest-iterator-open-since-ms metric dynamically (#20022) * [PR-20033](https://github.com/apache/kafka/pull/20033) - KAFKA-19383: Handle the deleted topics when applying ClearElrRecord (#20033) * [PR-19745](https://github.com/apache/kafka/pull/19745) - KAFKA-19294: Fix BrokerLifecycleManager RPC timeouts (#19745) * [PR-19974](https://github.com/apache/kafka/pull/19974) - KAFKA-19411: Fix deleteAcls bug which allows more deletions than max records per user op (#19974) * [PR-19972](https://github.com/apache/kafka/pull/19972) - KAFKA-19407 Fix potential IllegalStateException when appending to timeIndex (#19972) * [PR-1659](https://github.com/confluentinc/kafka/pull/1659) - Reapply “KAFKA-18296 Remove deprecated KafkaBasedLog constructor (#18 * [PR-20019](https://github.com/apache/kafka/pull/20019) - KAFKA-19429: Deflake streams_smoke_test (#20019) * [PR-19999](https://github.com/apache/kafka/pull/19999) - KAFKA-19421: Deflake streams_broker_down_resilience_test (#19999) * [PR-20004](https://github.com/apache/kafka/pull/20004) - KAFKA-19422: Deflake streams_application_upgrade_test (#20004) * [PR-20005](https://github.com/apache/kafka/pull/20005) - KAFKA-19423: Deflake streams_broker_bounce_test (#20005) * 
[PR-19983](https://github.com/apache/kafka/pull/19983) - KAFKA-19356: Prevent new consumer fetch assigned partitions not in explicit subscription (#19983) * [PR-19917](https://github.com/apache/kafka/pull/19917) - KAFKA-19297: Refactor AsyncKafkaConsumer’s use of Java Streams APIs in critical sections (#19917) * [PR-19981](https://github.com/apache/kafka/pull/19981) - KAFKA-19413: Extended AuthorizerIntegrationTest to cover StreamsGroupDescribe (#19981) * [PR-19978](https://github.com/apache/kafka/pull/19978) - KAFKA-19412: Extended AuthorizerIntegrationTest to cover StreamsGroupHeartbeat (#19978) * [PR-19976](https://github.com/apache/kafka/pull/19976) - KAFKA-19367: Follow up bug fix (#19976) * [PR-19800](https://github.com/apache/kafka/pull/19800) - KAFKA-14145; Faster KRaft HWM replication (#19800) * [PR-1655](https://github.com/confluentinc/kafka/pull/1655) - Add back deprecated constructors in KafkaBasedLog * [PR-19938](https://github.com/apache/kafka/pull/19938) - KAFKA-19153: Add OAuth integration tests (#19938) * [PR-19910](https://github.com/apache/kafka/pull/19910) - KAFKA-19367: Fix InitProducerId with TV2 double-increments epoch if ongoing transaction is aborted (#19910) * [PR-19814](https://github.com/apache/kafka/pull/19814) - KAFKA-18117; KAFKA-18729: Use assigned topic IDs to avoid full metadata requests on broker-side regex (#19814) * [PR-19904](https://github.com/apache/kafka/pull/19904) - KAFKA-18961: Time-based refresh for server-side RE2J regex (#19904) * [PR-19939](https://github.com/apache/kafka/pull/19939) - KAFKA-19359: force bump commons-beanutils for CVE-2025-48734 (#19939) * [b311ac7d](https://github.com/apache/kafka/commit/b311ac7dd5bce649fd5bd83a948f95c8c468a9aa) - Temporarily disable artifact publishing for the 4.1 branch. * [PR-19607](https://github.com/apache/kafka/pull/19607) - KAFKA-19221 Propagate IOException on LogSegment#close (#19607) * [PR-19928](https://github.com/apache/kafka/pull/19928) - KAFKA-19389: Fix memory consumption for completed share fetch requests (#19928) * [PR-19895](https://github.com/apache/kafka/pull/19895) - KAFKA-19244: Add support for kafka-streams-groups.sh options (delete offsets) [4/N] (#19895) * [PR-19908](https://github.com/apache/kafka/pull/19908) - KAFKA-19376: Throw an error message if any unsupported feature is used with KIP-1071 (#19908) * [PR-19936](https://github.com/apache/kafka/pull/19936) - KAFKA-19392 Fix metadata.log.segment.ms not being applied (#19936) * [PR-19919](https://github.com/apache/kafka/pull/19919) - KAFKA-19382:Upgrade junit from 5.10 to 5.13 (#19919) * [PR-19929](https://github.com/apache/kafka/pull/19929) - KAFKA-18486 Remove becomeLeaderOrFollower from readFromLogWithOffsetOutOfRange and other related methods. 
(#19929) * [PR-19931](https://github.com/apache/kafka/pull/19931) - KAFKA-19283: Update transaction exception handling documentation (#19931) * [PR-19832](https://github.com/apache/kafka/pull/19832) - KAFKA-19271: allow intercepting internal method call (#19832) * [PR-19918](https://github.com/apache/kafka/pull/19918) - KAFKA-19386: Correcting ExpirationReaper thread names from Purgatory (#19918) * [PR-19817](https://github.com/apache/kafka/pull/19817) - KAFKA-19334 MetadataShell execution unintentionally deletes lock file (#19817) * [PR-19922](https://github.com/apache/kafka/pull/19922) - KAFKA-18486 Remove ReplicaManager#becomeLeaderOrFollower from testReplicaAlterLogDirs (#19922) * [PR-19827](https://github.com/apache/kafka/pull/19827) - KAFKA-19042 Move PlaintextConsumerSubscriptionTest to client-integration-tests module (#19827) * [PR-19883](https://github.com/apache/kafka/pull/19883) - KAFKA-18486 Update testExceptionWhenUnverifiedTransactionHasMultipleProducerIds (#19883) * [PR-19890](https://github.com/apache/kafka/pull/19890) - KAFKA-18486 Update activeProducerState with KRaft mechanism in ReplicaManagerTest (#19890) * [PR-19879](https://github.com/apache/kafka/pull/19879) - KAFKA-14895 [1/N] Move AddPartitionsToTxnManager files to java (#19879) * [PR-19915](https://github.com/apache/kafka/pull/19915) - KAFKA-19295: Remove AsyncKafkaConsumer event ID generation (#19915) * [PR-19902](https://github.com/apache/kafka/pull/19902) - KAFKA-18202: Add rejection for non-zero sequences in TV2 (KIP-890) (#19902) * [PR-15913](https://github.com/apache/kafka/pull/15913) - KAFKA-19309: Add transaction client template code in kafka examples (#15913) * [PR-19900](https://github.com/apache/kafka/pull/19900) - KAFKA-19369: Add group.share.assignors config and integration test (#19900) * [PR-19815](https://github.com/apache/kafka/pull/19815) - KAFKA-19290: Exploit mapKey optimisation in protocol requests and responses (wip) (#19815) * [PR-19889](https://github.com/apache/kafka/pull/19889) - KAFKA-18913: Start state updater in task manager (#19889) * [PR-19907](https://github.com/apache/kafka/pull/19907) - KAFKA-19370: Create JMH benchmark for share group assignor (#19907) * [PR-19773](https://github.com/apache/kafka/pull/19773) - KAFKA-19042 Move PlaintextConsumerAssignTest to clients-integration-tests module (#19773) * [PR-18739](https://github.com/apache/kafka/pull/18739) - KAFKA-16505: Add source raw key and value (#18739) * [PR-19903](https://github.com/apache/kafka/pull/19903) - KAFKA-19373 Fix protocol name comparison (#19903) * [PR-18325](https://github.com/apache/kafka/pull/18325) - KAFKA-19248: Multiversioning in Kafka Connect - Plugin Loading Isolation Tests (#18325) * [PR-19844](https://github.com/apache/kafka/pull/19844) - KAFKA-18042: Reject the produce request with lower producer epoch early (KIP-890) (#19844) * [PR-19901](https://github.com/apache/kafka/pull/19901) - KAFKA-19372: StreamsGroup not subscribed to a topic when empty (#19901) * [PR-19722](https://github.com/apache/kafka/pull/19722) - KAFKA-19044: Handle tasks that are not present in the current topology (#19722) * [PR-19856](https://github.com/apache/kafka/pull/19856) - KAFKA-17747: [7/N] Add consumer group integration test for rack aware assignment (#19856) * [PR-19898](https://github.com/apache/kafka/pull/19898) - KAFKA-19347 Deduplicate ACLs when creating (#19898) * [PR-19872](https://github.com/apache/kafka/pull/19872) - KAFKA-19328: SharePartitionManagerTest testMultipleConcurrentShareFetches doAnswer chaining needs
verification (#19872) * [PR-19758](https://github.com/apache/kafka/pull/19758) - KAFKA-19244: Add support for kafka-streams-groups.sh options (delete all groups) [2/N] (#19758) * [PR-19754](https://github.com/apache/kafka/pull/19754) - KAFKA-18573: Add support for OAuth jwt-bearer grant type (#19754) * [PR-19656](https://github.com/apache/kafka/pull/19656) - KAFKA-19250 : txnProducer.abortTransaction() API should not return abortable exception (#19656) * [PR-19522](https://github.com/apache/kafka/pull/19522) - KAFKA-19176: Update Transactional producer to translate retriable into abortable exceptions (#19522) * [PR-19796](https://github.com/apache/kafka/pull/19796) - KAFKA-17747: [6/N] Replace subscription metadata with metadata hash in share group (#19796) * [PR-19802](https://github.com/apache/kafka/pull/19802) - KAFKA-17747: [5/N] Replace subscription metadata with metadata hash in stream group (#19802) * [PR-19861](https://github.com/apache/kafka/pull/19861) - KAFKA-19338: Error on read/write of uninitialized share part. (#19861) * [PR-19878](https://github.com/apache/kafka/pull/19878) - KAFKA-19358: Updated share_consumer_test.py tests to use set_group_offset_reset_strategy (#19878) * [PR-19849](https://github.com/apache/kafka/pull/19849) - KAFKA-19349 Move CreateTopicsRequestWithPolicyTest to clients-integration-tests (#19849) * [PR-19877](https://github.com/apache/kafka/pull/19877) - KAFKA-18904: [4/N] Add ListClientMetricsResources metric if request is v0 ListConfigResources (#19877) * [PR-19811](https://github.com/apache/kafka/pull/19811) - KAFKA-19320: Added share_consume_bench_test.py system tests (#19811) * [PR-19831](https://github.com/apache/kafka/pull/19831) - KAFKA-16894: share.version becomes stable feature for preview (#19831) * [PR-19836](https://github.com/apache/kafka/pull/19836) - KAFKA-19321: Added share_consumer_performance.py and related system tests (#19836) * [PR-19866](https://github.com/apache/kafka/pull/19866) - KAFKA-19355 Remove interBrokerListenerName from ClusterControlManager (#19866) * [PR-19327](https://github.com/apache/kafka/pull/19327) - KAFKA-19053 Remove FetchResponse#of which is not used in production … (#19327) * [PR-19728](https://github.com/apache/kafka/pull/19728) - KAFKA-19284 Add documentation to clarify the behavior of null values for all partitionsToOffsetAndMetadata methods. 
(#19728) * [PR-19864](https://github.com/apache/kafka/pull/19864) - KAFKA-19311 Document commitAsync behavioral differences between Classic and Async Consumer (#19864) * [PR-19685](https://github.com/apache/kafka/pull/19685) - KAFKA-19042 Move GroupAuthorizerIntegrationTest to clients-integration-tests module (#19685) * [PR-19651](https://github.com/apache/kafka/pull/19651) - KAFKA-19042 Move BaseConsumerTest, SaslPlainPlaintextConsumerTest to client-integration-tests module (#19651) * [PR-19846](https://github.com/apache/kafka/pull/19846) - KAFKA-19346: Move LogReadResult to server module (#19846) * [PR-19855](https://github.com/apache/kafka/pull/19855) - KAFKA-19351: AsyncConsumer#commitAsync should copy the input offsets (#19855) * [PR-19810](https://github.com/apache/kafka/pull/19810) - KAFKA-19042 move ConsumerWithLegacyMessageFormatIntegrationTest to clients-integration-tests module (#19810) * [PR-19808](https://github.com/apache/kafka/pull/19808) - KAFKA-18904: kafka-configs.sh return resource doesn’t exist message [3/N] (#19808) * [PR-19714](https://github.com/apache/kafka/pull/19714) - KAFKA-19082:[4/4] Complete Txn Client Side Changes (KIP-939) (#19714) * [PR-19404](https://github.com/apache/kafka/pull/19404) - KAFKA-6629: parameterise SegmentedCacheFunctionTest for session key schemas (#19404) * [PR-19843](https://github.com/apache/kafka/pull/19843) - KAFKA-19337: Write state writes snapshot for higher state epoch. (#19843) * [PR-19741](https://github.com/apache/kafka/pull/19741) - KAFKA-19056 Rewrite EndToEndClusterIdTest in Java and move it to the server module (#19741) * [PR-19774](https://github.com/apache/kafka/pull/19774) - KAFKA-19316: added share_group_command_test.py system tests (#19774) * [PR-19838](https://github.com/apache/kafka/pull/19838) - KAFKA-19344: Replace desc.assignablePartitions with spec.isPartitionAssignable. 
(#19838) * [PR-19840](https://github.com/apache/kafka/pull/19840) - KAFKA-19347 Don’t update timeline data structures in createAcls (#19840) * [PR-19826](https://github.com/apache/kafka/pull/19826) - KAFKA-19342: Authorization tests for alter share-group offsets (#19826) * [PR-19818](https://github.com/apache/kafka/pull/19818) - KAFKA-19335: Membership managers send negative epoch in JOINING (#19818) * [PR-19778](https://github.com/apache/kafka/pull/19778) - KAFKA-19285: Added more tests in SharePartitionManagerTest (#19778) * [PR-19823](https://github.com/apache/kafka/pull/19823) - KAFKA-19310: (MINOR) Missing mocks for DelayedShareFetchTest tests related to Memory Records slicing (#19823) * [PR-19761](https://github.com/apache/kafka/pull/19761) - KAFKA-17747: [4/N] Replace subscription metadata with metadata hash in consumer group (#19761) * [PR-19835](https://github.com/apache/kafka/pull/19835) - KAFKA-19336 Upgrade Jackson to 2.19.0 (#19835) * [PR-19744](https://github.com/apache/kafka/pull/19744) - KAFKA-19154; Offset Fetch API should return INVALID_OFFSET if requested topic id does not match persisted one (#19744) * [PR-19812](https://github.com/apache/kafka/pull/19812) - KAFKA-19330 Change MockSerializer/Deserializer to use String serializer instead of byte[] (#19812) * [PR-19790](https://github.com/apache/kafka/pull/19790) - KAFKA-18687: Setting the subscriptionMetadata during conversion to consumer group (#19790) * [PR-19786](https://github.com/apache/kafka/pull/19786) - KAFKA-19268 Missing mocks for SharePartitionManagerTest tests (#19786) * [PR-19798](https://github.com/apache/kafka/pull/19798) - KAFKA-19322 Remove the DelayedOperation constructor that accepts an external lock (#19798) * [PR-19779](https://github.com/apache/kafka/pull/19779) - KAFKA-19300 AsyncConsumer#unsubscribe always timeout due to GroupAuthorizationException (#19779) * [PR-19093](https://github.com/apache/kafka/pull/19093) - KAFKA-18424: Consider splitting PlaintextAdminIntegrationTest#testConsumerGroups (#19093) * [PR-19371](https://github.com/apache/kafka/pull/19371) - KAFKA-19080 The constraint on segment.ms is not enforced at topic level (#19371) * [PR-19681](https://github.com/apache/kafka/pull/19681) - KAFKA-19034 [1/N] Rewrite RemoteTopicCrudTest by ClusterTest and move it to storage module (#19681) * [PR-19759](https://github.com/apache/kafka/pull/19759) - KAFKA-19312 Avoiding concurrent execution of onComplete and tryComplete (#19759) * [PR-19767](https://github.com/apache/kafka/pull/19767) - KAFKA-19313 Replace LogOffsetMetadata#UNIFIED_LOG_UNKNOWN_OFFSET by UnifiedLog.UNKNOWN_OFFSET (#19767) * [PR-19747](https://github.com/apache/kafka/pull/19747) - KAFKA-18345; Wait the entire election timeout on election loss (#19747) * [PR-19687](https://github.com/apache/kafka/pull/19687) - KAFKA-19260 Move LoggingController to server module (#19687) * [PR-18929](https://github.com/apache/kafka/pull/18929) - KAFKA-16717 [2/N]: Add AdminClient.alterShareGroupOffsets (#18929) * [PR-19729](https://github.com/apache/kafka/pull/19729) - KAFKA-19069 DumpLogSegments does not dump the LEADER_CHANGE record (#19729) * [PR-19781](https://github.com/apache/kafka/pull/19781) - KAFKA-19204: Add timestamp to share state metadata init maps [1/N] (#19781) * [PR-19582](https://github.com/apache/kafka/pull/19582) - KAFKA-19042 Move PlaintextConsumerPollTest to client-integration-tests module (#19582) * [PR-19763](https://github.com/apache/kafka/pull/19763) - KAFKA-19314 Remove unnecessary code of closing snapshotWriter (#19763) 
* [PR-19743](https://github.com/apache/kafka/pull/19743) - KAFKA-18904: Add Admin#listConfigResources [2/N] (#19743) * [PR-19757](https://github.com/apache/kafka/pull/19757) - KAFKA-19291: Increase the timeout of remote storage share fetch requests in purgatory (#19757) * [PR-18951](https://github.com/apache/kafka/pull/18951) - KAFKA-4650: Add unit tests for GraphNode class (#18951) * [PR-19749](https://github.com/apache/kafka/pull/19749) - KAFKA-19287 document all group coordinator metrics (#19749) * [PR-19731](https://github.com/apache/kafka/pull/19731) - KAFKA-18783 : Extend InvalidConfigurationException related exceptions (#19731) * [PR-19658](https://github.com/apache/kafka/pull/19658) - KAFKA-18345; Prevent livelocked elections (#19658) * [PR-1627](https://github.com/confluentinc/kafka/pull/1627) - Trigger cp-jar-build to verify CP packaging in after_pipeline job * [PR-19755](https://github.com/apache/kafka/pull/19755) - KAFKA-19302 Move ReplicaState and Replica to server module (#19755) * [PR-19389](https://github.com/apache/kafka/pull/19389) - KAFKA-19042 Move PlaintextConsumerCommitTest to client-integration-tests module (#19389) * [PR-19611](https://github.com/apache/kafka/pull/19611) - KAFKA-17747: [3/N] Get rid of TopicMetadata in SubscribedTopicDescriberImpl (#19611) * [PR-19700](https://github.com/apache/kafka/pull/19700) - KAFKA-19202: Enable KIP-1071 in streams_eos_test (#19700) * [PR-19717](https://github.com/apache/kafka/pull/19717) - KAFKA-19280: Fix NoSuchElementException in UnifiedLog (#19717) * [PR-19691](https://github.com/apache/kafka/pull/19691) - KAFKA-19256: Only send IQ metadata on assignment changes (#19691) * [PR-19708](https://github.com/apache/kafka/pull/19708) - KAFKA-19226: Added test_console_share_consumer.py (#19708) * [PR-19683](https://github.com/apache/kafka/pull/19683) - KAFKA-19141; Persist topic id in OffsetCommit record (#19683) * [PR-19697](https://github.com/apache/kafka/pull/19697) - KAFKA-19271: Add internal ConsumerWrapper (#19697) * [PR-1625](https://github.com/confluentinc/kafka/pull/1625) - Increase timeout for Connect tests * [PR-19734](https://github.com/apache/kafka/pull/19734) - KAFKA-19217: Fix ShareConsumerTest.testComplexConsumer flakiness. (#19734) * [PR-19507](https://github.com/apache/kafka/pull/19507) - KAFKA-19171: Kafka Streams crashes with UnsupportedOperationException (#19507) * [PR-19709](https://github.com/apache/kafka/pull/19709) - KAFKA-19267 the min version used by ListOffsetsRequest should be 1 rather than 0 (#19709) * [PR-19580](https://github.com/apache/kafka/pull/19580) - KAFKA-19208: KStream-GlobalKTable join should not drop left-null-key record (#19580) * [PR-19493](https://github.com/apache/kafka/pull/19493) - KAFKA-18904: [1/N] Change ListClientMetricsResources API to ListConfigResources (#19493) * [PR-19713](https://github.com/apache/kafka/pull/19713) - KAFKA-19274; Group Coordinator Shards are not unloaded when \_\_consumer_offsets topic is deleted (#19713) * [PR-19701](https://github.com/apache/kafka/pull/19701) - KAFKA-19231-1: Handle fetch request when share session cache is full (#19701) * [PR-19721](https://github.com/apache/kafka/pull/19721) - KAFKA-19281: Add share enable flag to periodic jobs. (#19721) * [PR-19523](https://github.com/apache/kafka/pull/19523) - KAFKA-17747: [2/N] Add compute topic and group hash (#19523) * [PR-19698](https://github.com/apache/kafka/pull/19698) - KAFKA-19269 Unexpected error .. 
should not happen when the delete.topic.enable is false (#19698) * [PR-19718](https://github.com/apache/kafka/pull/19718) - KAFKA-19270: Remove Optional from ClusterInstance#controllerListenerName() return type (#19718) * [PR-19539](https://github.com/apache/kafka/pull/19539) - KAFKA-19082:[3/4] Add prepare txn method (KIP-939) (#19539) * [PR-19586](https://github.com/apache/kafka/pull/19586) - KAFKA-18666: Controller-side monitoring for broker shutdown and startup (#19586) * [PR-19635](https://github.com/apache/kafka/pull/19635) - KAFKA-19234: broker should return UNAUTHORIZATION error for non-existing topic in produce request (#19635) * [PR-19702](https://github.com/apache/kafka/pull/19702) - KAFKA-19273 Ensure the delete policy is configured when the tiered storage is enabled (#19702) * [PR-19553](https://github.com/apache/kafka/pull/19553) - KAFKA-19091 Fix race condition in DelayedFutureTest (#19553) * [PR-19666](https://github.com/apache/kafka/pull/19666) - KAFKA-19116, KAFKA-19258: Handling share group member change events (#19666) * [PR-19569](https://github.com/apache/kafka/pull/19569) - KAFKA-19206 ConsumerNetworkThread.cleanup() throws NullPointerException if initializeResources() previously failed (#19569) * [PR-19712](https://github.com/apache/kafka/pull/19712) - KAFKA-19275 client-state and thread-state metrics are always “Unavailable” (#19712) * [PR-19630](https://github.com/apache/kafka/pull/19630) - KAFKA-19145 Move LeaderEndPoint to Server module (#19630) * [PR-19622](https://github.com/apache/kafka/pull/19622) - KAFKA-18847: Refactor OAuth layer to improve reusability 1/N (#19622) * [PR-19677](https://github.com/apache/kafka/pull/19677) - KAFKA-18688: Fix uniform homogeneous assignor stability (#19677) * [PR-19659](https://github.com/apache/kafka/pull/19659) - KAFKA-19253: Improve metadata handling for share version using feature listeners (1/N) (#19659) * [PR-19559](https://github.com/apache/kafka/pull/19559) - KAFKA-19201: Handle deletion of user topics part of share partitions. (#19559) * [PR-19515](https://github.com/apache/kafka/pull/19515) - KAFKA-14691; Add TopicId to OffsetFetch API (#19515) * [PR-19705](https://github.com/apache/kafka/pull/19705) - KAFKA-19245: Updated default locks config for share group (#19705) * [PR-19496](https://github.com/apache/kafka/pull/19496) - KAFKA-19163: Avoid deleting groups with pending transactional offsets (#19496) * [PR-1554](https://github.com/confluentinc/kafka/pull/1554) - Chore: update repo by service bot * [PR-19644](https://github.com/apache/kafka/pull/19644) - KAFKA-18905; Disable idempotent producer to remove test flakiness (#19644) * [PR-19631](https://github.com/apache/kafka/pull/19631) - KAFKA-19242: Fix commit bugs caused by race condition during rebalancing. 
(#19631) * [PR-19497](https://github.com/apache/kafka/pull/19497) - KAFKA-19160;KAFKA-19164; Improve performance of fetching stable offsets (#19497) * [PR-19633](https://github.com/apache/kafka/pull/19633) - KAFKA-18695 Remove quorum=kraft and kip932 from all integration tests (#19633) * [PR-19673](https://github.com/apache/kafka/pull/19673) - KAFKA-19264 Remove fallback for thread pool sizes in RemoteLogManagerConfig (#19673) * [PR-19346](https://github.com/apache/kafka/pull/19346) - KAFKA-19068 Eliminate the duplicate type check in creating ControlRecord (#19346) * [PR-19543](https://github.com/apache/kafka/pull/19543) - KAFKA-19109 Don’t print null in kafka-metadata-quorum describe status (#19543) * [PR-19650](https://github.com/apache/kafka/pull/19650) - KAFKA-19220 Add tests to ensure the internal configs don’t return by public APIs by default (#19650) * [PR-1623](https://github.com/confluentinc/kafka/pull/1623) - KBROKER-295: Ignore failing quota_test * [PR-1622](https://github.com/confluentinc/kafka/pull/1622) - KBROKER-295: Ignore failing quota_test * [PR-19508](https://github.com/apache/kafka/pull/19508) - KAFKA-17897: Deprecate Admin.listConsumerGroups [2/N] (#19508) * [PR-19657](https://github.com/apache/kafka/pull/19657) - KAFKA-19209: Clarify index.interval.bytes impact on offset and time index (#19657) * [PR-18391](https://github.com/apache/kafka/pull/18391) - KAFKA-18115; Fix for loading big files while performing load tests (#18391) * [PR-19608](https://github.com/apache/kafka/pull/19608) - KAFKA-19182 Move SchedulerTest to server module (#19608) * [PR-19568](https://github.com/apache/kafka/pull/19568) - KAFKA-19087 Move TransactionState to transaction-coordinator module (#19568) * [PR-19581](https://github.com/apache/kafka/pull/19581) - KAFKA-18855 Slice API for MemoryRecords (#19581) * [PR-19590](https://github.com/apache/kafka/pull/19590) - KAFKA-19212: Correct the unclean leader election metric calculation (#19590) * [PR-19609](https://github.com/apache/kafka/pull/19609) - KAFKA-19214: Clean up use of Optionals in RequestManagers.entries() (#19609) * [PR-19640](https://github.com/apache/kafka/pull/19640) - KAFKA-19241: Updated tests in ShareFetchAcknowledgeRequestTest to reuse the socket for subsequent requests (#19640) * [PR-19598](https://github.com/apache/kafka/pull/19598) - KAFKA-19215: Handle share partition fetch lock cleanly using tokens (#19598) * [PR-19625](https://github.com/apache/kafka/pull/19625) - KAFKA-19202: Enable KIP-1071 in streams_standby_replica_test.py (#19625) * [PR-19602](https://github.com/apache/kafka/pull/19602) - KAFKA-19218: Add missing leader epoch to share group state summary response (#19602) * [PR-19574](https://github.com/apache/kafka/pull/19574) - KAFKA-19207 Move ForwardingManagerMetrics and ForwardingManagerMetricsTest to server module (#19574) * [PR-19528](https://github.com/apache/kafka/pull/19528) - KAFKA-19170 Move MetricsDuringTopicCreationDeletionTest to client-integration-tests module (#19528) * [PR-19612](https://github.com/apache/kafka/pull/19612) - KAFKA-19227: Piggybacked share fetch acknowledgements performance issue (#19612) * [PR-19639](https://github.com/apache/kafka/pull/19639) - KAFKA-19216: Eliminate flakiness in kafka.server.share.SharePartitionTest (#19639) * [PR-19592](https://github.com/apache/kafka/pull/19592) - KAFKA-19133: Support fetching for multiple remote fetch topic partitions in a single share fetch request (#19592) * [PR-19641](https://github.com/apache/kafka/pull/19641) - KAFKA-19240 Move 
MetadataVersionIntegrationTest to clients-integration-tests module (#19641) * [PR-19619](https://github.com/apache/kafka/pull/19619) - KAFKA-19232: Handle Share session limit reached exception in clients. (#19619) * [PR-19629](https://github.com/apache/kafka/pull/19629) - KAFKA-19131: Adjust remote storage reader thread maximum pool size to avoid illegal argument (#19629) * [PR-19393](https://github.com/apache/kafka/pull/19393) - KAFKA-19060 Documented null edge cases in the Clients API JavaDoc (#19393) * [PR-19578](https://github.com/apache/kafka/pull/19578) - KAFKA-19205: inconsistent result of beginningOffsets/endoffset between classic and async consumer with 0 timeout (#19578) * [PR-19571](https://github.com/apache/kafka/pull/19571) - KAFKA-18267 Add unit tests for CloseOptions (#19571) * [PR-19603](https://github.com/apache/kafka/pull/19603) - KAFKA-19204: Allow persister retry of initializing topics. (#19603) * [PR-1621](https://github.com/confluentinc/kafka/pull/1621) - Dexcom fix master * [PR-1620](https://github.com/confluentinc/kafka/pull/1620) - Dexcom fix 4.0 * [PR-19475](https://github.com/apache/kafka/pull/19475) - KAFKA-19146 Merge OffsetAndEpoch from raft to server-common (#19475) * [PR-19606](https://github.com/apache/kafka/pull/19606) - KAFKA-16894 Correct definition of ShareVersion (#19606) * [PR-19355](https://github.com/apache/kafka/pull/19355) - KAFKA-19073 add transactional ID pattern filter to ListTransactions (#19355) * [PR-19430](https://github.com/apache/kafka/pull/19430) - KAFKA-17541:[1/2] Improve handling of delivery count (#19430) * [PR-19329](https://github.com/apache/kafka/pull/19329) - KAFKA-19015: Remove share session from cache on share consumer connection drop (#19329) * [PR-19540](https://github.com/apache/kafka/pull/19540) - KAFKA-19169: Enhance AuthorizerIntegrationTest for share group APIs (#19540) * [PR-19604](https://github.com/apache/kafka/pull/19604) - KAFKA-19202: Enable KIP-1071 in streams_relational_smoke_test (#19604) * [PR-19587](https://github.com/apache/kafka/pull/19587) - KAFKA-16718-4/n: ShareGroupCommand changes for DeleteShareGroupOffsets admin call (#19587) * [PR-19601](https://github.com/apache/kafka/pull/19601) - KAFKA-19210: resolved the flakiness in testShareGroupHeartbeatInitializeOnPartitionUpdate (#19601) * [PR-19542](https://github.com/apache/kafka/pull/19542) - KAFKA-16894: Exploit share feature [3/N] (#19542) * [PR-19594](https://github.com/apache/kafka/pull/19594) - KAFKA-19202: Enable KIP-1071 in streams_broker_down_resilience_test (#19594) * [PR-19509](https://github.com/apache/kafka/pull/19509) - KAFKA-19173: Add Feature for “streams” group (#19509) * [PR-19191](https://github.com/apache/kafka/pull/19191) - KAFKA-18760: Deprecate Optional and return String from public Endpoint#listener (#19191) * [PR-19519](https://github.com/apache/kafka/pull/19519) - KAFKA-19139 Plugin#wrapInstance should use LinkedHashMap instead of Map (#19519) * [PR-19588](https://github.com/apache/kafka/pull/19588) - KAFKA-19135 Migrate initial IQ support for KIP-1071 from feature branch to trunk (#19588) * [PR-15968](https://github.com/apache/kafka/pull/15968) - KAFKA-10551: Add topic id support to produce request and response (#15968) * [PR-19470](https://github.com/apache/kafka/pull/19470) - KAFKA-19082: [2/4] Add preparedTxnState class to Kafka Producer (KIP-939) (#19470) * [PR-19584](https://github.com/apache/kafka/pull/19584) - KAFKA-19202: Enable KIP-1071 in streams_broker_bounce_test.py (#19584) * 
[PR-19593](https://github.com/apache/kafka/pull/19593) - KAFKA-19181-2: Increased offsets.commit.timeout.ms value as a temporary solution for the system test test_broker_failure failure (#19593) * [PR-19560](https://github.com/apache/kafka/pull/19560) - KAFKA-19202: Enable KIP-1071 in streams_smoke_test.py (#19560) * [PR-19555](https://github.com/apache/kafka/pull/19555) - KAFKA-19195: Only send the right group ID subset to each GC shard (#19555) * [PR-19535](https://github.com/apache/kafka/pull/19535) - KAFKA-19183 Replace Pool with ConcurrentHashMap (#19535) * [PR-19529](https://github.com/apache/kafka/pull/19529) - KAFKA-19178 Replace Vector by ArrayList for PluginClassLoader#getResources (#19529) * [PR-19478](https://github.com/apache/kafka/pull/19478) - KAFKA-16718-3/n: Added the ShareGroupStatePartitionMetadata record during deletion of share group offsets (#19478) * [PR-19520](https://github.com/apache/kafka/pull/19520) - KAFKA-19042 Move PlaintextConsumerFetchTest to client-integration-tests module (#19520) * [PR-19532](https://github.com/apache/kafka/pull/19532) - KAFKA-19131: Adjust remote storage reader thread maximum pool size to avoid illegal argument (#19532) * [PR-19504](https://github.com/apache/kafka/pull/19504) - KAFKA-17747: [1/N] Add MetadataHash field to Consumer/Share/StreamGroupMetadataValue (#19504) * [PR-19544](https://github.com/apache/kafka/pull/19544) - KAFKA-19190: Handle shutdown application correctly (#19544) * [PR-19552](https://github.com/apache/kafka/pull/19552) - KAFKA-19198: Resolve NPE when topic assigned in share group is deleted (#19552) * [PR-19548](https://github.com/apache/kafka/pull/19548) - KAFKA-19195: Only send the right group ID subset to each GC shard (#19548) * [PR-19450](https://github.com/apache/kafka/pull/19450) - KAFKA-19128: Kafka Streams should not get offsets when close dirty (#19450) * [PR-19545](https://github.com/apache/kafka/pull/19545) - KAFKA-19192; Old bootstrap checkpoint files cause problems updated servers (#19545) * [PR-17988](https://github.com/apache/kafka/pull/17988) - KAFKA-18988: Connect Multiversion Support (Updates to status and metrics) (#17988) * [PR-19429](https://github.com/apache/kafka/pull/19429) - KAFKA-19082: [1/4] Add client config for enable2PC and overloaded initProducerId (KIP-939) (#19429) * [PR-19536](https://github.com/apache/kafka/pull/19536) - KAFKA-18889: Make records in ShareFetchResponse non-nullable (#19536) * [PR-19457](https://github.com/apache/kafka/pull/19457) - KAFKA-19110: Add missing unit test for Streams-consumer integration (#19457) * [PR-19440](https://github.com/apache/kafka/pull/19440) - KAFKA-15767 Refactor TransactionManager to avoid use of ThreadLocal (#19440) * [PR-19453](https://github.com/apache/kafka/pull/19453) - KAFKA-19124: Follow up on code improvements (#19453) * [PR-19443](https://github.com/apache/kafka/pull/19443) - KAFKA-18170: Add scheduled job to snapshot cold share partitions. 
(#19443) * [PR-19505](https://github.com/apache/kafka/pull/19505) - KAFKA-19156: Streamlined share group configs, with usage in ShareSessionCache (#19505) * [PR-19541](https://github.com/apache/kafka/pull/19541) - KAFKA-19181: removed assertions in test_share_multiple_partitions as a result of change in assignor algorithm (#19541) * [PR-19461](https://github.com/apache/kafka/pull/19461) - KAFKA-14690; Add TopicId to OffsetCommit API (#19461) * [PR-19416](https://github.com/apache/kafka/pull/19416) - KAFKA-16538; Enable upgrading kraft version for existing clusters (#19416) * [PR-18673](https://github.com/apache/kafka/pull/18673) - KAFKA-18572: Update Kafka Streams metric documentation (#18673) * [PR-19500](https://github.com/apache/kafka/pull/19500) - KAFKA-19159: Removed time based evictions for share sessions (#19500) * [PR-19378](https://github.com/apache/kafka/pull/19378) - KAFKA-19057: Stabilize KIP-932 RPCs for AK 4.1 (#19378) * [PR-19518](https://github.com/apache/kafka/pull/19518) - KAFKA-19166: Fix RC tag in release script (#19518) * [PR-19525](https://github.com/apache/kafka/pull/19525) - KAFKA-19179: remove the dot from thread_dump_url (#19525) * [PR-19437](https://github.com/apache/kafka/pull/19437) - KAFKA-19019: Add support for remote storage fetch for share groups (#19437) * [PR-17099](https://github.com/apache/kafka/pull/17099) - KAFKA-8830 make Record Headers available in onAcknowledgement (#17099) * [PR-19526](https://github.com/apache/kafka/pull/19526) - KAFKA-19180 Fix the hanging testPendingTaskSize (#19526) * [PR-19302](https://github.com/apache/kafka/pull/19302) - KAFKA-14487: Move LogManager static methods/fields to storage module (#19302) * [PR-19487](https://github.com/apache/kafka/pull/19487) - KAFKA-18854 remove DynamicConfig inner class (#19487) * [PR-19286](https://github.com/apache/kafka/pull/19286) - KAFKA-18891: Add KIP-877 support to RemoteLogMetadataManager and RemoteStorageManager (#19286) * [PR-19462](https://github.com/apache/kafka/pull/19462) - KAFKA-17184: Fix the error thrown while accessing the RemoteIndexCache (#19462) * [PR-19477](https://github.com/apache/kafka/pull/19477) - KAFKA-17897 Deprecate Admin.listConsumerGroups (#19477) * [PR-18926](https://github.com/apache/kafka/pull/18926) - KAFKA-18332 fix ClassDataAbstractionCoupling problem in KafkaRaftClientTest(1/2) (#18926) * [PR-19465](https://github.com/apache/kafka/pull/19465) - KAFKA-19136 Move metadata-related configs from KRaftConfigs to MetadataLogConfig (#19465) * [PR-19503](https://github.com/apache/kafka/pull/19503) - KAFKA-19157: added group.share.max.share.sessions config (#19503) * [PR-1614](https://github.com/confluentinc/kafka/pull/1614) - CONFLUENT: Fix tools-log4j files in the scripts * [PR-1613](https://github.com/confluentinc/kafka/pull/1613) - CONFLUENT: Fix tools-log4j files names in the scripts * [PR-19474](https://github.com/apache/kafka/pull/19474) - KAFKA-14523: Move kafka.log.remote classes to storage (#19474) * [PR-19491](https://github.com/apache/kafka/pull/19491) - KAFKA-19162: Topology metadata contains non-deterministically ordered topic configs (#19491) * [PR-19394](https://github.com/apache/kafka/pull/19394) - KAFKA-19054: StreamThread exception handling with SHUTDOWN_APPLICATION may trigger a tight loop with MANY logs (#19394) * [PR-19454](https://github.com/apache/kafka/pull/19454) - KAFKA-19130: Do not add fenced brokers to BrokerRegistrationTracker on startup (#19454) * [PR-19460](https://github.com/apache/kafka/pull/19460) - KAFKA-19002 Rewrite 
ListOffsetsIntegrationTest and move it to clients-integration-test (#19460) * [PR-19492](https://github.com/apache/kafka/pull/19492) - KAFKA-19158: Add SHARE_SESSION_LIMIT_REACHED error code (#19492) * [PR-19488](https://github.com/apache/kafka/pull/19488) - KAFKA-19147: Start authorizer before group coordinator to ensure coordinator authorizes regex topics (#19488) * [PR-19298](https://github.com/apache/kafka/pull/19298) - KAFKA-19042 Move PlaintextConsumerCallbackTest to client-integration-tests module (#19298) * [PR-19472](https://github.com/apache/kafka/pull/19472) - KAFKA-13610: Deprecate log.cleaner.enable configuration (#19472) * [PR-19050](https://github.com/apache/kafka/pull/19050) - KAFKA-18888: Add KIP-877 support to Authorizer (#19050) * [PR-19420](https://github.com/apache/kafka/pull/19420) - KAFKA-18983 Ensure all README.md(s) are mentioned by the root README.md (#19420) * [PR-19433](https://github.com/apache/kafka/pull/19433) - KAFKA-18288: Add support kafka-streams-groups.sh --describe (#19433) * [PR-19464](https://github.com/apache/kafka/pull/19464) - KAFKA-19137 Use StandardCharsets.UTF_8 instead of StandardCharsets.UTF_8.name() (#19464) * [PR-19364](https://github.com/apache/kafka/pull/19364) - KAFKA-15370: ACL changes to support 2PC (KIP-939) (#19364) * [PR-19417](https://github.com/apache/kafka/pull/19417) - KAFKA-18900: Implement share.acknowledgement.mode to choose acknowledgement mode (#19417) * [PR-19463](https://github.com/apache/kafka/pull/19463) - KAFKA-18629: Account for existing deleting topics in share group delete. (#19463) * [PR-19319](https://github.com/apache/kafka/pull/19319) - KAFKA-19042 Move ProducerCompressionTest, ProducerFailureHandlingTest, and ProducerIdExpirationTest to client-integration-tests module (#19319) * [PR-19391](https://github.com/apache/kafka/pull/19391) - KAFKA-14523: Decouple RemoteLogManager and Partition (#19391) * [PR-19469](https://github.com/apache/kafka/pull/19469) - KAFKA-18172 Move RemoteIndexCacheTest to the storage module (#19469) * [PR-19426](https://github.com/apache/kafka/pull/19426) - KAFKA-19119 Move ApiVersionManager/SimpleApiVersionManager to server (#19426) * [PR-19439](https://github.com/apache/kafka/pull/19439) - KAFKA-19121 Move AddPartitionsToTxnConfig and TransactionStateManagerConfig out of KafkaConfig (#19439) * [PR-19424](https://github.com/apache/kafka/pull/19424) - KAFKA-19113: Migrate DelegationTokenManager to server module (#19424) * [PR-19347](https://github.com/apache/kafka/pull/19347) - KAFKA-19027 Replace ConsumerGroupCommandTestUtils#generator by ClusterTestDefaults (#19347) * [PR-19431](https://github.com/apache/kafka/pull/19431) - KAFKA-19115: Utilize initialized topics info to verify delete share group offsets (#19431) * [PR-19419](https://github.com/apache/kafka/pull/19419) - KAFKA-15371 MetadataShell is stuck when bootstrapping (#19419) * [PR-19345](https://github.com/apache/kafka/pull/19345) - KAFKA-19071: Fix doc for remote.storage.enable (#19345) * [PR-19374](https://github.com/apache/kafka/pull/19374) - KAFKA-19030 Remove metricNamePrefix from RequestChannel (#19374) * [PR-19387](https://github.com/apache/kafka/pull/19387) - KAFKA-14485: Move LogCleaner to storage module (#19387) * [PR-19293](https://github.com/apache/kafka/pull/19293) - KAFKA-16894: Define feature to enable share groups (#19293) * [PR-19436](https://github.com/apache/kafka/pull/19436) - KAFKA-19127: Integration test for altering and describing streams group configs (#19436) * 
[PR-19441](https://github.com/apache/kafka/pull/19441) - KAFKA-19103 Remove OffsetConfig (#19441) * [PR-19438](https://github.com/apache/kafka/pull/19438) - KAFKA-19118: Enable KIP-1071 in StandbyTaskCreationIntegrationTest (#19438) * [PR-19423](https://github.com/apache/kafka/pull/19423) - KAFKA-18286: Implement support for streams groups in kafka-groups.sh (#19423) * [PR-19289](https://github.com/apache/kafka/pull/19289) - KAFKA-19042 Move TransactionsWithMaxInFlightOneTest to client-integration-tests module (#19289) * [PR-19410](https://github.com/apache/kafka/pull/19410) - KAFKA-19101 Remove ControllerMutationQuotaManager#throttleTimeMs unused parameter (#19410) * [PR-19354](https://github.com/apache/kafka/pull/19354) - KAFKA-18782: Extend ApplicationRecoverableException related exceptions (#19354) * [PR-19363](https://github.com/apache/kafka/pull/19363) - KAFKA-18629: Utilize share group partition metadata for delete group. (#19363) * [PR-19421](https://github.com/apache/kafka/pull/19421) - KAFKA-19124: Use consumer background event queue for Streams events (#19421) * [PR-19425](https://github.com/apache/kafka/pull/19425) - KAFKA-19118: Enable KIP-1071 in InternalTopicIntegrationTest (#19425) * [PR-19432](https://github.com/apache/kafka/pull/19432) - KAFKA-18170: Add create and write timestamp fields in share snapshot [1/N] (#19432) * [PR-19167](https://github.com/apache/kafka/pull/19167) - KAFKA-18935: Ensure brokers do not return null records in FetchResponse (#19167) * [PR-19261](https://github.com/apache/kafka/pull/19261) - KAFKA-16729: Support isolation level for share consumer (#19261) * [PR-19188](https://github.com/apache/kafka/pull/19188) - KAFKA-18962: Fix onBatchRestored call in GlobalStateManagerImpl (#19188) * [83f6a1d7](https://github.com/apache/kafka/commit/83f6a1d7e6dfce4a78e1192a8fecf523b39ddaab) - KAFKA-18991; Missing change for cherry-pick * [PR-19223](https://github.com/apache/kafka/pull/19223) - KAFKA-18991: FetcherThread should match leader epochs between fetch request and fetch state (#19223) * [PR-19422](https://github.com/apache/kafka/pull/19422) - KAFKA-18287: Add support for kafka-streams-groups.sh --list (#19422) * [PR-18852](https://github.com/apache/kafka/pull/18852) - KAFKA-18723; Better handle invalid records during replication (#18852) * [PR-19377](https://github.com/apache/kafka/pull/19377) - KAFKA-19037: Integrate consumer-side code with Streams (#19377) * [PR-1611](https://github.com/confluentinc/kafka/pull/1611) - Fix build failure (#1582) * [PR-19390](https://github.com/apache/kafka/pull/19390) - KAFKA-19090: Move DelayedFuture and DelayedFuturePurgatory to server module (#19390) * [PR-19213](https://github.com/apache/kafka/pull/19213) - KAFKA-18984: Reset interval.ms By Using kafka-client-metrics.sh (#19213) * [PR-18976](https://github.com/apache/kafka/pull/18976) - KAFKA-16718-2/n: KafkaAdminClient and GroupCoordinator implementation for DeleteShareGroupOffsets RPC (#18976) * [PR-19384](https://github.com/apache/kafka/pull/19384) - KAFKA-19093 Change the “Handler on Broker” to “Handler on Controller” for controller server (#19384) * [PR-19296](https://github.com/apache/kafka/pull/19296) - KAFKA-19047: Allow quickly re-registering brokers that are in controlled shutdown (#19296) * [PR-19413](https://github.com/apache/kafka/pull/19413) - KAFKA-19099 Remove GroupSyncKey, GroupJoinKey, and MemberKey (#19413) * [PR-19068](https://github.com/apache/kafka/pull/19068) - KAFKA-18892: Add KIP-877 support for ClientQuotaCallback (#19068) * 
[PR-19406](https://github.com/apache/kafka/pull/19406) - KAFKA-19100: Use ProcessRole instead of String in AclApis (#19406) * [PR-19398](https://github.com/apache/kafka/pull/19398) - KAFKA-19098 Remove lastOffset from PartitionResponse (#19398) * [PR-19359](https://github.com/apache/kafka/pull/19359) - KAFKA-19077: Propagate shutdownRequested field (#19359) * [PR-19219](https://github.com/apache/kafka/pull/19219) - KAFKA-19001: Use streams group-level configurations in heartbeat (#19219) * [PR-19369](https://github.com/apache/kafka/pull/19369) - KAFKA-19084: Port KAFKA-16224, KAFKA-16764 for ShareConsumers (#19369) * [PR-19392](https://github.com/apache/kafka/pull/19392) - KAFKA-19076 replace String by Supplier for UnifiedLog#maybeHandleIOException (#19392) * [PR-17614](https://github.com/apache/kafka/pull/17614) - KAFKA-16758: Extend Consumer#close with an option to leave the group or not (#17614) * [PR-19303](https://github.com/apache/kafka/pull/19303) - KAFKA-16407: Fix foreign key INNER join on change of FK from/to a null value (#19303) * [PR-19242](https://github.com/apache/kafka/pull/19242) - KAFKA-19013 Reformat PR body to 72 characters (#19242) * [PR-19288](https://github.com/apache/kafka/pull/19288) - KAFKA-19042 Move TransactionsExpirationTest to client-integration-tests module (#19288) * [PR-19357](https://github.com/apache/kafka/pull/19357) - KAFKA-19074 Remove the cached responseData from ShareFetchResponse (#19357) * [PR-19285](https://github.com/apache/kafka/pull/19285) - KAFKA-14523: Move DelayedRemoteListOffsets to the storage module (#19285) * [PR-19323](https://github.com/apache/kafka/pull/19323) - KAFKA-13747: refactor TopologyTest to test different store type parametrized (#19323) * [PR-19370](https://github.com/apache/kafka/pull/19370) - KAFKA-19085: SharePartitionManagerTest testMultipleConcurrentShareFetches throws silent exception and works incorrectly (#19370) * [PR-19218](https://github.com/apache/kafka/pull/19218) - KAFKA-7952: use in memory stores for KTable test (#19218) * [PR-19328](https://github.com/apache/kafka/pull/19328) - KAFKA-18761: [2/N] List share group offsets with state and auth (#19328) * [PR-19005](https://github.com/apache/kafka/pull/19005) - KAFKA-18713: Fix FK Left-Join result race condition (#19005) * [PR-19269](https://github.com/apache/kafka/pull/19269) - KAFKA-18067: Add a flag to disable producer reset during active task creator shutting down (#19269) * [PR-19320](https://github.com/apache/kafka/pull/19320) - KAFKA-19055 Cleanup the 0.10.x information from clients module (#19320) * [PR-19348](https://github.com/apache/kafka/pull/19348) - KAFKA-19075: Included other share group dynamic configs in extractShareGroupConfigMap method in ShareGroupConfig (#19348) * [PR-19333](https://github.com/apache/kafka/pull/19333) - KAFKA-19064: Handle exceptions from deferred events in coordinator (#19333) * [PR-19339](https://github.com/apache/kafka/pull/19339) - KAFKA-18827: Incorporate initializing topics in share group heartbeat [4/N] (#19339) * [PR-19111](https://github.com/apache/kafka/pull/19111) - KAFKA-18923: resource leak in RSM fetchIndex inputStream (#19111) * [PR-19317](https://github.com/apache/kafka/pull/19317) - KAFKA-18949 add consumer protocol to testDeleteRecordsAfterCorruptRecords (#19317) * [PR-19324](https://github.com/apache/kafka/pull/19324) - KAFKA-19058 Running the streams/streams-scala module tests produces a streams-scala.log (#19324) * [PR-19276](https://github.com/apache/kafka/pull/19276) - KAFKA-19003: Add 
forceTerminateTransaction command to CLI tools (#19276) * [PR-19226](https://github.com/apache/kafka/pull/19226) - KAFKA-19004 Move DelayedDeleteRecords to server-common module (#19226) * [PR-18953](https://github.com/apache/kafka/pull/18953) - KAFKA-18826: Add global thread metrics (#18953) * [PR-19343](https://github.com/apache/kafka/pull/19343) - KAFKA-19016: Updated the retention behaviour of share groups to retain them forever (#19343) * [PR-19344](https://github.com/apache/kafka/pull/19344) - KAFKA-19072: Add system test for ELR (#19344) * [PR-19331](https://github.com/apache/kafka/pull/19331) - KAFKA-15931: Cancel RemoteLogReader gracefully (#19331) * [PR-19338](https://github.com/apache/kafka/pull/19338) - KAFKA-18796-2: Corrected the check for acquisition lock timeout in Sh… (#19338) * [PR-19335](https://github.com/apache/kafka/pull/19335) - KAFKA-19062: Port changes from KAFKA-18645 to share-consumers (#19335) * [PR-19334](https://github.com/apache/kafka/pull/19334) - KAFKA-19018,KAFKA-19063: Implement maxRecords and acquisition lock timeout in share fetch request and response resp. (#19334) * [PR-18383](https://github.com/apache/kafka/pull/18383) - KAFKA-18613: Unit tests for usage of incorrect RPCs (#18383) * [PR-19189](https://github.com/apache/kafka/pull/19189) - KAFKA-18613: Improve test coverage for missing topics (#19189) * [PR-18510](https://github.com/apache/kafka/pull/18510) - KAFKA-18409: ShareGroupStateMessageFormatter should use CoordinatorRecordMessageFormatter (#18510) * [PR-19274](https://github.com/apache/kafka/pull/19274) - KAFKA-18959 increase the num_workers from 9 to 14 (#19274) * [PR-19283](https://github.com/apache/kafka/pull/19283) - KAFKA-19042 Move ConsumerTopicCreationTest to client-integration-tests module (#19283) * [PR-18297](https://github.com/apache/kafka/pull/18297) - KAFKA-16260: Deprecate window.size.ms and window.inner.class.serde in StreamsConfig (#18297) * [PR-19114](https://github.com/apache/kafka/pull/19114) - KAFKA-18613: Add StreamsGroupHeartbeat handler in the group coordinator (#19114) * [PR-19270](https://github.com/apache/kafka/pull/19270) - KAFKA-19032 Remove TestInfoUtils.TestWithParameterizedQuorumAndGroupProtocolNames (#19270) * [PR-19268](https://github.com/apache/kafka/pull/19268) - KAFKA-19005 improve the documentation of DescribeTopicsOptions#partitionSizeLimitPerResponse (#19268) * [PR-19282](https://github.com/apache/kafka/pull/19282) - KAFKA-19036 Rewrite LogAppendTimeTest and move it to storage module (#19282) * [PR-19299](https://github.com/apache/kafka/pull/19299) - KAFKA-19049 Remove the @ExtendWith(ClusterTestExtensions.class) from code base (#19299) * [PR-19076](https://github.com/apache/kafka/pull/19076) - KAFKA-17830 Cover unit tests for TBRLMM init failure scenarios (#19076) * [PR-19216](https://github.com/apache/kafka/pull/19216) - KAFKA-14486 Move LogCleanerManager to storage module (#19216) * [PR-19026](https://github.com/apache/kafka/pull/19026) - KAFKA-18827: Initialize share group state group coordinator impl. [3/N] (#19026) * [PR-18695](https://github.com/apache/kafka/pull/18695) - KAFKA-18616; Refactor Tools’s ApiMessageFormatter (#18695) * [PR-19192](https://github.com/apache/kafka/pull/19192) - KAFKA-18899: Improve handling of timeouts for commitAsync() in ShareConsumer. 
(#19192) * [PR-19154](https://github.com/apache/kafka/pull/19154) - KAFKA-18914 Migrate ConsumerRebootstrapTest to use new test infra (#19154) * [PR-19233](https://github.com/apache/kafka/pull/19233) - KAFKA-18736: Add pollOnClose() and maximumTimeToWait() (#19233) * [PR-19230](https://github.com/apache/kafka/pull/19230) - KAFKA-18736: Handle errors in the Streams group heartbeat request manager (#19230) * [PR-18711](https://github.com/apache/kafka/pull/18711) - KAFKA-18576 Convert ConfigType to Enum (#18711) * [PR-19247](https://github.com/apache/kafka/pull/19247) - KAFKA-18796: Added more information to error message when assertion fails for acquisition lock timeout (#19247) * [PR-19046](https://github.com/apache/kafka/pull/19046) - KAFKA-18276 Migrate ProducerRebootstrapTest to new test infra (#19046) * [PR-19207](https://github.com/apache/kafka/pull/19207) - KAFKA-18980 OffsetMetadataManager#cleanupExpiredOffsets should record the number of records rather than topic partitions (#19207) * [PR-19227](https://github.com/apache/kafka/pull/19227) - KAFKA-18999 Remove BrokerMetadata (#19227) * [PR-19256](https://github.com/apache/kafka/pull/19256) - KAFKA-17806 remove this-escape suppress warnings in AclCommand (#19256) * [PR-19255](https://github.com/apache/kafka/pull/19255) - KAFKA-18329; [3/3] Delete old group coordinator (KIP-848) (#19255) * [PR-19064](https://github.com/apache/kafka/pull/19064) - KAFKA-18893: Add KIP-877 support to ReplicaSelector (#19064) * [PR-19251](https://github.com/apache/kafka/pull/19251) - KAFKA-18329; [2/3] Delete old group coordinator (KIP-848) (#19251) * [PR-19254](https://github.com/apache/kafka/pull/19254) - KAFKA-19017: Changed consumer.config to command-config in verifiable_share_consumer.py (#19254) * [PR-19246](https://github.com/apache/kafka/pull/19246) - KAFKA-15599 Move MetadataLogConfig to raft module (#19246) * [PR-19180](https://github.com/apache/kafka/pull/19180) - KAFKA-18954: Add ELR election rate metric (#19180) * [PR-19197](https://github.com/apache/kafka/pull/19197) - KAFKA-15931: Cancel RemoteLogReader gracefully (#19197) * [PR-19243](https://github.com/apache/kafka/pull/19243) - KAFKA-18329; [1/3] Delete old group coordinator (KIP-848) (#19243) * [PR-19174](https://github.com/apache/kafka/pull/19174) - KAFKA-18946 Move BrokerReconfigurable and DynamicProducerStateManagerConfig to server module (#19174) * [PR-18842](https://github.com/apache/kafka/pull/18842) - KAFKA-806 Index may not always observe log.index.interval.bytes (#18842) * [PR-19214](https://github.com/apache/kafka/pull/19214) - KAFKA-18989 Optimize FileRecord#searchForOffsetWithSize (#19214) * [PR-19183](https://github.com/apache/kafka/pull/19183) - KAFKA-18819 StreamsGroupHeartbeat API and StreamsGroupDescribe API check topic describe (#19183) * [PR-19217](https://github.com/apache/kafka/pull/19217) - KAFKA-18975 Move clients-integration-test out of core module (#19217) * [PR-19193](https://github.com/apache/kafka/pull/19193) - KAFKA-18953: [1/N] Add broker side handling for 2 PC (KIP-939) (#19193) * [PR-18949](https://github.com/apache/kafka/pull/18949) - KAFKA-17431: Support invalid static configs for KRaft so long as dynamic configs are valid (#18949) * [PR-19202](https://github.com/apache/kafka/pull/19202) - KAFKA-18969 Rewrite ShareConsumerTest#setup and move to clients-integration-tests module (#19202) * [PR-19165](https://github.com/apache/kafka/pull/19165) - KAFKA-18955: Fix infinite loop and standardize options in MetadataSchemaCheckerTool (#19165) * 
[PR-18966](https://github.com/apache/kafka/pull/18966) - KAFKA-18808 add test to ensure the name= is not equal to default quota (#18966) * [PR-18463](https://github.com/apache/kafka/pull/18463) - KAFKA-17171 Add test cases for STATIC_BROKER_CONFIG in kraft mode (#18463) * [PR-18801](https://github.com/apache/kafka/pull/18801) - KAFKA-17565 Move MetadataCache interface to metadata module (#18801) * [PR-19181](https://github.com/apache/kafka/pull/19181) - KAFKA-18736: Do not send fields if not needed (#19181) * [PR-19215](https://github.com/apache/kafka/pull/19215) - KAFKA-18990 Avoid redundant MetricName creation in BaseQuotaTest#produceUntilThrottled (#19215) * [PR-19027](https://github.com/apache/kafka/pull/19027) - KAFKA-18859 honor the error message of UnregisterBrokerResponse (#19027) * [PR-19212](https://github.com/apache/kafka/pull/19212) - KAFKA-18993 Remove confusing notable change section from upgrade.html (#19212) * [PR-19129](https://github.com/apache/kafka/pull/19129) - KAFKA-18703 Remove unused class PayloadKeyType (#19129) * [PR-19187](https://github.com/apache/kafka/pull/19187) - KAFKA-18915 Rewrite AdminClientRebootstrapTest to cover the current scenario (#19187) * [PR-19147](https://github.com/apache/kafka/pull/19147) - KAFKA-18924 Running the storage module tests produces a storage/storage.log file (#19147) * [PR-19136](https://github.com/apache/kafka/pull/19136) - KAFKA-18781: Extend RefreshRetriableException related exceptions (#19136) * [PR-18994](https://github.com/apache/kafka/pull/18994) - KAFKA-18843: MirrorMaker2 unique workerId (#18994) * [PR-17264](https://github.com/apache/kafka/pull/17264) - KAFKA-17516 Synonyms for client metrics configs (#17264) * [PR-19134](https://github.com/apache/kafka/pull/19134) - KAFKA-18927 Remove LATEST_0_11, LATEST_1_0, LATEST_1_1, LATEST_2_0 (#19134) * [PR-19164](https://github.com/apache/kafka/pull/19164) - KAFKA-18943: Kafka Streams incorrectly commits TX during task revocation (#19164) * [PR-19205](https://github.com/apache/kafka/pull/19205) - KAFKA-18979; Report correct kraft.version in ApiVersions (#19205) * [PR-19176](https://github.com/apache/kafka/pull/19176) - KAFKA-18651: Add Streams-specific broker configurations (#19176) * [PR-19040](https://github.com/apache/kafka/pull/19040) - KAFKA-18858 Refactor FeatureControlManager to avoid using uninitialized MV (#19040) * [PR-19030](https://github.com/apache/kafka/pull/19030) - KAFKA-14484: Move UnifiedLog to storage module (#19030) * [PR-18662](https://github.com/apache/kafka/pull/18662) - KAFKA-18617 Allow use of ClusterInstance inside BeforeEach (#18662) * [PR-18018](https://github.com/apache/kafka/pull/18018) - KAFKA-18142 Switch to com.gradleup.shadow (#18018) * [PR-19169](https://github.com/apache/kafka/pull/19169) - KAFKA-18947 Remove unused raftManager in metadataShell (#19169) * [PR-18998](https://github.com/apache/kafka/pull/18998) - KAFKA-18837: Ensure controller quorum timeouts and backoffs are at least 0 (#18998) * [PR-19119](https://github.com/apache/kafka/pull/19119) - KAFKA-18422 Adjust Kafka client upgrade path section (#19119) * [PR-19168](https://github.com/apache/kafka/pull/19168) - KAFKA-18942: Add reviewers to PR body with committer-tools (#19168) * [PR-19148](https://github.com/apache/kafka/pull/19148) - KAFKA-18932: Removed usage of partition max bytes from share fetch requests (#19148) * [PR-19145](https://github.com/apache/kafka/pull/19145) - KAFKA-18936: Fix share fetch when records are larger than max bytes (#19145) * 
[PR-18091](https://github.com/apache/kafka/pull/18091) - KAFKA-18074: Add kafka client compatibility matrix (#18091) * [PR-18258](https://github.com/apache/kafka/pull/18258) - KAFKA-18195: Fix Kafka Streams broker compatibility matrix (#18258) * [PR-19171](https://github.com/apache/kafka/pull/19171) - KAFKA-17808: Fix id typo for connector-dlq-adminclient (#19171) * [PR-19144](https://github.com/apache/kafka/pull/19144) - KAFKA-18933 Add client integration tests module (#19144) * [PR-19155](https://github.com/apache/kafka/pull/19155) - KAFKA-18925: Add streams groups support to Admin.listGroups (#19155) * [PR-19142](https://github.com/apache/kafka/pull/19142) - KAFKA-18901: [1/N] Improved homogeneous SimpleAssignor (#19142) * [PR-19162](https://github.com/apache/kafka/pull/19162) - KAFKA-18941: Do not test 3.3 in upgrade_tests.py (#19162) * [PR-19121](https://github.com/apache/kafka/pull/19121) - KAFKA-18736: Decide when a heartbeat should be sent (#19121) * [PR-19173](https://github.com/apache/kafka/pull/19173) - KAFKA-18931: added a share group session timeout task when group coordinator is loaded (#19173) * [PR-19099](https://github.com/apache/kafka/pull/19099) - KAFKA-18637: Fix max connections per ip and override reconfigurations (#19099) * [PR-17767](https://github.com/apache/kafka/pull/17767) - KAFKA-17856 Move ConfigCommandTest and ConfigCommandIntegrationTest to tool module (#17767) * [PR-18802](https://github.com/apache/kafka/pull/18802) - KAFKA-18706 Move AclPublisher to metadata module (#18802) * [PR-19166](https://github.com/apache/kafka/pull/19166) - KAFKA-18944 Remove unused setters from ClusterConfig (#19166) * [PR-19081](https://github.com/apache/kafka/pull/19081) - KAFKA-18909 Move DynamicThreadPool to server module (#19081) * [PR-19062](https://github.com/apache/kafka/pull/19062) - KAFKA-18700 Migrate SnapshotPath, Entry, OffsetAndEpoch, LogFetchInfo, and LogAppendInfo to record classes (#19062) * [PR-19156](https://github.com/apache/kafka/pull/19156) - KAFKA-18940: fix electionWasClean (#19156) * [PR-19127](https://github.com/apache/kafka/pull/19127) - KAFKA-18920: The kcontrollers must set kraft.version in ApiVersionsResponse (#19127) * [PR-19116](https://github.com/apache/kafka/pull/19116) - KAFKA-18285: Add describeStreamsGroup to Admin API (#19116) * [PR-18684](https://github.com/apache/kafka/pull/18684) - KAFKA-18461 Add Objects.requireNotNull to Snapshot (#18684) * [PR-18299](https://github.com/apache/kafka/pull/18299) - KAFKA-17607: Add CI step to verify LICENSE-binary (#18299) * [PR-19137](https://github.com/apache/kafka/pull/19137) - KAFKA-18929: Log a warning when time based segment delete is blocked by a future timestamp (#19137) * [PR-15241](https://github.com/apache/kafka/pull/15241) - KAFKA-15931: Reopen TransactionIndex if channel is closed (#15241) * [PR-19138](https://github.com/apache/kafka/pull/19138) - KAFKA-18046; High CPU usage when using Log4j2 (#19138) * [PR-19094](https://github.com/apache/kafka/pull/19094) - KAFKA-18915: Migrate AdminClientRebootstrapTest to use new test infra (#19094) * [PR-19113](https://github.com/apache/kafka/pull/19113) - KAFKA-18900: Experimental share consumer acknowledge mode config (#19113) * [PR-19131](https://github.com/apache/kafka/pull/19131) - KAFKA-18648: Make records in FetchResponse nullable again (#19131) * [PR-19120](https://github.com/apache/kafka/pull/19120) - KAFKA-18887: Implement Streams Admin APIs (#19120) * [PR-19130](https://github.com/apache/kafka/pull/19130) - KAFKA-18811: Added command configs to 
admin client as well in VerifiableShareConsumer (#19130) * [PR-19112](https://github.com/apache/kafka/pull/19112) - KAFKA-18910 Remove kafka.utils.json (#19112) * [4a500418](https://github.com/apache/kafka/commit/4a500418c63a063198c5f6ce256bfef9ffd74e3a) - Revert “KAFKA-18246 Fix ConnectRestApiTest.test_rest_api by adding multiversioning configs (#18191)” * [d86cb597](https://github.com/apache/kafka/commit/d86cb597902d32ce83f27d65b60df6700cb7a61d) - Revert “KAFKA-18887: Implement Streams Admin APIs (#19049)” * [PR-19049](https://github.com/apache/kafka/pull/19049) - KAFKA-18887: Implement Streams Admin APIs (#19049) * [PR-19104](https://github.com/apache/kafka/pull/19104) - KAFKA-18919 Clarify that KafkaPrincipalBuilder classes must also implement KafkaPrincipalSerde (#19104) * [PR-19054](https://github.com/apache/kafka/pull/19054) - KAFKA-18882 Remove BaseKey, TxnKey, and UnknownKey (#19054) * [PR-19083](https://github.com/apache/kafka/pull/19083) - KAFKA-18817: ShareGroupHeartbeat and ShareGroupDescribe API must check topic describe (#19083) * [PR-18983](https://github.com/apache/kafka/pull/18983) - KAFKA-14121: AlterPartitionReassignments API should allow callers to specify the option of preserving the replication factor (#18983) * [PR-18918](https://github.com/apache/kafka/pull/18918) - KAFKA-18804 Remove slf4j warning when using tool script (#18918) * [PR-9766](https://github.com/apache/kafka/pull/9766) - KAFKA-10864 Convert end txn marker schema to use auto-generated protocol (#9766) * [PR-19087](https://github.com/apache/kafka/pull/19087) - KAFKA-18886 add behavior change of CreateTopicPolicy and AlterConfigPolicy to zk2kraft (#19087) * [PR-19097](https://github.com/apache/kafka/pull/19097) - KAFKA-18422 add link of KIP-1124 to “rolling upgrade” section (#19097) * [PR-19089](https://github.com/apache/kafka/pull/19089) - KAFKA-18917: TransformValues throws NPE (#19089) * [PR-19065](https://github.com/apache/kafka/pull/19065) - KAFKA-18876 4.0 documentation improvement (#19065) * [PR-19086](https://github.com/apache/kafka/pull/19086) - Fix typos in multiple files (#19086) * [PR-19091](https://github.com/apache/kafka/pull/19091) - KAFKA-18918: Correcting releasing of locks on exception (#19091) * [PR-19088](https://github.com/apache/kafka/pull/19088) - KAFKA-18916; Resolved regular expressions must update the group by topics data structure (#19088) * [PR-19075](https://github.com/apache/kafka/pull/19075) - KAFKA-18867 add tests to describe topic configs with empty name (#19075) * [PR-18449](https://github.com/apache/kafka/pull/18449) - KAFKA-18500 Build PRs at HEAD commit (#18449) * [PR-19059](https://github.com/apache/kafka/pull/19059) - KAFKA-18878: Added share session cache and delayed share fetch metrics (KIP-1103) (#19059) * [PR-18997](https://github.com/apache/kafka/pull/18997) - KAFKA-18844: Stale features information in QuorumController#registerBroker (#18997) * [PR-19036](https://github.com/apache/kafka/pull/19036) - KAFKA-18864:remove the Evolving tag from stable public interfaces (#19036) * [PR-19055](https://github.com/apache/kafka/pull/19055) - KAFKA-18817:[1/N] ShareGroupHeartbeat and ShareGroupDescribe API must check topic describe (#19055) * [PR-18981](https://github.com/apache/kafka/pull/18981) - KAFKA-18613: Auto-creation of internal topics in streams group heartbeat (#18981) * [PR-19056](https://github.com/apache/kafka/pull/19056) - KAFKA-18881 Document the ConsumerRecord as non-thread safe (#19056) * [PR-18752](https://github.com/apache/kafka/pull/18752) - 
KAFKA-18168: Adding checkpointing for GlobalKTable during restoration and closing (#18752) * [PR-19070](https://github.com/apache/kafka/pull/19070) - KAFKA-18907 Add suitable error message when the appended value is too large (#19070) * [PR-19067](https://github.com/apache/kafka/pull/19067) - KAFKA-18908 Document that the size of appended value can’t be larger than Short.MAX_VALUE (#19067) * [PR-19047](https://github.com/apache/kafka/pull/19047) - KAFKA-18880 Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException (#19047) * [PR-19063](https://github.com/apache/kafka/pull/19063) - KAFKA-17039 KIP-919 supports for unregisterBroker (#19063) * [PR-17771](https://github.com/apache/kafka/pull/17771) - KAFKA-17981 add Integration test for ConfigCommand to add config key=[val1,val2] (#17771) * [PR-19045](https://github.com/apache/kafka/pull/19045) - KAFKA-18734: Implemented share partition metrics (KIP-1103) (#19045) * [PR-19048](https://github.com/apache/kafka/pull/19048) - KAFKA-18860 Remove Missing Features section (#19048) * [PR-18349](https://github.com/apache/kafka/pull/18349) - KAFKA-18371 TopicBasedRemoteLogMetadataManagerConfig exposes sensitive configuration data in logs (#18349) * [PR-19020](https://github.com/apache/kafka/pull/19020) - KAFKA-18780: Extend RetriableException related exceptions (#19020) * [PR-19037](https://github.com/apache/kafka/pull/19037) - KAFKA-18869 add remote storage threads to “Updating Thread Configs” section (#19037) * [PR-17743](https://github.com/apache/kafka/pull/17743) - KAFKA-18863: Connect Multiversion Support (Versioned Connector Creation and related changes) (#17743) * [PR-19042](https://github.com/apache/kafka/pull/19042) - KAFKA-18813: ConsumerGroupHeartbeat API and ConsumerGroupDescribe API… (#19042) * [PR-18989](https://github.com/apache/kafka/pull/18989) - KAFKA-18813: ConsumerGroupHeartbeat API and ConsumerGroupDescribe API must check topic describe (#18989) * [PR-18979](https://github.com/apache/kafka/pull/18979) - KAFKA-18614, KAFKA-18613: Add streams group request plumbing (#18979) * [PR-18864](https://github.com/apache/kafka/pull/18864) - KAFKA-18757: Create full-function SimpleAssignor to match KIP-932 description (#18864) * [PR-18988](https://github.com/apache/kafka/pull/18988) - KAFKA-18839: Drop EAGER rebalancing support in Kafka Streams (#18988) * [PR-18985](https://github.com/apache/kafka/pull/18985) - KAFKA-18792 Add workflow to check PR format (#18985) * [PR-19010](https://github.com/apache/kafka/pull/19010) - KAFKA-17351: Improved handling of compacted topics in share partition (2/N) (#19010) * [PR-19021](https://github.com/apache/kafka/pull/19021) - KAFKA-17836 Move RackAwareTest to server module (#19021) * [PR-18803](https://github.com/apache/kafka/pull/18803) - KAFKA-18712 Move Endpoint to server module (#18803) * [PR-18387](https://github.com/apache/kafka/pull/18387) - KAFKA-18281: Kafka is improperly validating non-advertised listeners for routable controller addresses (#18387) * [PR-18900](https://github.com/apache/kafka/pull/18900) - KAFKA-17937 Cleanup AbstractFetcherThreadTest (#18900) * [PR-18898](https://github.com/apache/kafka/pull/18898) - KIP-966 part 1 release doc (#18898) * [PR-18770](https://github.com/apache/kafka/pull/18770) - KAFKA-18748 Run new tests separately in PRs (#18770) * [PR-18804](https://github.com/apache/kafka/pull/18804) - KAFKA-18522: Slice records for share fetch (#18804) * [PR-18233](https://github.com/apache/kafka/pull/18233) - KAFKA-18023: Enforcing Explicit Naming for Kafka Streams 
Internal Topics (#18233) * [PR-18939](https://github.com/apache/kafka/pull/18939) - KAFKA-18779: Validate responses from broker in client for ShareFetch and ShareAcknowledge RPCs. (#18939) * [PR-18992](https://github.com/apache/kafka/pull/18992) - KAFKA-18827: Initialize share group state persister impl [2/N]. (#18992) * [PR-18880](https://github.com/apache/kafka/pull/18880) - KAFKA-15583 doc update for the “strict min ISR” rule (#18880) * [PR-18928](https://github.com/apache/kafka/pull/18928) - KAFKA-18629: ShareGroupDeleteState admin client impl. (#18928) * [PR-18978](https://github.com/apache/kafka/pull/18978) - KAFKA-17351: Update tests and acquire API to allow discard batches from compacted topics (1/N) (#18978) * [PR-18968](https://github.com/apache/kafka/pull/18968) - KAFKA-18827: Initialize share state, share coordinator impl. [1/N] (#18968) * [PR-19000](https://github.com/apache/kafka/pull/19000) - Revert “KAFKA-16803: Change fork, update ShadowJavaPlugin to 8.1.7 (#16295)” (#19000) * [PR-18897](https://github.com/apache/kafka/pull/18897) - KAFKA-18795 Remove Records#downConvert (#18897) * [PR-18996](https://github.com/apache/kafka/pull/18996) - KAFKA-18813: [3/N] Client support for TopicAuthException in DescribeConsumerGroup path (#18996) * [PR-18959](https://github.com/apache/kafka/pull/18959) - KAFKA-18733: Implemented fetch ratio and partition acquire time metrics (3/N) (#18959) * [PR-18986](https://github.com/apache/kafka/pull/18986) - KAFKA-18813: [2/N] Client support for TopicAuthException in HB path (#18986) * [PR-18844](https://github.com/apache/kafka/pull/18844) - KAFKA-18737 KafkaDockerWrapper setup functions fails due to storage format command (#18844) * [PR-18848](https://github.com/apache/kafka/pull/18848) - KAFKA-18629: Delete share group state RPC group coordinator impl. [3/N] (#18848) * [PR-18982](https://github.com/apache/kafka/pull/18982) - KAFKA-18829: Added check before converting to IMPLICIT mode (#18964) (Cherry-pick) (#18982) * [PR-18969](https://github.com/apache/kafka/pull/18969) - KAFKA-18831 Migrating to log4j2 introduce behavior changes of adjusting level dynamically (#18969) * [PR-18737](https://github.com/apache/kafka/pull/18737) - KAFKA-18641: AsyncKafkaConsumer could lose records with auto offset commit (#18737) * [PR-18962](https://github.com/apache/kafka/pull/18962) - KAFKA-18828: Update share group metrics per new init and call mechanism. (#18962) * [PR-18891](https://github.com/apache/kafka/pull/18891) - KAFKA-16918 TestUtils#assertFutureThrows should use future.get with timeout (#18891) * [PR-18965](https://github.com/apache/kafka/pull/18965) - MINOR: Remove redundant quorum parameter from `*AdminIntegrationTest` classes (#18965) * [PR-18967](https://github.com/apache/kafka/pull/18967) - KAFKA-18791 Set default commit to PR title and description [2/n] (#18967) * [PR-18964](https://github.com/apache/kafka/pull/18964) - KAFKA-18829: Added check before converting to IMPLICIT mode (#18964) * [PR-18955](https://github.com/apache/kafka/pull/18955) - KAFKA-18791 Enable new asf.yaml parser [1/n] (#18955) * [PR-18845](https://github.com/apache/kafka/pull/18845) - KAFKA-18601: Assume a baseline of 3.3 for server protocol versions (#18845) * [PR-18944](https://github.com/apache/kafka/pull/18944) - KAFKA-18198: Added check to prevent acknowledgements on initial ShareFetchRequest. 
(#18944) * [PR-18946](https://github.com/apache/kafka/pull/18946) - KAFKA-18799 Remove AdminUtils (#18946) * [PR-18757](https://github.com/apache/kafka/pull/18757) - KAFKA-18667 Add replication system test case for combined broker + controller failure (#18757) * [PR-18872](https://github.com/apache/kafka/pull/18872) - KAFKA-18773 Migrate the log4j1 config to log4j 2 for native image and README (#18872) * [PR-18004](https://github.com/apache/kafka/pull/18004) - KAFKA-18089: Upgrade Caffeine lib to 3.1.8 (#18004) * [PR-18850](https://github.com/apache/kafka/pull/18850) - KAFKA-18767: Add client side config check for shareConsumer (#18850) * [PR-18460](https://github.com/apache/kafka/pull/18460) - KAFKA-14484: Decouple UnifiedLog and RemoteLogManager (#18460) * [PR-18927](https://github.com/apache/kafka/pull/18927) - KAFKA-16718 [1/n]: Added DeleteShareGroupOffsets request and response schema (#18927) * [PR-18870](https://github.com/apache/kafka/pull/18870) - KAFKA-18736: Add Streams group heartbeat request manager (1/N) (#18870) * [PR-18914](https://github.com/apache/kafka/pull/18914) - KAFKA-18798 The replica placement policy used by ReassignPartitionsCommand is not aligned with kraft controller (#18914) * [PR-18888](https://github.com/apache/kafka/pull/18888) - KAFKA-18787: RemoteIndexCache fails to delete invalid files on init (#18888) * [PR-18934](https://github.com/apache/kafka/pull/18934) - KAFKA-18807; Fix thread idle ratio metric (#18934) * [PR-18871](https://github.com/apache/kafka/pull/18871) - KAFKA-18684: Add base exception classes (#18871) * [PR-18924](https://github.com/apache/kafka/pull/18924) - KAFKA-18733: Updating share group record acks metric (2/N) (#18924) * [PR-18907](https://github.com/apache/kafka/pull/18907) - KAFKA-18801 Remove ClusterGenerator and revise ClusterTemplate javadoc (#18907) * [PR-18809](https://github.com/apache/kafka/pull/18809) - KAFKA-18730: Add replaying streams group state from offset topic (#18809) * [PR-18889](https://github.com/apache/kafka/pull/18889) - KAFKA-18784 Fix ConsumerWithLegacyMessageFormatIntegrationTest (#18889) * [PR-18920](https://github.com/apache/kafka/pull/18920) - KAFKA-18805: add synchronized block for Consumer Heartbeat close (#18920) * [PR-18908](https://github.com/apache/kafka/pull/18908) - KAFKA-18755 Align timeout in kafka-share-groups.sh (#18908) * [PR-18922](https://github.com/apache/kafka/pull/18922) - KAFKA-18809 Set min in sync replicas for \_\_share_group_state. 
(#18922) * [PR-18916](https://github.com/apache/kafka/pull/18916) - KAFKA-18803 The acls would appear at the wrong level of the metadata shell “tree” (#18916) * [PR-18906](https://github.com/apache/kafka/pull/18906) - KAFKA-18790 Fix testCustomQuotaCallback (#18906) * [PR-18894](https://github.com/apache/kafka/pull/18894) - KAFKA-18761: Complete listing of share group offsets [1/N] (#18894) * [PR-18819](https://github.com/apache/kafka/pull/18819) - KAFKA-16717 [1/2]: Add AdminClient.alterShareGroupOffsets (#18819) * [PR-18899](https://github.com/apache/kafka/pull/18899) - KAFKA-18772 Define share group config defaults for Docker (#18899) * [PR-18826](https://github.com/apache/kafka/pull/18826) - KAFKA-18733: Updating share group metrics (1/N) (#18826) * [PR-18680](https://github.com/apache/kafka/pull/18680) - KAFKA-18634: Fix ELR metadata version issues (#18680) * [PR-18795](https://github.com/apache/kafka/pull/18795) - KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#18795) * [PR-18834](https://github.com/apache/kafka/pull/18834) - KAFKA-16720: Support multiple groups in DescribeShareGroupOffsets RPC (#18834) * [PR-18810](https://github.com/apache/kafka/pull/18810) - KAFKA-18654[2/2]: Transction V2 retry add partitions on the server side when handling produce request. (#18810) * [PR-18756](https://github.com/apache/kafka/pull/18756) - KAFKA-17298: Update upgrade notes for 4.0 KIP-848 (#18756) * [PR-18807](https://github.com/apache/kafka/pull/18807) - KAFKA-18728 Move ListOffsetsPartitionStatus to server module (#18807) * [PR-18851](https://github.com/apache/kafka/pull/18851) - KAFKA-18769: Improve leadership changes handling in ShareConsumeRequestManager. (#18851) * [PR-18869](https://github.com/apache/kafka/pull/18869) - KAFKA-18777 add PartitionsWithLateTransactionsCount to BrokerMetricNamesTest (#18869) * [PR-18729](https://github.com/apache/kafka/pull/18729) - KAFKA-18323: Add StreamsGroup class (#18729) * [PR-18275](https://github.com/apache/kafka/pull/18275) - KAFKA-15443: Upgrade RocksDB to 9.7.3 (#18275) * [PR-18451](https://github.com/apache/kafka/pull/18451) - KAFKA-18035: TransactionsTest testBumpTransactionalEpochWithTV2Disabled failed on trunk (#18451) * [PR-17804](https://github.com/apache/kafka/pull/17804) - KAFKA-15995: Adding KIP-877 support to Connect (#17804) * [PR-18829](https://github.com/apache/kafka/pull/18829) - KAFKA-18756: Enabled share group configs for queues related system tests (#18829) * [PR-18858](https://github.com/apache/kafka/pull/18858) - Fix bug in json naming (#18858) * [PR-18833](https://github.com/apache/kafka/pull/18833) - KAFKA-18758: NullPointerException in shutdown following InvalidConfigurationException (#18833) * [PR-18855](https://github.com/apache/kafka/pull/18855) - KAFKA-18764: Throttle on share state RPCs auth failure. 
(#18855) * [PR-18039](https://github.com/apache/kafka/pull/18039) - KAFKA-14484: Move UnifiedLog static methods to storage (#18039) * [PR-18394](https://github.com/apache/kafka/pull/18394) - KAFKA-18396: Migrate log4j1 configuration to log4j2 in KafkaDockerWrapper (#18394) * [PR-18853](https://github.com/apache/kafka/pull/18853) - KAFKA-18770 close the RM created by testDelayedShareFetchPurgatoryOperationExpiration (#18853) * [PR-18820](https://github.com/apache/kafka/pull/18820) - KAFKA-18366 Remove KafkaConfig.interBrokerProtocolVersion (#18820) * [PR-18812](https://github.com/apache/kafka/pull/18812) - KAFKA-18658 add import control for examples module (#18812) * [PR-18821](https://github.com/apache/kafka/pull/18821) - KAFKA-18743 Remove leader.imbalance.per.broker.percentage as it is not supported by Kraft (#18821) * [PR-1578](https://github.com/confluentinc/kafka/pull/1578) - CCS CP release test regex updates * [PR-18196](https://github.com/apache/kafka/pull/18196) - KAFKA-18225 ClientQuotaCallback#updateClusterMetadata is unsupported by kraft (#18196) * [PR-1582](https://github.com/confluentinc/kafka/pull/1582) - Fix build failure * [PR-18846](https://github.com/apache/kafka/pull/18846) - KAFKA-18763: changed the assertion statement for acknowledgements to include only successful acks (#18846) * [PR-18824](https://github.com/apache/kafka/pull/18824) - KAFKA-18745: Handle network related errors in persister. (#18824) * [PR-18252](https://github.com/apache/kafka/pull/18252) - KAFKA-17833: Convert DescribeAuthorizedOperationsTest to use KRaft (#18252) * [PR-1577](https://github.com/confluentinc/kafka/pull/1577) - CCS CP release test regex updates * [PR-18381](https://github.com/apache/kafka/pull/18381) - KAFKA-18275 Restarting broker in testing should use the same port (#18381) * [PR-18818](https://github.com/apache/kafka/pull/18818) - KAFKA-18741 document the removal of inter.broker.protocol.version (#18818) * [PR-18496](https://github.com/apache/kafka/pull/18496) - KAFKA-18483 Disable Log4jController and Loggers if Log4j Core absent (#18496) * [PR-18672](https://github.com/apache/kafka/pull/18672) - KAFKA-18618: Improve leader change handling of acknowledgements [1/N] (#18672) * [PR-18566](https://github.com/apache/kafka/pull/18566) - KAFKA-18360 Remove zookeeper configurations (#18566) * [PR-18641](https://github.com/apache/kafka/pull/18641) - KAFKA-18530 Remove ZooKeeperInternals (#18641) * [PR-18583](https://github.com/apache/kafka/pull/18583) - KAFKA-18499 Clean up zookeeper from LogConfig (#18583) * [PR-18771](https://github.com/apache/kafka/pull/18771) - KAFKA-18689: Improve metric calculation to avoid NoSuchElementException (#18771) * [PR-18189](https://github.com/apache/kafka/pull/18189) - KAFKA-18206: EmbeddedKafkaCluster must set features (#18189) * [PR-18765](https://github.com/apache/kafka/pull/18765) - KAFKA-17379: Fix inexpected state transition from ERROR to PENDING_SHUTDOWN (#18765) * [PR-18696](https://github.com/apache/kafka/pull/18696) - KAFKA-18494-3: solution for the bug relating to gaps in the share partition cachedStates post initialization (#18696) * [PR-18748](https://github.com/apache/kafka/pull/18748) - KAFKA-18629: Add persister impl and tests for DeleteShareGroupState RPC. 
[2/N] (#18748) * [PR-18671](https://github.com/apache/kafka/pull/18671) - [KAFKA-16720] AdminClient Support for ListShareGroupOffsets (2/2) (#18671) * [PR-18702](https://github.com/apache/kafka/pull/18702) - KAFKA-18645: New consumer should align close timeout handling with classic consumer (#18702) * [PR-18791](https://github.com/apache/kafka/pull/18791) - KAFKA-18722: Remove the unreferenced methods in TBRLMM and ConsumerManager (#18791) * [PR-18782](https://github.com/apache/kafka/pull/18782) - KAFKA-18694: Migrate suitable classes to records in coordinator-common module (#18782) * [PR-18784](https://github.com/apache/kafka/pull/18784) - KAFKA-18705: Move ConfigRepository to metadata module (#18784) * [PR-18783](https://github.com/apache/kafka/pull/18783) - KAFKA-18698: Migrate suitable classes to records in server and server-common modules (#18783) * [PR-18781](https://github.com/apache/kafka/pull/18781) - KAFKA-18675 Add tests for valid and invalid broker addresses (#18781) * [PR-18304](https://github.com/apache/kafka/pull/18304) - KAFKA-16524; Metrics for KIP-853 (#18304) * [PR-18277](https://github.com/apache/kafka/pull/18277) - KAFKA-18635: reenable the unclean shutdown detection (#18277) * [PR-18708](https://github.com/apache/kafka/pull/18708) - KAFKA-18649: complete ClearElrRecord handling (#18708) * [PR-18148](https://github.com/apache/kafka/pull/18148) - KAFKA-16540: Clear ELRs when min.insync.replicas is changed. (#18148) * [PR-17952](https://github.com/apache/kafka/pull/17952) - KAFKA-16540: enforce min.insync.replicas config invariants for ELR (#17952) * [PR-15622](https://github.com/apache/kafka/pull/15622) - KAFKA-16446: Improve controller event duration logging (#15622) * [PR-18028](https://github.com/apache/kafka/pull/18028) - KAFKA-18131: Improve logs for voters (#18028) * [PR-18222](https://github.com/apache/kafka/pull/18222) - KAFKA-18305: validate controller.listener.names is not in inter.broker.listener.name for kcontrollers (#18222) * [PR-18777](https://github.com/apache/kafka/pull/18777) - KAFKA-18690: Keep leader metadata for RE2J-assigned partitions (#18777) * [PR-18551](https://github.com/apache/kafka/pull/18551) - KAFKA-18538: Add Streams membership manager (#18551) * [PR-18165](https://github.com/apache/kafka/pull/18165) - KAFKA-18230: Handle not controller or not leader error in admin client (#18165) * [PR-18700](https://github.com/apache/kafka/pull/18700) - KAFKA-18644: improve generic type names for internal FK-join classes (#18700) * [PR-18790](https://github.com/apache/kafka/pull/18790) - KAFKA-18693 Remove PasswordEncoder (#18790) * [PR-18720](https://github.com/apache/kafka/pull/18720) - KAFKA-18654 [1/2]: Transaction Version 2 performance regression due to early return (#18720) * [PR-18592](https://github.com/apache/kafka/pull/18592) - KAFKA-18545: Remove Zookeeper logic from LogManager (#18592) * [PR-18676](https://github.com/apache/kafka/pull/18676) - KAFKA-18325: Add TargetAssignmentBuilder (#18676) * [PR-18786](https://github.com/apache/kafka/pull/18786) - KAFKA-18672; CoordinatorRecordSerde must validate value version (4.0) (#18786) * [PR-18717](https://github.com/apache/kafka/pull/18717) - KAFKA-18655: Implement the consumer group size counter with scheduled task (#18717) * [PR-18764](https://github.com/apache/kafka/pull/18764) - KAFKA-18685: Cleanup DynamicLogConfig constructor (#18764) * [PR-18785](https://github.com/apache/kafka/pull/18785) - KAFKA-18676; Update Benchmark system tests (#18785) * 
[PR-18330](https://github.com/apache/kafka/pull/18330) - KAFKA-17631 Convert SaslApiVersionsRequestTest to kraft (#18330) * [PR-18749](https://github.com/apache/kafka/pull/18749) - KAFKA-18672; CoordinatorRecordSerde must validate value version (#18749) * [PR-18768](https://github.com/apache/kafka/pull/18768) - KAFKA-18678 Update TestVerifiableProducer system test (#18768) * [PR-18652](https://github.com/apache/kafka/pull/18652) - KAFKA-17125: Streams Sticky Task Assignor (#18652) * [PR-18751](https://github.com/apache/kafka/pull/18751) - KAFKA-18674 Document the incompatible changes in parsing –bootstrap-server (#18751) * [PR-18727](https://github.com/apache/kafka/pull/18727) - KAFKA-18659: librdkafka compressed produce fails unless api versions returns produce v0 (#18727) * [PR-18759](https://github.com/apache/kafka/pull/18759) - KAFKA-18683: Handle slicing of file records for updated start position (#18759) * [fc3dca4e](https://github.com/apache/kafka/commit/fc3dca4ed08a6acdcb5b1d5a4ed5b8a7095d318b) - Revert “KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#17700)” * [7920fadb](https://github.com/apache/kafka/commit/7920fadbb586a9430ce1a45936d6bbd1555baa2d) - Revert “KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#17700)” * [PR-18758](https://github.com/apache/kafka/pull/18758) - KAFKA-18660: Transactions Version 2 doesn’t handle epoch overflow correctly (#18730) (#18758) * [PR-18750](https://github.com/apache/kafka/pull/18750) - KAFKA-18320; Ensure that assignors are at the right place (#18750) * [PR-1541](https://github.com/confluentinc/kafka/pull/1541) - Merge trunk * [PR-17700](https://github.com/apache/kafka/pull/17700) - KAFKA-17182: Consumer fetch sessions are evicted too quickly with AsyncKafkaConsumer (#17700) * [PR-18766](https://github.com/apache/kafka/pull/18766) - KAFKA-18146; tests/kafkatest/tests/core/upgrade_test.py needs to be re-added as KRaft (#18766) * [PR-18763](https://github.com/apache/kafka/pull/18763) - KAFKA-18677; Update ConsoleConsumerTest system test (#18763) * [PR-17511](https://github.com/apache/kafka/pull/17511) - KAFKA-15995: Initial API + make Producer/Consumer plugins Monitorable (#17511) * [PR-18722](https://github.com/apache/kafka/pull/18722) - KAFKA-18644: improve generic type names for KStreamImpl and KTableImpl (#18722) * [PR-18754](https://github.com/apache/kafka/pull/18754) - KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18754) * [PR-18730](https://github.com/apache/kafka/pull/18730) - KAFKA-18660: Transactions Version 2 doesn’t handle epoch overflow correctly (#18730) * [PR-1556](https://github.com/confluentinc/kafka/pull/1556) - MINOR: Disable publish artifacts for 4.0 * [PR-18548](https://github.com/apache/kafka/pull/18548) - KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18548) * [PR-18731](https://github.com/apache/kafka/pull/18731) - KAFKA-18570: Update documentation to add remainingLogsToRecover, remainingSegmentsToRecover and LogDirectoryOffline metrics (#18731) * [PR-18669](https://github.com/apache/kafka/pull/18669) - KAFKA-18621: Add StreamsCoordinatorRecordHelpers (#18669) * [PR-18681](https://github.com/apache/kafka/pull/18681) - KAFKA-18636 Fix how we handle Gradle exits in CI (#18681) * [PR-18590](https://github.com/apache/kafka/pull/18590) - KAFKA-18569: New consumer close may wait on unneeded FindCoordinator (#18590) * 
[PR-18698](https://github.com/apache/kafka/pull/18698) - KAFKA-13722: remove internal usage of old ProcessorContext (#18698) * [PR-18314](https://github.com/apache/kafka/pull/18314) - KAFKA-16339: Add Kafka Streams migrating guide from transform to process (#18314) * [PR-18732](https://github.com/apache/kafka/pull/18732) - KAFKA-18498: Update lock ownership from main thread (#18732) * [PR-18478](https://github.com/apache/kafka/pull/18478) - KAFKA-18383 Remove reserved.broker.max.id and broker.id.generation.enable (#18478) * [PR-18733](https://github.com/apache/kafka/pull/18733) - KAFKA-18662: Return CONCURRENT_TRANSACTIONS on produce request in TV2 (#18733) * [PR-18718](https://github.com/apache/kafka/pull/18718) - KAFKA-18632: Multibroker test improvements. (#18718) * [PR-18725](https://github.com/apache/kafka/pull/18725) - KAFKA-18653: Fix mocks and potential thread leak issues causing silent RejectedExecutionException in share group broker tests (#18725) * [PR-18726](https://github.com/apache/kafka/pull/18726) - KAFKA-18646: Null records in fetch response breaks librdkafka (#18726) * [PR-18668](https://github.com/apache/kafka/pull/18668) - KAFKA-18619: New consumer topic metadata events should set requireMetadata flag (#18668) * [PR-18728](https://github.com/apache/kafka/pull/18728) - KAFKA-18488: Improve KafkaShareConsumerTest (#18728) * [PR-18716](https://github.com/apache/kafka/pull/18716) - KAFKA-18648: Add back support for metadata version 0-3 (#18716) * [PR-18555](https://github.com/apache/kafka/pull/18555) - KAFKA-18528: MultipleListenersWithSameSecurityProtocolBaseTest and GssapiAuthenticationTest should run for async consumer (#18555) * [PR-18651](https://github.com/apache/kafka/pull/18651) - KAFKA-17951: Share parition rotate strategy (#18651) * [PR-18712](https://github.com/apache/kafka/pull/18712) - KAFKA-18629: Delete share group state impl [1/N] (#18712) * [PR-18570](https://github.com/apache/kafka/pull/18570) - KAFKA-17162: join() started thread in DefaultTaskManagerTest (#18570) * [PR-18602](https://github.com/apache/kafka/pull/18602) - KAFKA-17587 Refactor test infrastructure (#18602) * [PR-18693](https://github.com/apache/kafka/pull/18693) - KAFKA-18631 Remove ZkConfigs (#18693) * [PR-18699](https://github.com/apache/kafka/pull/18699) - KAFKA-18642: Increased the timeouts in share_consumer_test.py system tests (#18699) * [PR-18632](https://github.com/apache/kafka/pull/18632) - KAFKA-18555 Avoid casting MetadataCache to KRaftMetadataCache (#18632) * [PR-18547](https://github.com/apache/kafka/pull/18547) - KAFKA-18533 Remove KafkaConfig zookeeper related logic (#18547) * [PR-18554](https://github.com/apache/kafka/pull/18554) - KAFKA-18529: ConsumerRebootstrapTest should run for async consumer (#18554) * [PR-18292](https://github.com/apache/kafka/pull/18292) - KAFKA-13722: remove usage of old ProcessorContext (#18292) * [PR-18444](https://github.com/apache/kafka/pull/18444) - KAFKA-17894: Implemented broker topic metrics for Share Group 1/N (KIP-1103) (#18444) * [PR-18687](https://github.com/apache/kafka/pull/18687) - KAFKA-18630: Clean ReplicaManagerBuilder (#18687) * [PR-18477](https://github.com/apache/kafka/pull/18477) - KAFKA-18474: Remove zkBroker listener (#18477) * [PR-18688](https://github.com/apache/kafka/pull/18688) - KAFKA-18616; Refactor DumpLogSegments’s MessageParsers (#18688) * [PR-15574](https://github.com/apache/kafka/pull/15574) - KAFKA-16372 Fix producer doc discrepancy with the exception behavior (#15574) * 
[PR-18618](https://github.com/apache/kafka/pull/18618) - KAFKA-18590 Cleanup DelegationTokenManager (#18618) * [PR-18593](https://github.com/apache/kafka/pull/18593) - KAFKA-18559 Cleanup FinalizedFeatures (#18593) * [PR-18627](https://github.com/apache/kafka/pull/18627) - KAFKA-18597 Fix max-buffer-utilization-percent is always 0 (#18627) * [PR-18686](https://github.com/apache/kafka/pull/18686) - KAFKA-18620: Remove UnifiedLog#legacyFetchOffsetsBefore (#18686) * [PR-18621](https://github.com/apache/kafka/pull/18621) - KAFKA-18592 Cleanup ReplicaManager (#18621) * [PR-18476](https://github.com/apache/kafka/pull/18476) - KAFKA-18324: Add CurrentAssignmentBuilder (#18476) * [PR-12042](https://github.com/apache/kafka/pull/12042) - KAFKA-13810: Document behavior of KafkaProducer.flush() w.r.t callbacks (#12042) * [PR-18667](https://github.com/apache/kafka/pull/18667) - KAFKA-18484 [2/2]; Handle exceptions during coordinator unload (#18667) * [PR-18601](https://github.com/apache/kafka/pull/18601) - KAFKA-18488: Additional protocol tests for share consumption (#18601) * [PR-18666](https://github.com/apache/kafka/pull/18666) - KAFKA-18486; [1/2] Update LocalLeaderEndPointTest (#18666) * [d2024436](https://github.com/apache/kafka/commit/d2024436218343a127385e0149a692caf432b772) - KAFKA-18575: Transaction Version 2 doesn’t correctly handle race condition with completing and new transaction(#18604) * [PR-18532](https://github.com/apache/kafka/pull/18532) - KAFKA-18517: Enable ConsumerBounceTest to run for new async consumer (#18532) * [PR-18614](https://github.com/apache/kafka/pull/18614) - KAFKA-18519: Remove Json.scala, cleanup AclEntry.scala (#18614) * [PR-18630](https://github.com/apache/kafka/pull/18630) - KAFKA-18599: Remove Optional wrapping for forwardingManager in ApiVersionManager (#18630) * [PR-18389](https://github.com/apache/kafka/pull/18389) - KAFKA-18229: Move configs out of “kraft” directory (#18389) * [PR-18661](https://github.com/apache/kafka/pull/18661) - KAFKA-18484 [1/N]; Handle exceptions from deferred events in coordinator (#18661) * [PR-18649](https://github.com/apache/kafka/pull/18649) - KAFKA-18392: Ensure client sets member ID for share group (#18649) * [PR-18527](https://github.com/apache/kafka/pull/18527) - KAFKA-18518: Add processor to handle rebalance events (#18527) * [PR-18607](https://github.com/apache/kafka/pull/18607) - KAFKA-17402: DefaultStateUpdated should transite task atomically (#18607) * [PR-18539](https://github.com/apache/kafka/pull/18539) - KAFKA-18454 Publish build scans to develocity.apache.org (#18539) * [PR-18512](https://github.com/apache/kafka/pull/18512) - KAFKA-18302; Update CoordinatorRecord (#18512) * [PR-18316](https://github.com/apache/kafka/pull/18316) - KAFKA-15370: Support Participation in 2PC (KIP-939) (2/N) (#18316) * [PR-18587](https://github.com/apache/kafka/pull/18587) - KAFKA-8862: Improve Producer error message for failed metadata update (#18587) * [PR-18581](https://github.com/apache/kafka/pull/18581) - KAFKA-17561: add processId tag to thread-state metric (#18581) * [PR-18629](https://github.com/apache/kafka/pull/18629) - KAFKA-18598: Remove ControllerMetadataMetrics ZK-related Metrics (#18629) * [PR-18611](https://github.com/apache/kafka/pull/18611) - KAFKA-18585 Fix fail test ValuesTest#shouldConvertDateValues (#18611) * [PR-18647](https://github.com/apache/kafka/pull/18647) - KAFKA-18487; Remove ReplicaManager#stopReplicas (#18647) * [PR-18635](https://github.com/apache/kafka/pull/18635) - KAFKA-18583; Fix 
getPartitionReplicaEndpoints for KRaft (#18635) * [PR-18442](https://github.com/apache/kafka/pull/18442) - KAFKA-18311: Internal Topic Manager (5/5) (#18442) * [PR-18636](https://github.com/apache/kafka/pull/18636) - KAFKA-18604; Update transaction coordinator (#18636) * [PR-18497](https://github.com/apache/kafka/pull/18497) - KAFKA-14552: Assume a baseline of 3.0 for server protocol versions (#18497) * [PR-18346](https://github.com/apache/kafka/pull/18346) - KAFKA-18363: Remove ZooKeeper mentiosn in broker configs (#18346) * [PR-18631](https://github.com/apache/kafka/pull/18631) - KAFKA-18595: Remove AuthorizerUtils#sessionToRequestContext (#18631) * [PR-18626](https://github.com/apache/kafka/pull/18626) - KAFKA-18594: Cleanup BrokerLifecycleManager (#18626) * [PR-18174](https://github.com/apache/kafka/pull/18174) - KAFKA-18232: Add share group state topic prune metrics. (#18174) * [PR-18567](https://github.com/apache/kafka/pull/18567) - KAFKA-18553: Update javadoc and comments of ConfigType (#18567) * [PR-18571](https://github.com/apache/kafka/pull/18571) - [KAFKA-16720] AdminClient Support for ListShareGroupOffsets (1/n) (#18571) * [PR-18624](https://github.com/apache/kafka/pull/18624) - KAFKA-18588 Remove TopicKey.scala (#18624) * [PR-18628](https://github.com/apache/kafka/pull/18628) - KAFKA-18578: Remove UpdateMetadataRequest from MetadataCacheTest (#18628) * [PR-18625](https://github.com/apache/kafka/pull/18625) - KAFKA-18593 Remove ZkCachedControllerId In MetadataCache (#18625) * [PR-17390](https://github.com/apache/kafka/pull/17390) - KAFKA-17668: Clean-up LogCleaner#maxOverCleanerThreads and LogCleanerManager#maintainUncleanablePartitions (#17390) * [PR-18616](https://github.com/apache/kafka/pull/18616) - KAFKA-18429 Remove ZkFinalizedFeatureCache and StateChangeFailedException (#18616) * [PR-18619](https://github.com/apache/kafka/pull/18619) - KAFKA-18589 Remove unused interBrokerProtocolVersion from GroupMetadataManager (#18619) * [PR-18598](https://github.com/apache/kafka/pull/18598) - KAFKA-18516 Remove RackAwareMode (#18598) * [PR-18608](https://github.com/apache/kafka/pull/18608) - KAFKA-18492 Cleanup RequestHandlerHelper (#18608) * [PR-18613](https://github.com/apache/kafka/pull/18613) - KAFKA-18427: Remove ZooKeeperClient (#18613) * [PR-18591](https://github.com/apache/kafka/pull/18591) - KAFKA-18540: Remove UpdataMetadataRequest from KafkaApisTest (#18591) * [PR-18594](https://github.com/apache/kafka/pull/18594) - KAFKA-18532: Clean Partition.scala zookeeper logic (#18594) * [PR-18605](https://github.com/apache/kafka/pull/18605) - KAFKA-18423: Remove ZkData and related unused references (#18605) * [PR-18586](https://github.com/apache/kafka/pull/18586) - KAFKA-18565 Cleanup SaslSetup (#18586) * [PR-18606](https://github.com/apache/kafka/pull/18606) - KAFKA-18430 Remove ZkNodeChangeNotificationListener (#18606) * [PR-18492](https://github.com/apache/kafka/pull/18492) - KAFKA-18480 Fix fail e2e test_offset_truncate (#18492) * [PR-18012](https://github.com/apache/kafka/pull/18012) - KAFKA-806: Index may not always observe log.index.interval.bytes (#18012) * [PR-18595](https://github.com/apache/kafka/pull/18595) - KAFKA-18515 Remove DelegationTokenManagerZk (#18595) * [PR-18579](https://github.com/apache/kafka/pull/18579) - Remove casts to KRaftMetadataCache (#18579) * [PR-18577](https://github.com/apache/kafka/pull/18577) - Convert BrokerEndPoint to record (#18577) * [PR-18240](https://github.com/apache/kafka/pull/18240) - KAFKA-17642: PreVote response handling and 
ProspectiveState (#18240) * [PR-18585](https://github.com/apache/kafka/pull/18585) - KAFKA-18413: Remove AdminZkClient (#18585) * [PR-18406](https://github.com/apache/kafka/pull/18406) - KAFKA-18318: Add logs for online/offline migration indication (#18406) * [PR-18224](https://github.com/apache/kafka/pull/18224) - KAFKA-18150; Downgrade group on classic leave of last consumer member (#18224) * [PR-18209](https://github.com/apache/kafka/pull/18209) - Infrastructure for system tests for the new share consumer client (#18209) * [PR-18553](https://github.com/apache/kafka/pull/18553) - KAFKA-18373: Remove ZkMetadataCache (#18553) * [PR-18582](https://github.com/apache/kafka/pull/18582) - KAFKA-18557 streamline codebase with testConfig() (#18582) * [PR-18573](https://github.com/apache/kafka/pull/18573) - KAFKA-18431: Remove KafkaController (#18573) * [PR-18574](https://github.com/apache/kafka/pull/18574) - KAFKA-18407: Remove ZkAdminManager, DelayedCreatePartitions, CreatePartitionsMetadata, ZkConfigRepository, DelayedDeleteTopics (#18574) * [PR-18568](https://github.com/apache/kafka/pull/18568) - KAFKA-18556: Remove JaasModule#zkDigestModule, JaasTestUtils#zkSections (#18568) * [PR-18534](https://github.com/apache/kafka/pull/18534) - KAFKA-14485: Move LogCleaner exceptions to storage module (#18534) * [PR-18565](https://github.com/apache/kafka/pull/18565) - KAFKA-18546: Use mocks instead of a real DNS lookup to the outside (#18565) * [PR-18140](https://github.com/apache/kafka/pull/18140) - KAFKA-16368: Add a new constraint for segment.bytes to min 1MB for KIP-1030 (#18140) * [PR-18106](https://github.com/apache/kafka/pull/18106) - KAFKA-16368: Update defaults for LOG_MESSAGE_TIMESTAMP_AFTER_MAX_MS_DEFAULT and NUM_RECOVERY_THREADS_PER_DATA_DIR_CONFIG (#18106) * [PR-18374](https://github.com/apache/kafka/pull/18374) - KAFKA-7776: Tests for ISO8601 in Connect value parsing (#18374) * [PR-18562](https://github.com/apache/kafka/pull/18562) - KAFKA-18558: Added check before adding previously subscribed partitions (#18562) * [PR-18535](https://github.com/apache/kafka/pull/18535) - KAFKA-18521 Cleanup NodeApiVersions zkMigrationEnabled field (#18535) * [PR-18552](https://github.com/apache/kafka/pull/18552) - KAFKA-18542 Cleanup AlterPartitionManager (#18552) * [PR-18561](https://github.com/apache/kafka/pull/18561) - KAFKA-18406 Remove ZkBrokerEpochManager.scala (#18561) * [PR-18508](https://github.com/apache/kafka/pull/18508) - KAFKA-18405 Remove ZooKeeper logic from DynamicBrokerConfig (#18508) * [PR-18080](https://github.com/apache/kafka/pull/18080) - KAFKA-16368: Update default linger.ms to 5ms for KIP-1030 (#18080) * [PR-18524](https://github.com/apache/kafka/pull/18524) - KAFKA-18514: Refactor share module code to server and server-common (#18524) * [PR-18414](https://github.com/apache/kafka/pull/18414) - KAFKA-18331: Make process.roles and node.id required configs (#18414) * [PR-18559](https://github.com/apache/kafka/pull/18559) - KAFKA-18552: Remove unnecessary version check in `testHandleOffsetFetch*` (#18559) * [PR-18483](https://github.com/apache/kafka/pull/18483) - KAFKA-18472: Remove MetadataSupport (#18483) * [PR-18342](https://github.com/apache/kafka/pull/18342) - KAFKA-18026: KIP-1112, clean up graph node grace period resolution (#18342) * [PR-18491](https://github.com/apache/kafka/pull/18491) - KAFKA-18479: Remove keepPartitionMetadataFile in UnifiedLog and LogMan… (#18491) * [PR-18365](https://github.com/apache/kafka/pull/18365) - KAFKA-18364 add document to show the changes of 
metrics and configs after removing zookeeper (#18365) * [PR-18550](https://github.com/apache/kafka/pull/18550) - KAFKA-18539 Remove optional managers in KafkaApis (#18550) * [PR-18563](https://github.com/apache/kafka/pull/18563) - Use version.py get_version to get version (#18563) * [PR-18459](https://github.com/apache/kafka/pull/18459) - KAFKA-18452: Implemented batch size in acquired records (#18459) * [PR-18448](https://github.com/apache/kafka/pull/18448) - KAFKA-18401: Transaction version 2 does not support commit transaction without records (#18448) * [PR-18490](https://github.com/apache/kafka/pull/18490) - KAFKA-18479: RocksDBTimeOrderedKeyValueBuffer not initialized correctly (#18490) * [PR-18536](https://github.com/apache/kafka/pull/18536) - KAFKA-18514 Remove server dependency on share coordinator (#18536) * [PR-18521](https://github.com/apache/kafka/pull/18521) - KAFKA-18513: Validate share state topic records produced in tests. (#18521) * [PR-18542](https://github.com/apache/kafka/pull/18542) - KAFKA-18399 Remove ZooKeeper from KafkaApis (12/N): clean up ZKMetadataCache, KafkaController and raftSupport (#18542) * [PR-18386](https://github.com/apache/kafka/pull/18386) - KAFKA-18346 Fix e2e TestKRaftUpgrade for v3.3.2 (#18386) * [PR-18530](https://github.com/apache/kafka/pull/18530) - KAFKA-18520: Remove ZooKeeper logic from JaasUtils (#18530) * [PR-18540](https://github.com/apache/kafka/pull/18540) - KAFKA-18399 Remove ZooKeeper from KafkaApis (11/N): CREATE_ACLS and DELETE_ACLS (#18540) * [PR-18432](https://github.com/apache/kafka/pull/18432) - KAFKA-18399 Remove ZooKeeper from KafkaApis (10/N): ALTER_CONFIG and INCREMENETAL_ALTER_CONFIG (#18432) * [PR-18544](https://github.com/apache/kafka/pull/18544) - Revert “KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18050)” (#18544) * [PR-18487](https://github.com/apache/kafka/pull/18487) - KAFKA-18476: KafkaStreams should swallow TransactionAbortedException (#18487) * [PR-18195](https://github.com/apache/kafka/pull/18195) - KAFKA-18026: KIP-1112, clean up StatefulProcessorNode (#18195) * [PR-18518](https://github.com/apache/kafka/pull/18518) - KAFKA-18502 Remove kafka.controller.Election (#18518) * [PR-18281](https://github.com/apache/kafka/pull/18281) - KAFKA-18330: Update documentation to remove controller deployment limitations (#18281) * [PR-18465](https://github.com/apache/kafka/pull/18465) - KAFKA-18399 Remove ZooKeeper from KafkaApis (9/N): ALTER_CLIENT_QUOTAS and ALLOCATE_PRODUCER_IDS (#18465) * [PR-18511](https://github.com/apache/kafka/pull/18511) - KAFKA-18493: Fix configure :streams:integration-tests project error (#18511) * [PR-18453](https://github.com/apache/kafka/pull/18453) - KAFKA-18399 Remove ZooKeeper from KafkaApis (8/N): ELECT_LEADERS , ALTER_PARTITION, UPDATE_FEATURES (#18453) * [PR-18525](https://github.com/apache/kafka/pull/18525) - Rename the variable to reflect its purpose (#18525) * [PR-18403](https://github.com/apache/kafka/pull/18403) - KAFKA-18211: Override class loaders for class graph scanning in connect. 
(#18403) * [PR-18500](https://github.com/apache/kafka/pull/18500) - Add DescribeShareGroupOffsets API [KIP-932] (#18500) * [PR-17669](https://github.com/apache/kafka/pull/17669) - KAFKA-17915: Convert Kafka Client system tests to use KRaft (#17669) * [PR-17901](https://github.com/apache/kafka/pull/17901) - KAFKA-18064: SASL mechanisms should throw exception on wrap/unwrap (#17901) * [PR-18507](https://github.com/apache/kafka/pull/18507) - KAFKA-18491 Remove zkClient & maybeUpdateMetadataCache from ReplicaManager (#18507) * [PR-18337](https://github.com/apache/kafka/pull/18337) - KAFKA-18274 Failed to restart controller in testing due to closed socket channel [2/2] (#18337) * [PR-18475](https://github.com/apache/kafka/pull/18475) - KAFKA-18469;KAFKA-18036: AsyncConsumer should request metadata update if ListOffsetRequest encounters a retriable error (#18475) * [PR-17728](https://github.com/apache/kafka/pull/17728) - KAFKA-17973: Relax Restriction for Voters Set Change (#17728) * [PR-18050](https://github.com/apache/kafka/pull/18050) - KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18050) * [PR-17870](https://github.com/apache/kafka/pull/17870) - KAFKA-18404: Remove partitionMaxBytes usage from DelayedShareFetch (#17870) * [PR-18504](https://github.com/apache/kafka/pull/18504) - KAFKA-18485; Update log4j2.yaml (#18504) * [PR-18433](https://github.com/apache/kafka/pull/18433) - KAFKA-18399 Remove ZooKeeper from KafkaApis (7/N): CREATE_TOPICS, DELETE_TOPICS, CREATE_PARTITIONS (#18433) * [PR-18320](https://github.com/apache/kafka/pull/18320) - KAFKA-18341: Remove KafkaConfig GroupType config check and warn log (#18320) * [PR-18480](https://github.com/apache/kafka/pull/18480) - KAFKA-18457; Update DumpLogSegments to use coordinator record json converters (#18480) * [PR-18447](https://github.com/apache/kafka/pull/18447) - KAFKA-18399 Remove ZooKeeper from KafkaApis (6/N): handleCreateTokenRequest, handleRenewTokenRequestZk, handleExpireTokenRequestZk (#18447) * [PR-18464](https://github.com/apache/kafka/pull/18464) - KAFKA-18399 Remove ZooKeeper from KafkaApis (5/N): ALTER_PARTITION_REASSIGNMENTS, LIST_PARTITION_REASSIGNMENTS (#18464) * [PR-18461](https://github.com/apache/kafka/pull/18461) - KAFKA-18399 Remove ZooKeeper from KafkaApis (4/N): OFFSET_COMMIT and OFFSET_FETCH (#18461) * [PR-18456](https://github.com/apache/kafka/pull/18456) - KAFKA-18399 Remove ZooKeeper from KafkaApis (3/N): USER_SCRAM_CREDENTIALS (#18456) * [PR-18472](https://github.com/apache/kafka/pull/18472) - KAFKA-18466 Remove log4j-1.2-api from runtime scope while keeping it in distribution package (#18472) * [PR-18404](https://github.com/apache/kafka/pull/18404) - KAFKA-18400: Don’t use YYYY when formatting/parsing dates in Java client (#18404) * [PR-18437](https://github.com/apache/kafka/pull/18437) - KAFKA-18446 Remove MetadataCacheControllerNodeProvider (#18437) * [PR-18468](https://github.com/apache/kafka/pull/18468) - KAFKA-18465: Remove MetadataVersions older than 3.0-IV1 (#18468) * [PR-18467](https://github.com/apache/kafka/pull/18467) - KAFKA-18464: Empty Abort Transaction can fence producer incorrectly with Transactions V2 (#18467) * [PR-18471](https://github.com/apache/kafka/pull/18471) - KAFKA-8116: Update Kafka Streams archetype for Java 11 (#18471) * [PR-17510](https://github.com/apache/kafka/pull/17510) - KAFKA-17792: Efficiently parse decimals with large exponents in Connect Values (#17510) * [PR-18679](https://github.com/apache/kafka/pull/18679) - KAFKA-18632: 
Added few share consumer multibroker tests. (#18679) * [82ccf75a](https://github.com/apache/kafka/commit/82ccf75ae091bffb94cbb3fd173240c48627db17) - KAFKA-18575: Transaction Version 2 doesn’t correctly handle race condition with completing and new transaction(#18604) * [94a1bfb1](https://github.com/apache/kafka/commit/94a1bfb1281f06263976b1ba8bba8c5ac5d7f2ce) - KAFKA-18575: Transaction Version 2 doesn’t correctly handle race condition with completing and new transaction(#18604) * [PR-18340](https://github.com/apache/kafka/pull/18340) - KAFKA-18339: Fix parseRequestHeader error handling (#18340) * [PR-18643](https://github.com/apache/kafka/pull/18643) - Revert “KAFKA-18404: Remove partitionMaxBytes usage from DelayedShareFetch (#17870)” (#18643) * [21c4539d](https://github.com/apache/kafka/commit/21c4539dfe1134e60a7d8680d9ea19ae48f569a3) - Revert “KAFKA-18034: CommitRequestManager should fail pending requests on fatal coordinator errors (#18050)” * [PR-18150](https://github.com/apache/kafka/pull/18150) - KAFKA-18026: KIP-1112 migrate KTableSuppressProcessorSupplier (#18150) * [0186534a](https://github.com/apache/kafka/commit/0186534a992a123a7f53dd32860c6ba5787dbb18) - Revert “KAFKA-17411: Create local state Standbys on start (#16922)” and “KAFKA-17978: Fix invalid topology on Task assignment (#17778)” * [PR-18378](https://github.com/apache/kafka/pull/18378) - KAFKA-18340: Change Dockerfile to use log4j2 yaml instead log4j properties (#18378) * [PR-18397](https://github.com/apache/kafka/pull/18397) - KAFKA-18311: Enforcing copartitioned topics (4/N) (#18397) * [PR-17454](https://github.com/apache/kafka/pull/17454) - KAFKA-17671: Create better documentation for transactions (#17454) * [PR-18455](https://github.com/apache/kafka/pull/18455) - KAFKA-18308; Update CoordinatorSerde (#18455) * [PR-18435](https://github.com/apache/kafka/pull/18435) - KAFKA-18440: Convert AuthorizationException to fatal error in AdminClient (#18435) * [PR-18458](https://github.com/apache/kafka/pull/18458) - KAFKA-18304; Introduce json converter generator (#18458) * [PR-18422](https://github.com/apache/kafka/pull/18422) - KAFKA-18399 Remove ZooKeeper from KafkaApis (2/N): CONTROLLED_SHUTDOWN and ENVELOPE (#18422) * [PR-18146](https://github.com/apache/kafka/pull/18146) - KAFKA-18073: Prevent dropped records from failed retriable exceptions (#18146) * [PR-18321](https://github.com/apache/kafka/pull/18321) - KAFKA-13093: Log compaction should write new segments with record version v2 (KIP-724) (#18321) * [PR-18100](https://github.com/apache/kafka/pull/18100) - KAFKA-18180: Move OffsetResultHolder to storage module (#18100) * [PR-17527](https://github.com/apache/kafka/pull/17527) - KAFKA-17455: fix stuck producer when throttling or retrying (#17527) * [PR-18367](https://github.com/apache/kafka/pull/18367) - KAFKA-17915: Convert remaining Kafka Client system tests to use KRaft (#18367) * [PR-18247](https://github.com/apache/kafka/pull/18247) - KAFKA-18277 Convert network_degrade_test to Kraft mode (#18247) * [PR-18175](https://github.com/apache/kafka/pull/18175) - KAFKA-17986 Fix ConsumerRebootstrapTest and ProducerRebootstrapTest (#18175) * [PR-18445](https://github.com/apache/kafka/pull/18445) - KAFKA-18445 Remove LazyDownConversionRecords and LazyDownConversionRecordsSend (#18445) * [PR-18417](https://github.com/apache/kafka/pull/18417) - KAFKA-18399 Remove ZooKeeper from KafkaApis (1/N): LEADER_AND_ISR, STOP_REPLICA, UPDATE_METADATA (#18417) * [PR-18382](https://github.com/apache/kafka/pull/18382) - KAFKA-17730 
ReplicaFetcherThreadBenchmark is broken (#18382) * [PR-18423](https://github.com/apache/kafka/pull/18423) - KAFKA-18437: Correct version of ShareUpdateRecord value (#18423) * [PR-18457](https://github.com/apache/kafka/pull/18457) - KAFKA-18397: Added null check before sending background event from ShareConsumeRequestManager. (#18419) (#18457) * [PR-18462](https://github.com/apache/kafka/pull/18462) - KAFKA-18449: Add share group state configs to reconfig-server.properties (#18440) (#18462) * [PR-18395](https://github.com/apache/kafka/pull/18395) - KAFKA-18311: Configuring repartition topics (3/N) (#18395) * [PR-18446](https://github.com/apache/kafka/pull/18446) - KAFKA-18453: Add StreamsTopology class to group coordinator (#18446) * [PR-18450](https://github.com/apache/kafka/pull/18450) - KAFKA-18435 Remove zookeeper dependencies in build.gradle (#18450) * [PR-18452](https://github.com/apache/kafka/pull/18452) - KAFKA-18111: Add Kafka Logo to README (#18452) * [PR-18438](https://github.com/apache/kafka/pull/18438) - KAFKA-18432 Remove unused code from AutoTopicCreationManager (#18438) * [PR-18436](https://github.com/apache/kafka/pull/18436) - KAFKA-18434: enrich the authorization error message of connecting to controller (#18436) * [PR-18441](https://github.com/apache/kafka/pull/18441) - KAFKA-18426 Remove FinalizedFeatureChangeListener (#18441) * [PR-18276](https://github.com/apache/kafka/pull/18276) - KAFKA-18321: Add StreamsGroupMember, MemberState and Assignment classes (#18276) * [PR-18443](https://github.com/apache/kafka/pull/18443) - KAFKA-18425 Remove OffsetTrackingListener (#18443) * [PR-18439](https://github.com/apache/kafka/pull/18439) - KAFKA-18433: Add BatchSize to ShareFetch request (1/N) (#18439) * [PR-18428](https://github.com/apache/kafka/pull/18428) - Backport some GHA changes from trunk (#18428) * [PR-18415](https://github.com/apache/kafka/pull/18415) - KAFKA-18428: Measure share consumers performance (#18415) * [PR-18419](https://github.com/apache/kafka/pull/18419) - KAFKA-18397: Added null check before sending background event from ShareConsumeRequestManager. 
(#18419) * [PR-18440](https://github.com/apache/kafka/pull/18440) - KAFKA-18449: Add share group state configs to reconfig-server.properties (#18440) * [PR-18296](https://github.com/apache/kafka/pull/18296) - KAFKA-18173 Remove duplicate assertFutureError (#18296) * [PR-18094](https://github.com/apache/kafka/pull/18094) - KAFKA-15599: Move SegmentPosition/TimingWheelExpirationService to raft module (#18094) * [PR-18329](https://github.com/apache/kafka/pull/18329) - KAFKA-18353 Remove zk config control.plane.listener.name (#18329) * [PR-18429](https://github.com/apache/kafka/pull/18429) - KAFKA-18443 Remove ZkFourLetterWords (#18429) * [PR-18431](https://github.com/apache/kafka/pull/18431) - KAFKA-18417 Remove controlled.shutdown.max.retries and controlled.shutdown.retry.backoff.ms (#18431) * [PR-18287](https://github.com/apache/kafka/pull/18287) - KAFKA-18326: fix merge iterator with cache tombstones (#18287) * [PR-18413](https://github.com/apache/kafka/pull/18413) - KAFKA-18411 Remove ZkProducerIdManager (#18413) * [PR-18421](https://github.com/apache/kafka/pull/18421) - KAFKA-18408 tweak the ‘tag’ field for BrokerHeartbeatRequest.json, BrokerRegistrationChangeRecord.json and RegisterBrokerRecord.json (#18421) * [PR-18401](https://github.com/apache/kafka/pull/18401) - KAFKA-18414 Remove KRaftRegistrationResult (#18401) * [PR-17671](https://github.com/apache/kafka/pull/17671) - KAFKA-17921 Support SASL_PLAINTEXT protocol with java.security.auth.login.config (#17671) * [PR-18411](https://github.com/apache/kafka/pull/18411) - KAFKA-18436: Revert Multiversioning Changes from 4.0 release. (#18411) * [PR-18364](https://github.com/apache/kafka/pull/18364) - KAFKA-18384 Remove ZkAlterPartitionManager (#18364) * [PR-17946](https://github.com/apache/kafka/pull/17946) - KAFKA-10790: Add deadlock detection to producer#flush (#17946) * [PR-18399](https://github.com/apache/kafka/pull/18399) - KAFKA-18412: Remove EmbeddedZookeeper (#18399) * [PR-18352](https://github.com/apache/kafka/pull/18352) - KAFKA-18368 Remove TestUtils#MockZkConnect and remove zkConnect from TestUtils#createBrokerConfig (#18352) * [PR-18396](https://github.com/apache/kafka/pull/18396) - KAFKA-18303; Update ShareCoordinator to use new record format (#18396) * [PR-18370](https://github.com/apache/kafka/pull/18370) - KAFKA-18388 test-kraft-server-start.sh should use log4j2.yaml (#18370) * [PR-17742](https://github.com/apache/kafka/pull/17742) - KAFKA-18419: KIP-891 Connect Multiversion Support (Transformation and Predicate Changes) (#17742) * [PR-18355](https://github.com/apache/kafka/pull/18355) - KAFKA-18374 Remove EncryptingPasswordEncoder, CipherParamsEncoder, GcmParamsEncoder, IvParamsEncoder, and the unused static variables in PasswordEncoder (#18355) * [PR-18379](https://github.com/apache/kafka/pull/18379) - KAFKA-18311: Configuring changelog topics (2/N) (#18379) * [PR-18318](https://github.com/apache/kafka/pull/18318) - KAFKA-18307: Don’t report on disabled/removed tests (#18318) * [PR-17801](https://github.com/apache/kafka/pull/17801) - KAFKA-17278; Add KRaft RPC compatibility tests (#17801) * [PR-18377](https://github.com/apache/kafka/pull/18377) - KAFKA-17539: Application metrics extension for share consumer (#18377) * [PR-18384](https://github.com/apache/kafka/pull/18384) - KAFKA-17616: Remove KafkaServer (#18384) * [PR-18268](https://github.com/apache/kafka/pull/18268) - KAFKA-18311: Add internal datastructure for configuring topologies (1/N) (#18268) * [PR-18343](https://github.com/apache/kafka/pull/18343) - 
KAFKA-18358: Replace Deprecated $buildDir variable in build.gradle (#18343) * [PR-18353](https://github.com/apache/kafka/pull/18353) - KAFKA-18365 Remove zookeeper.connect in Test (#18353) * [PR-18373](https://github.com/apache/kafka/pull/18373) - Use instanceof pattern to avoid explicit cast (#18373) * [PR-18270](https://github.com/apache/kafka/pull/18270) - KAFKA-18319: Add task assignor interfaces (#18270) * [PR-18259](https://github.com/apache/kafka/pull/18259) - KAFKA-18273: KIP-1099 verbose display share group options (#18259) * [PR-18363](https://github.com/apache/kafka/pull/18363) - KAFKA-18367 Remove ZkConfigManager (#18363) * [PR-18351](https://github.com/apache/kafka/pull/18351) - KAFKA-18347 Add tools-log4j2.yaml to config and remove unsed tools-log4j.properties from config (#18351) * [PR-18359](https://github.com/apache/kafka/pull/18359) - KAFKA-18375 Update the LICENSE-binary (#18359) * [PR-18345](https://github.com/apache/kafka/pull/18345) - KAFKA-18026: KIP-1112, configure all StoreBuilder & StoreFactory layers (#18345) * [PR-18232](https://github.com/apache/kafka/pull/18232) - KAFKA-12469: Deprecated and corrected topic metrics for consumer (KIP-1109) (#18232) * [PR-18254](https://github.com/apache/kafka/pull/18254) - KAFKA-17421 Add integration tests for ConsumerRecord#leaderEpoch (#18254) * [PR-18347](https://github.com/apache/kafka/pull/18347) - KAFKA-18361 Remove PasswordEncoderConfigs (#18347) * [PR-18271](https://github.com/apache/kafka/pull/18271) - KAFKA-17615 Remove KafkaServer from tests (#18271) * [PR-18308](https://github.com/apache/kafka/pull/18308) - KAFKA-18280 fix e2e TestSecurityRollingUpgrade.test_rolling_upgrade_sasl_mechanism_phase_one (#18308) * [PR-18327](https://github.com/apache/kafka/pull/18327) - KAFKA-18313 Fix to Kraft or remove tests associate with Zk Broker config in SocketServerTest and ReplicaFetcherThreadTest (#18327) * [PR-18279](https://github.com/apache/kafka/pull/18279) - KAFKA-18316 Fix to Kraft or remove tests associate with Zk Broker config in ConnectionQuotasTest (#18279) * [PR-18185](https://github.com/apache/kafka/pull/18185) - KAFKA-18243 Fix compatibility of Loggers class between log4j and log4j2 (#18185) * [PR-18269](https://github.com/apache/kafka/pull/18269) - KAFKA-18315 Fix to Kraft or remove tests associate with Zk Broker config in DynamicBrokerConfigTest, ReplicaManagerTest, DescribeTopicPartitionsRequestHandlerTest, KafkaConfigTest (#18269) * [PR-18338](https://github.com/apache/kafka/pull/18338) - KAFKA-18354 Use log4j2 APIs to refactor LogCaptureAppender (#18338) * [PR-18309](https://github.com/apache/kafka/pull/18309) - KAFKA-18314 Fix to Kraft or remove tests associate with Zk Broker config in KafkaApisTest (#18309) * [PR-18344](https://github.com/apache/kafka/pull/18344) - KAFKA-18359 Set zkConnect to null in LocalLeaderEndPointTest, HighwatermarkPersistenceTest, IsrExpirationTest, ReplicaManagerQuotasTest, OffsetsForLeaderEpochTest (#18344) * [PR-18101](https://github.com/apache/kafka/pull/18101) - KAFKA-18135: ShareConsumer HB UnsupportedVersion msg mixed with Consumer HB (#18101) * [PR-18283](https://github.com/apache/kafka/pull/18283) - KAFKA-18317 Remove zookeeper.connect from RemoteLogManagerTest (#18283) * [PR-18295](https://github.com/apache/kafka/pull/18295) - KAFKA-18339: Remove raw unversioned direct SASL protocol (KIP-896) (#18295) * [PR-18313](https://github.com/apache/kafka/pull/18313) - KAFKA-18272: Deprecated protocol api usage should be logged at info level (#18313) * 
[PR-18282](https://github.com/apache/kafka/pull/18282) - KAFKA-18295 Remove deprecated function Partitioner#onNewBatch (#18282) * [PR-18317](https://github.com/apache/kafka/pull/18317) - KAFKA-18348 Remove the deprecated MockConsumer#setException (#18317) * [PR-18324](https://github.com/apache/kafka/pull/18324) - KAFKA-18352: Add back DeleteGroups v0, it incorrectly tagged as deprecated (#18324) * [PR-18310](https://github.com/apache/kafka/pull/18310) - KAFKA-18274 Failed to restart controller in testing due to closed socket channel [1/2] (#18310) * [PR-18250](https://github.com/apache/kafka/pull/18250) - KAFKA-18093 Remove deprecated DeleteTopicsResult#values (#18250) * [PR-18312](https://github.com/apache/kafka/pull/18312) - KAFKA-18343: Use java_pids to implement pids (#18312) * [PR-18294](https://github.com/apache/kafka/pull/18294) - KAFKA-18338 add log4j.yaml to test-common-api and remove unsed log4j.properties from test-common (#18294) * [PR-18306](https://github.com/apache/kafka/pull/18306) - KAFKA-18342 Use File.exist instead of File.exists to ensure the Vagrantfile works with Ruby 3.2+ (#18306) * [PR-18246](https://github.com/apache/kafka/pull/18246) - KAFKA-18290 Remove deprecated methods of FeatureUpdate (#18246) * [PR-18255](https://github.com/apache/kafka/pull/18255) - KAFKA-18289 Remove deprecated methods of DescribeTopicsResult (#18255) * [PR-18265](https://github.com/apache/kafka/pull/18265) - KAFKA-18291 Remove deprecated methods of ListConsumerGroupOffsetsOptions (#18265) * [PR-18223](https://github.com/apache/kafka/pull/18223) - KAFKA-18278: Correct name and description for run-gradle step (#18223) * [PR-18267](https://github.com/apache/kafka/pull/18267) - KAFKA-17393: Remove log.message.format.version/message.format.version (KIP-724) (#18267) * [PR-18132](https://github.com/apache/kafka/pull/18132) - KAFKA-17705: Add Transactions V2 system tests and mark as production ready (#18132) * [PR-18291](https://github.com/apache/kafka/pull/18291) - KAFKA-18269: Remove deprecated protocol APIs support (KIP-896, KIP-724) (#18291) * [PR-18218](https://github.com/apache/kafka/pull/18218) - KAFKA-18269: Remove deprecated protocol APIs support (KIP-896, KIP-724) (#18218) * [PR-18288](https://github.com/apache/kafka/pull/18288) - KAFKA-18334: Produce v4-v6 should be undeprecated (#18288) * [PR-18262](https://github.com/apache/kafka/pull/18262) - KAFKA-18270: FindCoordinator v0 incorrectly tagged as deprecated (#18262) * [PR-18221](https://github.com/apache/kafka/pull/18221) - KAFKA-18270: SaslHandshake v0 incorrectly tagged as deprecated (#18221) * [PR-18249](https://github.com/apache/kafka/pull/18249) - KAFKA-13722: code cleanup after deprecated StateStore.init() was removed (#18249) * [PR-17687](https://github.com/apache/kafka/pull/17687) - KAFKA-15370: Support Participation in 2PC (KIP-939) (1/N) (#17687) * [PR-18285](https://github.com/apache/kafka/pull/18285) - KAFKA-18312: Added entityType: topicName to SubscribedTopicNames in ShareGroupHeartbeatRequest.json (#18285) * [PR-18261](https://github.com/apache/kafka/pull/18261) - KAFKA-18301; Make coordinator records first class citizen (#18261) * [PR-18204](https://github.com/apache/kafka/pull/18204) - KAFKA-18262 Remove DefaultPartitioner and UniformStickyPartitioner (#18204) * [PR-18257](https://github.com/apache/kafka/pull/18257) - KAFKA-18296 Remove deprecated KafkaBasedLog constructor (#18257) * [PR-18238](https://github.com/apache/kafka/pull/18238) - KAFKA-12829: Remove old Processor and ProcessorSupplier interfaces (#18238) * 
[PR-18245](https://github.com/apache/kafka/pull/18245) - KAFKA-18292 Remove deprecated methods of UpdateFeaturesOptions (#18245) * [PR-18154](https://github.com/apache/kafka/pull/18154) - KAFKA-12829: Remove deprecated Topology#addProcessor of old Processor API (#18154) * [PR-18136](https://github.com/apache/kafka/pull/18136) - KAFKA-18207: Serde for handling transaction records (#18136) * [PR-18243](https://github.com/apache/kafka/pull/18243) - KAFKA-13722: Refactor Kafka Streams store interfaces (#18243) * [PR-18241](https://github.com/apache/kafka/pull/18241) - KAFKA-17131: Refactor TimeDefinitions (#18241) * [PR-18228](https://github.com/apache/kafka/pull/18228) - KAFKA-18284: Add group coordinator records for Streams rebalance protocol (#18228) * [PR-18242](https://github.com/apache/kafka/pull/18242) - KAFKA-13722: Refactor SerdeGetter (#18242) * [PR-18176](https://github.com/apache/kafka/pull/18176) - KAFKA-18227: Ensure v2 partitions are not added to last transaction during upgrade (#18176) * [PR-18251](https://github.com/apache/kafka/pull/18251) - Add IT for share consumer with duration base offet auto reset (#18251) * [PR-18230](https://github.com/apache/kafka/pull/18230) - KAFKA-18283: Add StreamsGroupDescribe RPC definitions (#18230) * [PR-18260](https://github.com/apache/kafka/pull/18260) - KAFKA-18294 Remove deprecated SourceTask#commitRecord (#18260) * [PR-18211](https://github.com/apache/kafka/pull/18211) - KAFKA-18264 Remove NotLeaderForPartitionException (#18211) * [PR-18248](https://github.com/apache/kafka/pull/18248) - KAFKA-18094 Remove deprecated TopicListing(String, Boolean) (#18248) * [PR-18227](https://github.com/apache/kafka/pull/18227) - KAFKA-18282: Add StreamsGroupHeartbeat RPC definitions (#18227) * [PR-18205](https://github.com/apache/kafka/pull/18205) - KAFKA-18026: transition KTable#filter impl to use processor wrapper (#18205) * [PR-18244](https://github.com/apache/kafka/pull/18244) - KAFKA-18293 Remove org.apache.kafka.common.security.oauthbearer.secured.OAuthBearerLoginCallbackHandler and org.apache.kafka.common.security.oauthbearer.secured.OAuthBearerValidatorCallbackHandler (#18244) * [PR-18234](https://github.com/apache/kafka/pull/18234) - KAFKA-17960; PlaintextAdminIntegrationTest.testConsumerGroups fails with CONSUMER group protocol (#18234) * [PR-18144](https://github.com/apache/kafka/pull/18144) - KAFKA-18200; Handle empty batches in coordinator runtime (#18144) * [PR-18180](https://github.com/apache/kafka/pull/18180) - KAFKA-18237: Upgrade system tests from using 3.7.1 to 3.7.2 (#18180) * [PR-18210](https://github.com/apache/kafka/pull/18210) - KAFKA-18259: Documentation for consumer auto.offset.reset contains invalid HTML (#18210) * [PR-18207](https://github.com/apache/kafka/pull/18207) - KAFKA-18263; Group lock must be acquired when reverting static membership rejoin (#18207) * [PR-18190](https://github.com/apache/kafka/pull/18190) - KAFKA-18244: Fix empty SHA on “Pull Request Labeled” workflow (#18190) * [PR-18166](https://github.com/apache/kafka/pull/18166) - KAFKA-18226: Disable CustomQuotaCallbackTest and remove isKRaftTest (#18166) #### NOTE Ansible offers a simpler way to configure and deploy RBAC and MDS. Refer to [Ansible RBAC settings](https://docs.confluent.io/ansible/current/ansible-authorize.html) for details. To set up RBAC: * [Install Confluent Platform](../../../installation/overview.md#installation), including the `confluent-server` commercial component. 
For more information, see [Migrate Confluent Platform to Confluent Server](../../../installation/migrate-confluent-server.md#migrate-confluent-server).
* Work with your security team to evaluate the needs of the users in your organization and, based on the resources they require to perform their duties, identify which roles should be assigned to users and groups. For a description of some typical use cases and the required roles for each, refer to [RBAC role use cases](rbac-predefined-roles.md#rbac-roles-use-cases). To bootstrap RBAC, you must identify an ACL-level `super.user` in the Confluent Server broker’s `server.properties` file on the cluster that hosts MDS. This `super.user` can then assign the SystemAdmin role to another user who can create the required clusters and scope the required role bindings for users and groups. Be sure to identify which user will serve as the bootstrap `super.user`. For details, refer to [Use Predefined RBAC Roles in Confluent Platform](rbac-predefined-roles.md#rbac-predefined-roles).
* [Configure the Metadata Service (MDS)](../../../kafka/configure-mds/index.md#rbac-mds-config). The MDS implements the core RBAC functionality and [communicates with LDAP](../../../kafka/configure-mds/ldap-auth-mds.md#ldap-auth-mds) to get user and group information and to authenticate users. After configuring MDS, you can proceed with role bindings and configuration of other Confluent Platform components. Refer to [Configure the LDAP identity provider](../../../kafka/configure-mds/index.md#mds-id-provider-settings) to view an LDAP configuration for MDS.
* After you have determined which roles must be assigned to users and groups, create the appropriate [role bindings](rbac-cli-quickstart.md#rbac-rolebinding-sysadmin-role) for users to access the resources they require to perform their duties (for example, Schema Registry, ksqlDB, Connect, and Confluent Control Center).
* Confirm the user and group roles you defined using the [confluent iam rbac role-binding list](https://docs.confluent.io/confluent-cli/current/command-reference/iam/rbac/role-binding/confluent_iam_rbac_role-binding_list.html) command; see the example after this list.
* Configure Confluent Platform components to communicate with MDS for authentication and authorization. For details, see:
  * [Configure RBAC for Control Center on Confluent Platform](/control-center/current/security/c3-rbac.html)
  * [Kafka Connect and RBAC](../../../connect/rbac-index.md#connect-rbac-index)
  * [Deploy Secure ksqlDB with RBAC in Confluent Platform](ksql-rbac.md#ksql-rbac)
  * [Configure Role-Based Access Control for Schema Registry in Confluent Platform](../../../schema-registry/security/rbac-schema-registry.md#schemaregistry-rbac)
  * [Configure RBAC for REST Proxy](../../../kafka-rest/production-deployment/rest-proxy/security.md#rbac-rest-proxy-security)
  * [Configure RBAC using the REST API in Confluent Platform](rbac-config-using-rest-api.md#rbac-config-using-rest-api)
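The following is a minimal sketch of the bootstrap and verification steps described above, assuming the Confluent CLI is already authenticated against MDS. The principal `User:alice`, the MDS URL, and the cluster ID are placeholders, and flag names vary between CLI versions (newer releases use `--kafka-cluster` rather than `--kafka-cluster-id`), so check your CLI reference before copying.

```bash
# Assumption: you are logged in to MDS as the bootstrap super.user,
# for example with: confluent login --url http://mds-host:8090
# Placeholder principal and cluster ID throughout.

# Grant SystemAdmin to the user who will continue the RBAC setup.
confluent iam rbac role-binding create \
  --principal User:alice \
  --role SystemAdmin \
  --kafka-cluster-id <kafka-cluster-id>

# Confirm which role bindings the principal now has.
confluent iam rbac role-binding list \
  --principal User:alice \
  --kafka-cluster-id <kafka-cluster-id>
```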
When a role is assigned at the **cluster-level** (Kafka cluster, Schema Registry cluster, ksqlDB cluster, or Connect cluster) it means that users who are assigned this role have access to all resources in a cluster. For example, the `ClusterAdmin` of a Kafka cluster has access to Confluent Control Center alerts. There are corresponding resource types for each cluster type. For example, you can assign the `ResourceOwner` role to the resource types `KsqlCluster:ksql-cluster` or `Cluster:kafka-cluster` to provide a user all the `ResourceOwner` privileges for a ksqlDB or Kafka cluster. When a role is assigned at the **resource-level** it means that users assigned this role only have access to specific resources as defined in the role binding. The resource types for which you can assign RBAC roles and role bindings are: - Kafka cluster - Topic - Consumer group - TransactionalID - Schema Registry cluster - Schema Registry subject - ksqlDB cluster - Connect cluster - Connector - Kek (CSFLE key resource in Schema Registry) Confluent Platform provides the following predefined roles: | Role Name | Level of Role Scope | View Role Bindings of Others | Manage Role Bindings | Monitor | Resource Read | Resource Write | Resource Manage | |-------------------|-----------------------|--------------------------------|------------------------|-----------|-----------------|------------------|------------------------| | `super.user` | Cluster | Yes | Yes | Yes | Yes | Yes | Yes | | `SystemAdmin` | Cluster | Yes | Yes | Yes | Yes | Yes | Yes | | `ClusterAdmin` | Cluster | No | No | Yes | No | No | Yes | | `UserAdmin` | Cluster | Yes | Yes | No | No | No | No | | `SecurityAdmin` | Cluster | Yes | No | No | No | No | No | | `AuditAdmin` | Cluster | No | No | No | No | No | Yes [(1)](#role-notes) | | `Operator` | Cluster | No | No | Yes | No | No | Yes [(2)](#role-notes) | | `ResourceOwner` | Resource | Yes | Yes | No | Yes | Yes | Yes | | `DeveloperRead` | Resource | No | No | No | Yes | No | No | | `DeveloperWrite` | Resource | No | No | No | No | Yes | No | | `DeveloperManage` | Resource | No | No | No | No | No | Yes | Notes: : 1. The AuditAdmin role provides sufficient access for creating and managing the audit log configuration. 2. For Operator Resource Manage, Operators can only pause, resume, and scale Connectors. super.user : The purpose of `super.user` is to have a bootstrap user who can initially grant another user the `SystemAdmin` role. Technically speaking, `super.user` is not a predefined role. It is a `server.properties` attribute that defines a user who has full access to all resources within a Metadata Service (MDS) cluster. A `super.user` has no access to resources in other clusters (unless also configured as a `super.user` on other clusters). The primary use of super.user is to bootstrap Confluent Platform and assign a SystemAdmin. On MDS clusters, `super.user` can create role bindings for all other clusters. Permissions granted by `super.user` apply only to the broker where the `super.user` attribute is specified, and not to other brokers, clusters, or Confluent Platform components. No authorization is enforced on users defined as `super.user`. It is strongly recommended that this role is assigned only to a limited number of users (for example, 1-2 users who are responsible for bootstrapping). SystemAdmin : Provides full access to all scoped [resources](overview.md#rbac-resource) in the cluster (ksqlDB cluster, Kafka cluster, or Schema Registry cluster). 
It is strongly recommended that this role is assigned only to a limited number of users (one or two per cluster) who need full permission for initial setup or to address urgent issues when absolutely necessary in production instances. You may wish to assign this role more liberally in small test and development use cases, or when working in ksqlDB clusters that are primarily single tenant. Otherwise, it is recommended that you do not assign this role. ClusterAdmin : Sets up clusters (ksqlDB cluster, Kafka cluster, or Schema Registry cluster). Responsible for setting up and managing Kafka clusters, brokers, networking, ksqlDB clusters, Connect clusters, and adding or removing nodes and performing upgrades. The `ClusterAdmin` typically creates topics and sets the properties of those topics, for example performance and capacity, but cannot read or write to topics, and has no access to data. For monitoring applications, it is recommended that this role is delegated to the operator who monitors your applications. Typically, the `ClusterAdmin` user does not have knowledge of the content of the cluster data and delegates ownership responsibility for those resources to users assigned the `ResourceOwner` role. For example, after creating topics the `ClusterAdmin` can set ownership to a specific user familiar with the topic data. UserAdmin : Manages role bindings for users and groups in all clusters managed by MDS. Manages users and groups in a cluster, including the mapping of users and groups to roles. Has no access to any other resources. Typically, users with the `UserAdmin` role are tasked with setting up access to [resources](overview.md#rbac-resource). Users granted this role should be extremely trustworthy because they can grant roles to themselves and others. You can monitor the actions of the `UserAdmin` using audit logs. SecurityAdmin : Enables management of platform-wide security initiatives. Sets up security-related features (for example, encryption, tracking of audit logs, and watching for abnormal behavior). Provides a dedicated set of users for the initial setup and ongoing management of security functions. AuditAdmin : Users or groups assigned this role on the MDS cluster and every registered Kafka cluster can manage the audit log configuration using the [Confluent Metadata API](mds-api.md#mds-api). Operator : Provides operational management of clusters and scales applications as needed. Monitors the health of applications and clusters, including monitoring uptime. This role cannot create applications, nor does it allow you to view or edit the content of the topics. However, you can view what topics and partitions exist. ResourceOwner : Transfers the ownership of critical resources and scales the ability to manage authorizations for those resources. Owns the [resource](overview.md#rbac-resource) and has full access to it, including read, write, and list. ResourceOwner can grant permission to others who need access to resources. The owner cannot change some configurations, for example, the number of partitions. Must own the resource to grant others access to it. Enables scaling of authorization for critical resources. DeveloperRead, DeveloperWrite, DeveloperManage : Allows developers to drive the implementation of applications they are working on and manage the content. ### Load the Connector This quick start assumes that security is not configured for HDFS and Hive metastore.
To make the necessary security configurations, see [Secure HDFS and Hive Metastore](#dataproc-secure-hdfs-hive-metastore). First, start all the necessary services using the Confluent CLI. ```bash confluent local start ``` Next, start the Avro console producer to import a few records to Kafka: ```bash ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test_dataproc \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}' ``` Then in the console producer, type: ```bash {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} ``` The three records entered are published to the Kafka topic `test_dataproc` in Avro format. Before starting the connector, make sure that the configurations in `etc/gcp-dataproc-sink-quickstart.properties` are properly set to match your Dataproc configuration. For example, `$home` is replaced by your home directory path; `YOUR-PROJECT-ID`, `YOUR-CLUSTER-REGION`, and `YOUR-CLUSTER-NAME` are replaced by your respective values. Then, start the connector by loading its configuration with the following command. ```bash confluent local load dataproc-sink --config etc/gcp-dataproc-sink-quickstart.properties { "name": "dataproc-sink", "config": { "topics": "test_dataproc", "tasks.max": "1", "flush.size": "3", "connector.class": "io.confluent.connect.gcp.dataproc.DataprocSinkConnector", "gcp.dataproc.credentials.path": "/home/user/credentials.json", "gcp.dataproc.projectId": "dataproc-project-id", "gcp.dataproc.region": "us-west1", "gcp.dataproc.cluster": "dataproc-cluster-name", "confluent.license": "", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "name": "dataproc-sink" }, "tasks": [], "type": "sink" } ``` To check that the connector started successfully, view the Connect worker’s log by running: ```bash confluent local services connect log ``` Towards the end of the log you should see that the connector starts, logs a few messages, and then exports data from Kafka to HDFS. After the connector finishes ingesting data to HDFS, check that the data is available in HDFS. From the HDFS namenode in Dataproc: ```bash hadoop fs -ls /topics/test_dataproc/partition=0 ``` You should see a file with the name `/topics/test_dataproc/partition=0/test_dataproc+0+0000000000+0000000002.avro` The file name is encoded as `topic+kafkaPartition+startOffset+endOffset.format`. You can use `avro-tools-1.9.1.jar` (available in [Apache mirrors](https://archive.apache.org/dist/avro/avro-1.9.1/java/avro-tools-1.9.1.jar)) to extract the content of the file. Run `avro-tools` directly on Hadoop as: ```bash hadoop jar avro-tools-1.9.1.jar tojson \ hdfs:///topics/test_dataproc/partition=0/test_dataproc+0+0000000000+0000000002.avro ``` where “” is the HDFS Namenode hostname. Usually, the Namenode hostname is your cluster name with a “-m” suffix.
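If you want to spot-check everything the connector has written so far, a small helper loop like the following can be used. This is only a sketch: it assumes you run it on the Dataproc namenode with `avro-tools-1.9.1.jar` in the working directory, and it simply prints the record count of each Avro file for the topic.

```bash
# Sketch: count the records in every Avro file the connector wrote for this topic.
# Assumes avro-tools-1.9.1.jar is in the working directory on the namenode.
for f in $(hadoop fs -ls -C /topics/test_dataproc/partition=0/*.avro); do
  echo "== $f"
  hadoop jar avro-tools-1.9.1.jar tojson "hdfs://$f" | wc -l
done
```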
Or, if you experience issues, first copy the avro file from HDFS to the local filesystem and try again with Java: ```bash hadoop fs -copyToLocal /topics/test_dataproc/partition=0/test_dataproc+0+0000000000+0000000002.avro \ /tmp/test_dataproc+0+0000000000+0000000002.avro java -jar avro-tools-1.9.1.jar tojson /tmp/test_dataproc+0+0000000000+0000000002.avro ``` You should see the following output: ```bash {"f1":"value1"} {"f1":"value2"} {"f1":"value3"} ``` Finally, stop the Connect worker as well as all the rest of Confluent Platform by running: ```bash confluent local stop ``` or stop all the services and additionally wipe out any data generated during this quick start by running: ```bash confluent local destroy ``` # Comma separate input topic list topics=connect-test ``` Note that the configuration contains similar settings to the file source. A key difference is that multiple input topics are specified with `topics` whereas the file source allows for only one output topic specified with `topic`. Now start the FileStreamSinkConnector. The sink connector will run within the same worker as the source connector, but each connector task will have its own dedicated thread. ```bash confluent local load file-sink { "name": "file-sink", "config": { "connector.class": "FileStreamSink", "tasks.max": "1", "file": "test.sink.txt", "topics": "connect-test", "name": "file-sink" }, "tasks": [] } ``` To ensure the sink connector is up and running, use the following command to get the state of the connector: ```bash confluent local status file-sink { "name": "file-sink", "connector": { "state": "RUNNING", "worker_id": "192.168.10.1:8083" }, "tasks": [ { "state": "RUNNING", "id": 0, "worker_id": "192.168.10.1:8083" } ] } ``` as well as the list of all loaded connectors: ```bash confluent local status connectors [ "file-source", "file-sink" ] ``` By opening the file `test.sink.txt` you should see the two log lines written to it by the sink connector. With both connectors running, you can see data flowing end-to-end in real time. To check this out, use another terminal to tail the output file: ```bash tail -f test.sink.txt ``` and in a different terminal start appending additional lines to the text file: ```bash for i in {4..1000}; do echo "log line $i"; done >> test.txt ``` You should see the lines being added to `test.sink.txt`. The new data was picked up by the source connector, written to Kafka, read by the sink connector from Kafka, and finally appended to the file. ```bash "log line 1" "log line 2" "log line 3" "log line 4" "log line 5" ... ``` After you are done experimenting with reading from and writing to a file with Connect, you have a few options with respect to shutting down the connectors: * Unload the connectors but leave the Connect worker running. ```bash confluent local unload file-source confluent local unload file-sink ``` * Stop the Connect worker altogether. ```bash confluent local services connect stop Stopping Connect Connect is [DOWN] ``` * Stop the Connect worker as well as all the rest Confluent services. ```bash confluent local stop ``` Your output should resemble: ```none ksqlDB Server is [DOWN] Connect is [DOWN] Kafka REST is [DOWN] Schema Registry is [DOWN] Kafka is [DOWN] KRaft Controller is [DOWN] ``` * Stop all the services and wipe out any data of this particular run of Confluent services. 
```bash confluent local destroy ``` Your output should resemble: ```bash ksqlDB Server is [DOWN] Connect is [DOWN] Kafka REST is [DOWN] Schema Registry is [DOWN] Kafka is [DOWN] KRaft Controller is [DOWN] Deleting: /var/folders/ty/rqbqmjv54rg_v10ykmrgd1_80000gp/T/confluent.PkQpsKfE ``` Both source and sink connectors can track offsets, so you can start and stop the process any number of times and add more data to the input file and both will resume where they previously left off. The connectors demonstrated in this tutorial are intentionally simple so no additional dependencies are necessary. Most connectors will require a bit more configuration to specify how to connect to the source or sink system and what data to copy, and for many you will want to execute on a Kafka Connect cluster for scalability and fault tolerance. To get started with Kafka Connect you’ll want to see the [user guide](/kafka-connectors/self-managed/userguide.html) for more details on running and managing Kafka Connect, including how to run in distributed mode. The [Connectors](/kafka-connectors/self-managed/supported.html) section includes details on configuring and deploying the connectors that ship with Confluent Platform. #### NOTE If you’re deploying with Docker, you can skip setting `ksql.connect.worker.config`. ksqlDB will look for environment variables prefixed with `KSQL_CONNECT_`. If it finds any, it will remove the `KSQL_` prefix and place them into a Connect configuration file. Embedded mode will use that configuration file. This is a convenience to avoid creating and mounting a separate configuration file. To get started, here is a Docker Compose example with a server configured for embedded mode. All `KSQL_` environment variables are converted automatically to server configuration properties. Any connectors installed on your host at `confluent-hub-components` are loaded. Save this in a file named `docker-compose.yml`: ```yaml version: '2' services: broker: image: confluentinc/cp-enterprise-kafka:8.1.0 hostname: broker container_name: broker ports: - "29092:29092" environment: KAFKA_BROKER_ID: 1 KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092 KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0 KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1 KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1 schema-registry: image: confluentinc/cp-schema-registry:8.1.0 hostname: schema-registry container_name: schema-registry depends_on: - broker ports: - "8081:8081" environment: SCHEMA_REGISTRY_HOST_NAME: schema-registry ksqldb-server: image: confluentinc/ksqldb-server:8.1.0 hostname: ksqldb-server container_name: ksqldb-server depends_on: - broker - schema-registry ports: - "8088:8088" volumes: - "./confluent-hub-components:/usr/share/kafka/plugins" environment: KSQL_LISTENERS: "http://0.0.0.0:8088" KSQL_BOOTSTRAP_SERVERS: "broker:9092" KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true" KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true" # Configuration to embed Kafka Connect support. 
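      # Each KSQL_CONNECT_* variable below is copied (with the KSQL_ prefix removed, as
      # described in the note above) into the Connect worker configuration that ksqlDB
      # generates for embedded mode; for example, KSQL_CONNECT_GROUP_ID corresponds to
      # the embedded worker's group.id setting.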
KSQL_CONNECT_GROUP_ID: "ksql-connect-cluster" KSQL_CONNECT_BOOTSTRAP_SERVERS: "broker:9092" KSQL_CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.storage.StringConverter" KSQL_CONNECT_VALUE_CONVERTER: "io.confluent.connect.avro.AvroConverter" KSQL_CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE: "false" KSQL_CONNECT_CONFIG_STORAGE_TOPIC: "ksql-connect-configs" KSQL_CONNECT_OFFSET_STORAGE_TOPIC: "ksql-connect-offsets" KSQL_CONNECT_STATUS_STORAGE_TOPIC: "ksql-connect-statuses" KSQL_CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_PLUGIN_PATH: "/usr/share/kafka/plugins" ksqldb-cli: image: confluentinc/ksqldb-cli:8.1.0 container_name: ksqldb-cli depends_on: - broker - ksqldb-server entrypoint: /bin/sh tty: true ``` Bring up the stack with: ```bash docker-compose up ``` ### Resources Users access and perform operations on specific Kafka and Confluent Platform resources. A resource can be a cluster, group, Kafka topic, transactional ID, or Delegation token. ACLs specify which users can access a specified resource and the operations they can perform on that resource. Within Kafka, resources include: Cluster : The Kafka cluster. To run operations that impact the entire Kafka cluster, such as a controlled shutdown or creating new topics, must be assigned privileges on the cluster resource. Delegation Token : Delegation tokens are shared secrets between Apache Kafka® brokers and clients. Authentication based on delegation tokens is a lightweight authentication mechanism that you can use to complement existing SASL/SSL methods. Refer to [Use Delegation Tokens for Authentication in Confluent Platform](../../authentication/delegation-tokens/overview.md#kafka-sasl-delegate-auth) for more details. Group : Groups in the brokers. All protocol calls that work with groups, such as joining a group, must have corresponding privileges with the group in the subject. Group (`group.id`) includes Consumer Group, Stream Group (`application.id`), Connect Worker Group, or any other group that uses the Consumer Group protocol, like Schema Registry cluster. When using the `kafka-acls` command’s `--group` flag with a wildcard, you must encapsulate the wildcard with quotes. Failure to do this can result in unexpected results. Topic : All Kafka messages are organized into topics (and partitions). To access a topic, you must have a corresponding operation (such as READ or WRITE) defined in an ACL. When using the `kafka-acls` command’s `--topic` flag with a wildcard, you must encapsulate the wildcard with quotes. Failure to do this can result in unexpected results. Transactional ID : A transactional ID (`transactional.id`) identifies a single producer instance across application restarts and provides a way to ensure a single writer; this is necessary for exactly-once semantics (EOS). Only one producer can be active for each `transactional.id`. When a producer starts, it first checks whether or not there is a pending transaction by a producer with its own `transactional.id`. If there is, then it waits until the transaction has finished (abort or commit). This guarantees that the producer always starts from a consistent state. When used, a producer must be able to manipulate transactional IDs and have all the permissions set. 
For example, the following ACL allows all users in the system access to an EOS producer: ```shell kafka-acls --bootstrap-server localhost:9092 \ --command-config adminclient-configs.conf \ --add \ --transactional-id * \ --allow-principal User:* \ --operation write ``` In cases where you need to create ACLs for a Kafka cluster to allow Streams exactly once (EOS) processing: ```shell # Allow Streams EOS: kafka-acls ... --add \ --allow-principal User:team1 \ --operation WRITE \ --operation DESCRIBE \ --transactional-id team1-streams-app1 \ --resource-pattern-type prefixed ``` For additional information about the role of transactional IDs, refer to [Transactions in Apache Kafka](https://www.confluent.io/blog/transactions-apache-kafka). The [Operations](#acl-operations) available to a user depend on the resources to which the user has been granted access. All resources have a unique resource identifier. For example, for the topic resource type, the resource identity is the topic name, and for the group resource type, the resource identity is the group name. You can view the ACLs for a specific resource using the `--list` option. For example, to view all ACLs for the topic `test-topic` run the following command: ```shell kafka-acls --bootstrap-server localhost:9092 \ --command-config adminclient-configs.conf \ --list \ --topic test-topic ``` ### ksqlDB In this example, ksqlDB is authenticated and authorized to connect to the secured Kafka cluster, and it is already running queries as defined in the [ksqlDB command file](https://github.com/confluentinc/cp-demo/tree/latest/scripts/ksqlDB/statements.sql). Its embedded producer is configured to be idempotent, exactly-once in order semantics per partition (in the event of an error that causes a producer retry, the same message—which is still sent by the producer multiple times—will only be written to the Kafka log on the broker once). 1. In the navigation bar, click **ksqlDB**. 2. From the list of ksqlDB applications, select `wikipedia`. ![image](tutorials/cp-demo/images/ksql_link.png) 3. View the ksqlDB Flow to see the streams and tables created in the example, and how they relate to one another. ![image](tutorials/cp-demo/images/ksqldb_flow.png) 4. Use Confluent Control Center to interact with ksqlDB, or run ksqlDB CLI to get to the ksqlDB CLI prompt. ```bash docker compose exec ksqldb-cli bash -c 'ksql -u ksqlDBUser -p ksqlDBUser http://ksqldb-server:8088' ``` 5. View the existing ksqlDB streams. (If you are using the ksqlDB CLI, at the `ksql>` prompt, type `SHOW STREAMS;`) ![image](tutorials/cp-demo/images/ksql_streams_list.png) 6. Click **WIKIPEDIA** to describe the schema (fields or columns) of an existing ksqlDB stream. (If you are using the ksqlDB CLI, at the `ksql>` prompt, type `DESCRIBE WIKIPEDIA;`) ![image](tutorials/cp-demo/images/wikipedia_describe.png) 7. View the existing ksqlDB tables. (If you are using the ksqlDB CLI, at the `ksql>` prompt, type `SHOW TABLES;`). One table is called `WIKIPEDIA_COUNT_GT_1`, which counts occurrences within a [tumbling window](../../ksqldb/concepts/time-and-windows-in-ksqldb-queries.md#ksqldb-time-and-windows-tumbling-window). ![image](tutorials/cp-demo/images/ksql_tables_list.png) 8. View the existing ksqlDB queries, which are continuously running. (If you are using the ksqlDB CLI, at the `ksql>` prompt, type `SHOW QUERIES;`). ![image](tutorials/cp-demo/images/ksql_queries_list.png) 9. View messages from different ksqlDB streams and tables. 
Click on your stream of choice and then click **Query stream** to open the Query Editor. The editor shows a pre-populated query, like `select * from WIKIPEDIA EMIT CHANGES;`, and it shows results for newly arriving data. ![image](tutorials/cp-demo/images/ksql_query_topic.png) 10. Click **ksqlDB Editor** and run the `SHOW PROPERTIES;` statement. You can see the configured ksqlDB server properties and check these values with the [docker-compose.yml](https://github.com/confluentinc/cp-demo/tree/latest/docker-compose.yml) file. ![image](tutorials/cp-demo/images/ksql_properties.png) 11. The [ksqlDB processing log](../../ksqldb/reference/processing-log.md#ksqldb-reference-processing-log) captures per-record errors during processing to help developers debug their ksqlDB queries. In this example, the processing log uses mutual TLS (mTLS) authentication, as configured in the custom [log4j properties file](https://github.com/confluentinc/cp-demo/tree/latest/scripts/helper/log4j-secure.properties), to write entries into a Kafka topic. To see it in action, in the ksqlDB editor run the following “bad” query for 20 seconds: ```bash SELECT 1/0 FROM wikipedia EMIT CHANGES; ``` No records should be returned from this query. ksqlDB writes errors into the processing log for each record. View the processing log topic `ksql-clusterksql_processing_log` with topic inspection (jump to offset 0/partition 0) or the corresponding ksqlDB stream `KSQL_PROCESSING_LOG` with the ksqlDB editor (set `auto.offset.reset=earliest`). ```bash SELECT * FROM KSQL_PROCESSING_LOG EMIT CHANGES; ``` ## Data governance with Schema Registry All the applications and connectors used in this example are configured to automatically read and write Avro-formatted data, leveraging the [Confluent Schema Registry](../../schema-registry/index.md#schemaregistry-intro). The security in place between Schema Registry and the end clients, e.g. `appSA`, is as follows: - Encryption: TLS, e.g. client has `schema.registry.ssl.truststore.*` configurations - Authentication: bearer token authentication from HTTP basic auth headers, e.g. client has `basic.auth.user.info` and `basic.auth.credentials.source` configurations - Authorization: Schema Registry uses the bearer token with RBAC to authorize the client 1. View the Schema Registry subjects for topics that have registered schemas for their keys and/or values. Notice the `curl` arguments include (a) TLS information required to interact with Schema Registry which is listening for HTTPS on port 8085, and (b) authentication credentials required for RBAC (using superUser:superUser to see all of them). ```text docker exec schemaregistry curl -s -X GET \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ -u superUser:superUser \ https://schemaregistry:8085/subjects | jq . ``` Your output should resemble: ```JSON [ "WIKIPEDIA_COUNT_GT_1-value", "wikipedia-activity-monitor-KSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition-value", "wikipedia.parsed.replica-value", "WIKIPEDIABOT-value", "WIKIPEDIANOBOT-value", "_confluent-ksql-ksql-clusterquery_CTAS_WIKIPEDIA_COUNT_GT_1_7-Aggregate-GroupBy-repartition-value", "wikipedia.parsed.count-by-domain-value", "wikipedia.parsed-value", "_confluent-ksql-ksql-clusterquery_CTAS_WIKIPEDIA_COUNT_GT_1_7-Aggregate-Aggregate-Materialize-changelog-value" ] ``` 2. 
Instead of using the superUser credentials, now use client credentials noexist:noexist (user does not exist in LDAP) to try to register a new Avro schema (a record with two fields `username` and `userid`) into Schema Registry for the value of a new topic `users`. It should fail due to an authorization error. ```text docker compose exec schemaregistry curl -X POST \ -H "Content-Type: application/vnd.schemaregistry.v1+json" \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ --data '{ "schema": "[ { \"type\":\"record\", \"name\":\"user\", \"fields\": [ {\"name\":\"userid\",\"type\":\"long\"}, {\"name\":\"username\",\"type\":\"string\"} ]} ]" }' \ -u noexist:noexist \ https://schemaregistry:8085/subjects/users-value/versions ``` Your output should resemble: ```JSON {"error_code":401,"message":"Unauthorized"} ``` 3. Instead of using credentials for a user that does not exist, now use the client credentials appSA:appSA (the user appSA exists in LDAP) to try to register a new Avro schema (a record with two fields `username` and `userid`) into Schema Registry for the value of a new topic `users`. It should fail due to an authorization error, with a different message than above. ```text docker compose exec schemaregistry curl -X POST \ -H "Content-Type: application/vnd.schemaregistry.v1+json" \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ --data '{ "schema": "[ { \"type\":\"record\", \"name\":\"user\", \"fields\": [ {\"name\":\"userid\",\"type\":\"long\"}, {\"name\":\"username\",\"type\":\"string\"} ]} ]" }' \ -u appSA:appSA \ https://schemaregistry:8085/subjects/users-value/versions ``` Your output should resemble: ```JSON {"error_code":40301,"message":"User is denied operation Write on Subject: users-value"} ``` 4. Create a role binding for the `appSA` client permitting it access to Schema Registry. Get the Kafka cluster ID: ```none KAFKA_CLUSTER_ID=$(curl -s https://localhost:8091/v1/metadata/id --tlsv1.2 --cacert scripts/security/snakeoil-ca-1.crt | jq -r ".id") ``` Create the role binding: ```text # Create the role binding for the subject ``users-value``, i.e., the topic-value (versus the topic-key) docker compose exec tools bash -c "confluent iam rbac role-binding create \ --principal User:appSA \ --role ResourceOwner \ --resource Subject:users-value \ --kafka-cluster-id $KAFKA_CLUSTER_ID \ --schema-registry-cluster schema-registry" ``` 5. Again try to register the schema. It should pass this time. Note the schema id that it returns, e.g. below schema id is `9`. ```text docker compose exec schemaregistry curl -X POST \ -H "Content-Type: application/vnd.schemaregistry.v1+json" \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ --data '{ "schema": "[ { \"type\":\"record\", \"name\":\"user\", \"fields\": [ {\"name\":\"userid\",\"type\":\"long\"}, {\"name\":\"username\",\"type\":\"string\"} ]} ]" }' \ -u appSA:appSA \ https://schemaregistry:8085/subjects/users-value/versions ``` Your output should resemble: ```JSON {"id":9} ``` 6. View the new schema for the subject `users-value`. From Confluent Control Center, click **Topics**. Scroll down to and click on the topic users and select “SCHEMA”. ![image](tutorials/cp-demo/images/schema1.png) You may alternatively request the schema via the command line: ```text docker exec schemaregistry curl -s -X GET \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ -u appSA:appSA \ https://schemaregistry:8085/subjects/users-value/versions/1 | jq . 
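# Note: dropping the trailing /1 (GET .../subjects/users-value/versions) lists all registered version numbers for this subject.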
``` Your output should resemble: ```JSON { "subject": "users-value", "version": 1, "id": 9, "schema": "{\"type\":\"record\",\"name\":\"user\",\"fields\":[{\"name\":\"username\",\"type\":\"string\"},{\"name\":\"userid\",\"type\":\"long\"}]}" } ``` 7. Describe the topic `users`. Notice that it has a special configuration `confluent.value.schema.validation=true` which enables [Schema Validation](../../schema-registry/schema-validation.md#schema-validation), a data governance feature in Confluent Server that gives operators a centralized location within the Kafka cluster itself to enforce data format correctness. Enabling Schema ID Validation allows brokers configured with `confluent.schema.registry.url` to validate that data produced to the topic is using a valid schema. ```bash docker compose exec kafka1 kafka-topics \ --describe \ --topic users \ --bootstrap-server kafka1:9091 \ --command-config /etc/kafka/secrets/client_sasl_plain.config ``` Your output should resemble: ```bash Topic: users PartitionCount: 2 ReplicationFactor: 2 Configs: confluent.value.schema.validation=true Topic: users Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2 Offline: Topic: users Partition: 1 Leader: 2 Replicas: 2,1 Isr: 2,1 Offline: ``` 8. Now produce a non-Avro message to this topic using `kafka-console-producer`. ```bash docker compose exec connect kafka-console-producer \ --topic users \ --broker-list kafka1:11091 \ --producer-property security.protocol=SSL \ --producer-property ssl.truststore.location=/etc/kafka/secrets/kafka.appSA.truststore.jks \ --producer-property ssl.truststore.password=confluent \ --producer-property ssl.keystore.location=/etc/kafka/secrets/kafka.appSA.keystore.jks \ --producer-property ssl.keystore.password=confluent \ --producer-property ssl.key.password=confluent ``` After starting the console producer, it will wait for input. Enter a few characters and press enter. It should result in a failure with an error message that resembles: ```bash ERROR Error when sending message to topic users with key: null, value: 5 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.InvalidRecordException: This record has failed the validation on broker and hence be rejected. ``` Close the console producer by entering `CTRL+C`. 9. Describe the topic `wikipedia.parsed`, which is the topic that the kafka-connect-sse source connector is writing to. Notice that it also has enabled Schema ID Validation. ```bash docker compose exec kafka1 kafka-topics \ --describe \ --topic wikipedia.parsed \ --bootstrap-server kafka1:9091 \ --command-config /etc/kafka/secrets/client_sasl_plain.config ``` 10. Next step: Learn more about Schema Registry with the [Schema Registry Tutorial](../../schema-registry/schema_registry_onprem_tutorial.md#schema-registry-tutorial). ## Quick Start Use this quick start to get up and running with the Confluent Cloud Amazon Redshift Sink connector. The quick start provides the basics of selecting the connector and configuring it to stream events to Amazon Redshift. Prerequisites : - Authorized access to a [Confluent Cloud](https://www.confluent.io/confluent-cloud/) cluster on Amazon Web Services. - The Confluent CLI installed and configured for the cluster. See [Install the Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html). - [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). 
See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information. - The Amazon Redshift database must be in the same region as your Confluent Cloud cluster. - For networking considerations, see [Networking and DNS](overview.md#connect-internet-access-resources). To use a set of public egress IP addresses, see [Public Egress IP Addresses for Confluent Cloud Connectors](static-egress-ip.md#cc-static-egress-ips). - The connector configuration requires a Redshift user (and password) with Redshift database privileges. For example: ```sql CREATE DATABASE ; CREATE USER PASSWORD ''; GRANT USAGE ON SCHEMA public TO ; GRANT CREATE ON SCHEMA public TO ; GRANT SELECT ON ALL TABLES IN SCHEMA public TO ; GRANT ALL ON SCHEMA public TO ; GRANT CREATE ON DATABASE TO ; ``` For additional information, see the [Redshift docs](https://docs.aws.amazon.com/redshift/latest/gsg/database-tasks.html). - Kafka cluster credentials. The following lists the different ways you can provide credentials. - Enter an existing [service account](service-account.md#s3-cloud-service-account) resource ID. - Create a Confluent Cloud [service account](service-account.md#s3-cloud-service-account) for the connector. Make sure to review the ACL entries required in the [service account documentation](service-account.md#s3-cloud-service-account). Some connectors have specific ACL requirements. - Create a Confluent Cloud API key and secret. To create a key and secret, you can use [confluent api-key create](https://docs.confluent.io/confluent-cli/current/command-reference/api-key/confluent_api-key_create.html) *or* you can autogenerate the API key and secret directly in the Cloud Console when setting up the connector. ### Standalone Cluster 1. Create `my-connect-standalone.properties` in the config directory, whose contents look like the following (note the security configs with `consumer.*` and `producer.*` prefixes). ```bash bootstrap.servers= # The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will # need to configure these based on the format they want their data in when loaded from or stored into Kafka key.converter=org.apache.kafka.connect.json.JsonConverter value.converter=org.apache.kafka.connect.json.JsonConverter # Converter-specific settings can be passed in by prefixing the Converter's setting with the converter you want to apply # it to key.converter.schemas.enable=false value.converter.schemas.enable=false # The internal converter used for offsets and config data is configurable and must be specified, but most users will # always want to use the built-in default. Offset and config data is never visible outside of Kafka Connect in this format. 
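# (Note: recent Kafka Connect releases deprecate these internal.*.converter settings; on those versions they can simply be omitted.)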
internal.key.converter=org.apache.kafka.connect.json.JsonConverter internal.value.converter=org.apache.kafka.connect.json.JsonConverter internal.key.converter.schemas.enable=false internal.value.converter.schemas.enable=false # Store offsets on local filesystem offset.storage.file.filename=/tmp/connect.offsets # Flush much faster than normal, which is useful for testing/debugging offset.flush.interval.ms=10000 ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="" password=""; security.protocol=SASL_SSL consumer.ssl.endpoint.identification.algorithm=https consumer.sasl.mechanism=PLAIN consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="" password=""; consumer.security.protocol=SASL_SSL producer.ssl.endpoint.identification.algorithm=https producer.sasl.mechanism=PLAIN producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="" password=""; producer.security.protocol=SASL_SSL # Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins # (connectors, converters, transformations). plugin.path=/usr/share/java,/Users//confluent-6.2.1/share/confluent-hub-components ``` 2. (Optional) Add the configs to `my-connect-standalone.properties` to connect to Confluent Cloud Schema Registry per the example in [connect-ccloud.delta](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs/connect-ccloud.delta) on GitHub at [ccloud/examples/template_delta_configs](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs). ```bash # Confluent Schema Registry for Kafka Connect value.converter=io.confluent.connect.avro.AvroConverter value.converter.basic.auth.credentials.source=USER_INFO value.converter.schema.registry.basic.auth.user.info=: value.converter.schema.registry.url=https:// ``` In addition to the above settings shown in the referenced GitHub example, add these key and value converter configurations to provide valid credentials. ```bash "key.converter": "io.confluent.connect.avro.AvroConverter", "value.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "${file:/data/confluent-cloud/server.properties:SCHEMA_REGISTRY_URL}", "key.converter.schema.registry.basic.auth.credentials.source":"USER_INFO", "key.converter.schema.registry.basic.auth.user.info": "${file:/data/confluent-cloud/server.properties:BASIC_AUTH_INFO}", "value.converter.schema.registry.url": "${file:/data/confluent-cloud/server.properties:SCHEMA_REGISTRY_URL}", "value.converter.schema.registry.basic.auth.credentials.source":"USER_INFO", "value.converter.schema.registry.basic.auth.user.info": "${file:/data/confluent-cloud/server.properties:BASIC_AUTH_INFO}", ``` 3. Create `my-file-sink.properties` in the config directory, whose contents look like the following (note the security configs with `consumer.*` prefix): ```text name=my-file-sink connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector tasks.max=1 topics=page_visits file=my_file.txt ``` #### IMPORTANT You must include the following properties in the connector configuration if you are using a self-managed connector that requires an enterprise license. 
```text confluent.topic.bootstrap.servers= confluent.topic.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule \ required username="" password=""; confluent.topic.security.protocol=SASL_SSL confluent.topic.sasl.mechanism=PLAIN ``` #### IMPORTANT You must include the following properties in the connector configuration if you are using a self-managed connector that uses Reporter to write response back to Kafka (for example, the [Azure Functions Sink Connector for Confluent Platform](../../../kafka-connect-azure-functions/current/index.html) or the [Google Cloud Functions Sink Connector for Confluent Platform](../../../kafka-connect-gcp-functions/current/index.html) connector) . ```text reporter.admin.bootstrap.servers= reporter.admin.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule \ required username="" password=""; reporter.admin.security.protocol=SASL_SSL reporter.admin.sasl.mechanism=PLAIN reporter.producer.bootstrap.servers= reporter.producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule \ required username="" password=""; reporter.producer.security.protocol=SASL_SSL reporter.producer.sasl.mechanism=PLAIN ``` #### IMPORTANT You must include the following properties in the connector configuration if you are using the following connectors: ### Debezium 2 and later ```text "schema.history.internal.kafka.bootstrap.servers": "", "schema.history.internal.consumer.security.protocol": "SASL_SSL", "schema.history.internal.consumer.ssl.endpoint.identification.algorithm": "https", "schema.history.internal.consumer.sasl.mechanism": "PLAIN", "schema.history.internal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "schema.history.internal.producer.security.protocol": "SASL_SSL", "schema.history.internal.producer.ssl.endpoint.identification.algorithm": "https", "schema.history.internal.producer.sasl.mechanism": "PLAIN", "schema.history.internal.producer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";" ``` ### Debezium 1.9 and earlier ```text "database.history.kafka.bootstrap.servers": "", "database.history.consumer.security.protocol": "SASL_SSL", "database.history.consumer.ssl.endpoint.identification.algorithm": "https", "database.history.consumer.sasl.mechanism": "PLAIN", "database.history.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "database.history.producer.security.protocol": "SASL_SSL", "database.history.producer.ssl.endpoint.identification.algorithm": "https", "database.history.producer.sasl.mechanism": "PLAIN", "database.history.producer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";" ``` ### Oracle XStream CDC Source ```text "schema.history.internal.kafka.bootstrap.servers": "", "schema.history.internal.consumer.security.protocol": "SASL_SSL", "schema.history.internal.consumer.ssl.endpoint.identification.algorithm": "https", "schema.history.internal.consumer.sasl.mechanism": "PLAIN", "schema.history.internal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "schema.history.internal.producer.security.protocol": "SASL_SSL", "schema.history.internal.producer.ssl.endpoint.identification.algorithm": "https", "schema.history.internal.producer.sasl.mechanism": "PLAIN", 
"schema.history.internal.producer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", # Uncomment and include the following properties only if the connector is configured to use Kafka topics for signaling #"signal.kafka.bootstrap.servers": "", #"signal.consumer.security.protocol": "SASL_SSL", #"signal.consumer.ssl.endpoint.identification.algorithm": "https", #"signal.consumer.sasl.mechanism": "PLAIN", #"signal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";" ``` 4. Run the `connect-standalone` script with the filenames as arguments: ```bash ./bin/connect-standalone ./etc/my-connect-standalone.properties ./etc/my-file-sink.properties ``` This should start a connect worker on your machine which will consume the records produced earlier using the `ccloud` command. If you tail the contents of `my_file.txt`, it should resemble the following: ```text tail -f my_file.txt {"field1": "hello", "field2": 1} {"field1": "hello", "field2": 2} {"field1": "hello", "field2": 3} {"field1": "hello", "field2": 4} {"field1": "hello", "field2": 5} {"field1": "hello", "field2": 6} ``` ## Procedure 1. Download [Confluent Platform](https://www.confluent.io/download/) and extract the contents. 2. Create a topic named `rest-proxy-test` by using the Confluent CLI: ```bash confluent kafka topic create --partitions 4 rest-proxy-test ``` 3. Create a properties file. 1. Find the client settings for your cluster by clicking **CLI & client configuration** from the Cloud Console interface. 2. Click the **Clients** tab. 3. Click the **Java** client selection. This example uses the Java client. ![Java client configuration properties](images/ccloud-client-rest-config.png) 4. Create a properties file named `ccloud-kafka-rest.properties` where the Confluent Platform files are location. ```none cd ``` ```none touch ccloud-kafka-rest.properties ``` 5. Copy and paste the Java client configuration properties into the file. Add the `client.` prefix to each of security properties. For example: ```none # Kafka bootstrap.servers=.cloud:9092 security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule \ required username="" password=""; ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN client.bootstrap.servers=.cloud:9092 client.security.protocol=SASL_SSL client.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule \ required username="" password=""; client.ssl.endpoint.identification.algorithm=https client.sasl.mechanism=PLAIN # Confluent Cloud Schema Registry schema.registry.url= client.basic.auth.credentials.source=USER_INFO client.schema.registry.basic.auth.user.info=: ``` Producers, consumers, and the admin client share the `client.` properties. Refer to the following table to specify additional properties for the producer, consumer, or admin client. 
| Component | Prefix | Example | |--------------|-------------|-----------------------------| | Admin Client | `admin.` | admin.request.timeout.ms | | Consumer | `consumer.` | consumer.request.timeout.ms | | Producer | `producer.` | producer.acks | An example of adding these properties is shown below: ```none # Kafka bootstrap.servers=.cloud:9092 security.protocol=SASL_SSL client.security.protocol=SASL_SSL client.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; client.ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN client.sasl.mechanism=PLAIN # Confluent Cloud Schema Registry schema.registry.url= client.basic.auth.credentials.source=USER_INFO client.schema.registry.basic.auth.user.info=: # consumer only properties must be prefixed with consumer. consumer.retry.backoff.ms=600 consumer.request.timeout.ms=25000 # producer only properties must be prefixed with producer. producer.acks=1 # admin client only properties must be prefixed with admin. admin.request.timeout.ms=50000 ``` For details about how to create a Confluent Cloud API key and API secret so that you can communicate with the REST API, refer to [Create credentials to access the Kafka cluster resources](../kafka-rest/krest-qs.md#rest-api-qs-create-creds). 4. Start the REST Proxy. ```none ./bin/kafka-rest-start ccloud-kafka-rest.properties ``` 5. Make REST calls using [REST API v2](/platform/current/kafka-rest/api.html#crest_v2_api). Example request: ```none GET /topics/test HTTP/1.1 Accept: application/vnd.kafka.v2+json ``` #### Distributed worker configuration 1. Create your `my-connect-distributed-json.properties` file based on the following example. ```text bootstrap.servers= key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.json.JsonConverter value.converter.schemas.enable=false ssl.endpoint.identification.algorithm=https security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; request.timeout.ms=20000 retry.backoff.ms=500 producer.bootstrap.servers= producer.ssl.endpoint.identification.algorithm=https producer.security.protocol=SASL_SSL producer.sasl.mechanism=PLAIN producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; producer.request.timeout.ms=20000 producer.retry.backoff.ms=500 consumer.bootstrap.servers= consumer.ssl.endpoint.identification.algorithm=https consumer.security.protocol=SASL_SSL consumer.sasl.mechanism=PLAIN consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; consumer.request.timeout.ms=20000 consumer.retry.backoff.ms=500 offset.flush.interval.ms=10000 offset.storage.file.filename=/tmp/connect.offsets group.id=connect-cluster offset.storage.topic=connect-offsets offset.storage.replication.factor=3 offset.storage.partitions=3 config.storage.topic=connect-configs config.storage.replication.factor=3 status.storage.topic=connect-status status.storage.replication.factor=3 # Confluent license settings confluent.topic.bootstrap.servers= confluent.topic.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; confluent.topic.security.protocol=SASL_SSL confluent.topic.sasl.mechanism=PLAIN # Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins # (connectors, converters, 
transformations). The list should consist of top level directories that include # any combination of: # a) directories immediately containing jars with plugins and their dependencies # b) uber-jars with plugins and their dependencies # c) directories immediately containing the package directory structure of classes of plugins and their dependencies # Examples: # plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors, plugin.path=/usr/share/java,/confluent-6.0.0/share/confluent-hub-components # Enable source connectors to create topics # KIP-158 topic.creation.enable=true ``` 2. Start Kafka Connect with the following command: ```text /bin/connect-distributed my-connect-distributed-json.properties ``` ## Quick start In this quick start guide, you use the SQS Source connector to export messages from an SQS FIFO queue to a Kafka topic. Before running the quick start, ensure the following: - [Confluent Platform](/platform/current/installation/installing_cp/index.html) is installed and services are running by using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html). This quick start assumes that you are using the Confluent CLI, but standalone installations are also supported. By default ZooKeeper, Kafka, Schema Registry, Connect, and the Connect REST API are started with the `confluent local start` command. For more information, see [Quick Start for Apache Kafka using Confluent Platform (Local)](/platform/current/quickstart/ce-quickstart.html). Note that as of Confluent Platform 7.5, ZooKeeper is deprecated for new deployments. Confluent recommends KRaft mode for new deployments. - You must install the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html) and configure it by running the `aws configure` command. - Ensure the IAM user or role you configure has full access to SQS. - Create a Kafka topic called `sqs-quickstart`. 1. Create a FIFO queue by running the following command: ```bash aws sqs create-queue --queue-name sqs-source-connector-demo ``` You should see output similar to the following: ```bash { "QueueUrl": "https://queue.amazonaws.com/940887362971/sqs-source-connector-demo" } ``` 2. Add some records to the newly created queue by first creating a file called `send-message-batch.json` with the following content: ```bash [ { "Id":"FuelReport-0001-2015-09-16T140731Z", "MessageBody":"Fuel report for account 0001 on 2015-09-16 at 02:07:31 PM.", "DelaySeconds":10, "MessageAttributes":{ "SellerName":{ "DataType":"String", "StringValue":"Example Store" }, "City":{ "DataType":"String", "StringValue":"Any City" }, "Region":{ "DataType":"String", "StringValue":"WA" }, "PostalCode":{ "DataType":"String", "StringValue":"99065" }, "PricePerGallon":{ "DataType":"Number", "StringValue":"1.99" } } }, { "Id":"FuelReport-0002-2015-09-16T140930Z", "MessageBody":"Fuel report for account 0002 on 2015-09-16 at 02:09:30 PM.", "DelaySeconds":10, "MessageAttributes":{ "SellerName":{ "DataType":"String", "StringValue":"Example Fuels" }, "City":{ "DataType":"String", "StringValue":"North Town" }, "Region":{ "DataType":"String", "StringValue":"WA" }, "PostalCode":{ "DataType":"String", "StringValue":"99123" }, "PricePerGallon":{ "DataType":"Number", "StringValue":"1.87" } } } ] ``` 3. Add the records to the queue by running the following command: ```bash aws sqs send-message-batch --queue-url https://queue.amazonaws.com/940887362971/sqs-source-connector-demo --entries file://send-message-batch.json`` ``` 4. 
Load the SQS Source connector. Note that you must ensure the `sqs.url` configuration parameter points to the correct SQS URL. The `sqs.url` parameter format is: `sqs.url=https://sqs..amazonaws.com//`. For example, if the AWS CLI returns the queue URL: `https://eu-central-1.queue.amazonaws.com/829250931565/sqs-source-connector-demo`, the `sqs.url` for the SQS Source connector is `https://sqs.eu-central-1.amazonaws.com/829250931565/sqs-source-connector-demo`. ```bash confluent local load sqs-source ``` Your output should resemble: ```bash { "name": "sqs-source", "config": { "connector.class": "io.confluent.connect.sqs.source.SqsSourceConnector", "tasks.max": "1", "kafka.topic": "test-sqs-source", "sqs.url": "https://sqs.us-east-1.amazonaws.com/942288736285822/sqs-fifo-queue.fifo", "name": "sqs-source" }, "tasks": [], "type": null } ``` 5. After the connector finishes ingesting data to Kafka, check that the data is available in the Kafka topic. ```bash bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-sqs-source --from-beginning ``` You should see two records, similar to the following: ```bash { "schema":{ "type":"struct", "fields":[ { "type":"int64", "optional":false, "field":"ApproximateFirstReceiveTimestamp" }, { "type":"int32", "optional":false, "field":"ApproximateReceiveCount" }, { "type":"string", "optional":false, "field":"SenderId" }, { "type":"int64", "optional":false, "field":"SentTimestamp" }, { "type":"string", "optional":true, "field":"MessageDeduplicationId" }, { "type":"string", "optional":true, "field":"MessageGroupId" }, { "type":"string", "optional":true, "field":"SequenceNumber" }, { "type":"string", "optional":false, "field":"Body" } ], "optional":false, "version":1 }, "payload":{ "ApproximateFirstReceiveTimestamp":1563430750668, "ApproximateReceiveCount":2, "SenderId":"AIDA5WEKBZWN3QYIY7KAJ", "SentTimestamp":1563430591780, "MessageDeduplicationId":null, "MessageGroupId":null, "SequenceNumber":null, "Body":"Fuel report for account 0001 on 2015-09-16 at 02:07:31 PM." } } ``` ## Requirements From a high level, Replicator works like a consumer group with the partitions of the replicated topics from the source cluster divided between the connector’s tasks. Replicator periodically polls the source cluster for changes to the configuration of replicated topics and the number of partitions, and updates the destination cluster accordingly by creating topics or updating configuration. For this to work correctly, the following is required: * The Origin and Destination clusters must be Apache Kafka® or Confluent Platform. For version compatibility see [connector interoperability](../../installation/versions-interoperability.md#interoperability-versions-connectors) * The Replicator version must match the Kafka Connect version it is deployed on. For instance Replicator 8.1 should only be deployed to Kafka Connect 8.1. * The ACLs mentioned in [here](#replicator-security-overview) are required. * The default topic configurations in the source and destination clusters must match. In general, aside from any broker-specific settings (such as `broker.id`), you should use the same broker configuration in both clusters. * The destination Kafka cluster must have a similar capacity as the source cluster. In particular, since Replicator will preserve the replication factor of topics in the source cluster, which means that there must be at least as many brokers as the maximum replication factor used. 
If not, topic creation will fail until the destination cluster has the capacity to support the same replication factor. Note in this case, that topic creation will be retried automatically by the connector, so replication will begin as soon as the destination cluster has enough brokers. * The `dest.kafka.bootstrap.servers` destination connection setting in the Replicator properties file must be configured to use a single destination cluster, even when using multiple source clusters. For example, the figure shown at the start of this section shows two source clusters in different datacenters targeting a single *aggregate* destination cluster. Note that the aggregate destination cluster must have a similar capacity as the total of all associated source clusters. * On Confluent Platform versions 5.3.0 and later, Confluent Replicator requires the enterprise edition of [Kafka Connect](../../connect/index.md#kafka-connect). Replicator does not support the community edition of Connect. You can install the enterprise edition of Connect as part of the Confluent Platform on-premises bundle, as described in [Production Environments](../../installation/overview.md#on-prem-production) and in the [Quick Start for Confluent Platform](../../get-started/platform-quickstart.md#quickstart) (choose self-managed Confluent Platform). Demos of enterprise Connect are available at [Quick Start for Confluent Platform](../../get-started/platform-quickstart.md#quickstart) and on Docker Hub at [confluentinc/cp-server-connect](https://hub.docker.com/r/confluentinc/cp-server-connect). * The `timestamp-interceptor` for consumers supports only Java clients, as described in [Configuring the consumer for failover (timestamp preservation)](replicator-failover.md#configuring-the-consumer-for-failover). ## Resume applications after failover After a disaster event occurs, switch your Java consumer application to a different datacenter, and then it can automatically restart consuming data in the destination cluster where it left off in the origin cluster. To use this capability, configure Java consumer applications with the [Consumer Timestamps Interceptor](replicator-failover.md#configuring-the-consumer-for-failover), which is shown in this [sample code](https://github.com/confluentinc/examples/tree/latest/multi-datacenter/src/main/java/io/confluent/examples/clients/ConsumerMultiDatacenterExample.java). 1. After starting the demo (see [previous section](#start-the-services)), run the consumer to connect to the `dc1` Kafka cluster. It automatically configures the consumer group ID as `java-consumer-topic1` and uses the Consumer Timestamps Interceptor. ```bash mvn clean package mvn exec:java -Dexec.mainClass=io.confluent.examples.clients.ConsumerMultiDatacenterExample -Dexec.args="topic1 localhost:9091 http://localhost:8081 localhost:9092" ``` 2. Verify in the consumer output that it is reading data originating from both dc1 and dc2: ```bash ... key = User_1, value = {"userid": "User_1", "dc": "dc1"} key = User_9, value = {"userid": "User_9", "dc": "dc2"} key = User_6, value = {"userid": "User_6", "dc": "dc2"} ... ``` 3. Even though the consumer is consuming from dc1, there are dc2 consumer offsets committed for the consumer group `java-consumer-topic1`. Run the following command to read from the `__consumer_offsets` topic in dc2. 
```bash docker-compose exec broker-dc2 \ kafka-console-consumer \ --topic __consumer_offsets \ --bootstrap-server localhost:9092 \ --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter" | grep java-consumer ``` 4. Verify that there are committed offsets: ```bash ... [java-consumer-topic1,topic1,0]::OffsetAndMetadata(offset=1142, leaderEpoch=Optional.empty, metadata=, commitTimestamp=1547146285084, expireTimestamp=None) [java-consumer-topic1,topic1,0]::OffsetAndMetadata(offset=1146, leaderEpoch=Optional.empty, metadata=, commitTimestamp=1547146286082, expireTimestamp=None) [java-consumer-topic1,topic1,0]::OffsetAndMetadata(offset=1150, leaderEpoch=Optional.empty, metadata=, commitTimestamp=1547146287084, expireTimestamp=None) ... ``` 5. Kafka clients include any application that uses the Apache Kafka client API to connect to Kafka brokers, such as custom client code or any service that has embedded producers or consumers, such as Kafka Connect, ksqlDB, or a Kafka Streams application. Control Center uses the `_confluent-monitoring` topic to ensure that all messages are delivered and to provide statistics on throughput and latency performance. From that same topic, you can also derive which producers are writing to which topics and which consumers are reading from which topics; an example [script](https://github.com/confluentinc/examples/tree/latest/multi-datacenter/map_topics_clients.py) is provided with the repo. ```bash ./map_topics_clients.py ``` #### NOTE This script is for demo purposes only. It is not suitable for production. 6. In steady state with the Java consumer running, you should see: ```bash Reading topic _confluent-monitoring for 60 seconds...please wait __consumer_timestamps producers consumer-1 producer-10 producer-11 producer-6 producer-8 consumers replicator-dc1-to-dc2-topic1 replicator-dc1-to-dc2-topic2 replicator-dc2-to-dc1-topic1 _schemas producers connect-worker-producer-dc2 consumers replicator-dc1-to-dc2-topic1 topic1 producers connect-worker-producer-dc1 connect-worker-producer-dc2 datagen-dc1-topic1 datagen-dc2-topic1 consumers java-consumer-topic1 replicator-dc1-to-dc2-topic1 replicator-dc2-to-dc1-topic1 topic2 producers datagen-dc1-topic2 consumers replicator-dc1-to-dc2-topic2 topic2.replica producers connect-worker-producer-dc2 ``` 7. Shut down `dc1`: ```bash docker-compose stop connect-dc1 schema-registry-dc1 broker-dc1 zookeeper-dc1 ``` 8. Stop and restart the consumer to connect to the `dc2` Kafka cluster. It will still use the same consumer group ID `java-consumer-topic1` so it can resume where it left off: ```bash mvn exec:java -Dexec.mainClass=io.confluent.examples.clients.ConsumerMultiDatacenterExample -Dexec.args="topic1 localhost:9092 http://localhost:8082 localhost:9092" ``` 9. Verify that you see data sourced only from `dc2`: ```bash ... key = User_8, value = {"userid": "User_8", "dc": "dc2"} key = User_9, value = {"userid": "User_9", "dc": "dc2"} key = User_5, value = {"userid": "User_5", "dc": "dc2"} ... ``` ## Kafka Connect This section describes how to enable security for Kafka Connect. Securing Kafka Connect requires that you configure security for: 1. Kafka Connect workers: part of the Kafka Connect API, a worker is essentially an advanced client under the covers 2. Kafka Connect connectors: connectors may have embedded producers or consumers, so you must override the default configurations for Connect producers used with source connectors and Connect consumers used with sink connectors 3. 
Kafka Connect REST: Kafka Connect exposes a REST API that can be configured to use TLS/SSL using [additional properties](../../protect-data/encrypt-tls.md#encryption-ssl-rest). Configure security for Kafka Connect as described in the section below. Additionally, if you are using Confluent Control Center streams monitoring for Kafka Connect, configure security for: * [Confluent Metrics Reporter](#authentication-ssl-metrics-reporter) Configure the top-level settings in the Connect workers to use TLS by adding these properties in `connect-distributed.properties`. These top-level settings are used by the Connect worker for group coordination and to read and write to the internal topics that are used to track the cluster’s state (for example, configs and offsets). The assumption here is that client authentication is required by the brokers. ```bash bootstrap.servers=kafka1:9093 security.protocol=SSL ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks ssl.truststore.password=test1234 ssl.keystore.location=/var/private/ssl/kafka.client.keystore.jks ssl.keystore.password=test1234 ssl.key.password=test1234 ``` Connect workers manage the producers used by source connectors and the consumers used by sink connectors. So, for the connectors to leverage security, you must also override the default producer and consumer configurations that the worker uses. The assumption here is that client authentication is required by the brokers. * For source connectors: configure the same properties, adding the `producer` prefix. ```bash producer.bootstrap.servers=kafka1:9093 producer.security.protocol=SSL producer.ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks producer.ssl.truststore.password=test1234 producer.ssl.keystore.location=/var/private/ssl/kafka.client.keystore.jks producer.ssl.keystore.password=test1234 producer.ssl.key.password=test1234 ``` * For sink connectors: configure the same properties, adding the `consumer` prefix. ```bash consumer.bootstrap.servers=kafka1:9093 consumer.security.protocol=SSL consumer.ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks consumer.ssl.truststore.password=test1234 consumer.ssl.keystore.location=/var/private/ssl/kafka.client.keystore.jks consumer.ssl.keystore.password=test1234 consumer.ssl.key.password=test1234 ``` ## 1 - Establish trust between the IdP and Confluent Platform Use the procedure below to ensure that you configure your IdP correctly. Under each step, select the tab that corresponds to your IdP for details. 1. Create an OIDC client application configured with an authorization code grant type in your IdP. The following tabs contain provider-specific configuration instructions: ### Okta In the Okta documentation, complete [Create OIDC app integrations](https://help.okta.com/en-us/content/topics/apps/apps_app_integration_wizard_oidc.htm). ### Keycloak > In the Keycloak documentation, complete [Managing OpenID Connect clients](https://www.keycloak.org/docs/latest/server_admin/index.html#oidc-clients). ### Microsoft Entra ID In the Microsoft Azure documentation, complete [Quickstart: Register an application with the Microsoft identity platform](https://docs.microsoft.com/en-us/azure/active-directory/develop/quickstart-register-app). #### WARNING Microsoft Entra ID users must create a separate registered application for OAuth. Do not combine OAuth with SAML in a single application configuration. Such a dual configuration can lead to issues with single sign-on (SSO) and other authentication flows. 2. 
Add a redirect (callback) URL to Confluent Control Center on Confluent Platform in the client application. The URL should follow this format: ```html https://<host>:<port>/api/metadata/security/1.0/oidc/authorization-code/callback ``` 3. Enable identity tokens. Identity tokens are enabled by default when you create an OIDC application in your IdP. ### Okta > Creating the authorization server by default enables identity (ID) tokens. For more information, see [ID tokens](https://developer.okta.com/docs/reference/api/oidc/#id-token). ### Keycloak > For more information, see [Server Administration Guide](https://www.keycloak.org/docs/latest/server_admin/) in the > Keycloak documentation. ### Microsoft Entra ID For more information on enabling identity tokens, see [ID tokens in the Microsoft identity platform](https://learn.microsoft.com/en-us/azure/active-directory/develop/id-tokens) in the Azure documentation. 4. Enable refresh tokens. ### Okta > Check the **Refresh Token** option in the **Grant type** section of > the **Applications** page. ### Keycloak > Refresh tokens are enabled by default. For more information, see > [Authorization Code Flow](https://www.keycloak.org/docs/latest/server_admin/#_oidc-auth-flows-authorization) > in the Keycloak documentation. ### Microsoft Entra ID See [Refresh tokens](https://docs.microsoft.com/en-us/azure/active-directory/develop/refresh-tokens) in the Microsoft Azure documentation. 5. Include group claims in the ID tokens. Following are some details and links to help you get started. ### Okta > 1. Navigate to your authorization server under **Security** > **API**. > 2. Go to **Claims** and configure a claim for groups. > 3. [Add a Groups claim for the org authorization server](https://developer.okta.com/docs/guides/customize-tokens-groups-claim/main/#request-a-token-that-contains-the-custom-claim). ### Keycloak > Configure a new **Group Membership mapper**. Make sure to have **Full > group path** enabled. ### Microsoft Entra ID To add group claims, navigate to **App registrations** > **Token configuration** and follow the instructions in [Configuring group claims and app roles in tokens](https://learn.microsoft.com/en-us/security/zero-trust/develop/configure-tokens-group-claims-app-roles) in the Microsoft Azure documentation. 6. Assign users to the client application in your IdP. If you are using groups to control access to Confluent Control Center, you will assign users to the groups in the group configuration steps below. 7. Get the IdP endpoints. You can use the OpenID provider configuration response to get the identity provider endpoints required to fetch, authorize, and verify tokens: * Token endpoint URL (`token_endpoint`) * Authorization endpoint URL (`authorization_endpoint`) * JSON Web Key Set (JWKS) URL (`jwks_uri`) * Issuer URL (`issuer`) Use the OIDC metadata discovery URI listed below for your IdP to get these endpoints and save them for later use: ### Okta > ```html > https://<okta-domain>/oauth2/default/.well-known/openid-configuration > ``` > For more information, see [/.well-known/openid-configuration [Okta > documentation]](https://developer.okta.com/docs/reference/api/oidc/#well-known-openid-configuration). ### Keycloak > ```html > https://<keycloak-host>/realms/<realm>/.well-known/openid-configuration > ``` > For more information, see [Using OpenID Connect to secure applications > and services](https://www.keycloak.org/docs/latest/securing_apps/#_oidc). 
### Microsoft Entra ID ```html https://login.microsoftonline.com/<tenant-id>/v2.0/.well-known/openid-configuration ``` For more information, see [OpenID Connect authentication with Azure Active Directory](https://learn.microsoft.com/en-us/azure/active-directory/architecture/auth-oidc#implement-oidc-with-azure-ad) and [OpenID Connect on the Microsoft identity platform](https://learn.microsoft.com/en-us/azure/active-directory/develop/v2-protocols-oidc). 8. Get the client credential details. From the client application you created in the IdP, get the following client credentials and save them for later use: * Client ID (`client_id`) * Client secret (`client_secret`) 9. Configure IdP client credentials and endpoints. On each Confluent Server broker node, add or update the following parameters in the Confluent Platform broker configuration file using the endpoints obtained in step 7 above. ```properties confluent.oidc.idp.issuer=<issuer-url> confluent.oidc.idp.jwks.endpoint.uri=<jwks-endpoint-url> confluent.oidc.idp.authorize.base.endpoint.uri=<authorization-endpoint-url> confluent.oidc.idp.token.base.endpoint.uri=<token-endpoint-url> confluent.oidc.idp.client.id=<client-id> confluent.oidc.idp.client.secret=<client-secret> ``` 10. Configure groups in your IdP. In Confluent Platform, you can use group authorization to control user access to any resource, for example, a Confluent Platform cluster or a topic. To support this behavior, you must create groups and assign users to them in your IdP. 11. Add the following `confluent.oidc.idp.groups.claim.name` parameter to the Confluent Platform broker configuration file on each Confluent Server broker. ```properties confluent.oidc.idp.groups.claim.name=groups ``` The `confluent.oidc.idp.groups.claim.name` parameter is required and must match the name of the groups claim configured in your IdP. The default value is `groups`. If the values do not match, problems occur during authorization. 12. For KRaft clusters, you must add the following parameter to each Confluent Platform controller configuration file for the listener that is used for inter-broker communication. ```properties listener.name.${listenerName}.principal.builder.class=io.confluent.kafka.security.authenticator.OAuthKafkaPrincipalBuilder ``` The `io.confluent.kafka.security.authenticator.OAuthKafkaPrincipalBuilder` parameter enables administration requests to process the group extraction logic. Without this parameter, group-based authorization does not work. In a Confluent Platform cluster, the `DefaultPrincipalBuilder` creates a `KafkaPrincipal` that does not include groups. This becomes evident during interactions between Confluent Control Center and Confluent Server brokers. For example, when using the `KafkaAdminClient` to retrieve topics, the `DefaultPrincipalBuilder` produces the `KafkaPrincipal`. In contrast, the `OAuthKafkaPrincipalBuilder` uses an `OAuthBearer` token to generate a `KafkaPrincipal` that also incorporates groups, allowing them to be passed to the authorizer where necessary. ## Security and connection requirements All connections to Confluent Cloud are encrypted using [Transport Layer Security (TLS)](../security/encrypt/tls.md#manage-data-in-transit-with-tls) and require specific client configurations for successful connection. For comprehensive information about TLS encryption, SNI requirements, certificate management, and client prerequisites, see [Client Configuration Properties](client-configs.md#client-producer-consumer-config-recs-cc). 
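As a rough illustration only (not the authoritative settings, which are in the linked client configuration reference), a TLS-encrypted, SASL-authenticated client connection is typically configured along these lines; the bootstrap endpoint, API key, and API secret placeholders are assumptions you would replace with your own values:

```properties
# Minimal sketch of a TLS (SASL_SSL) client configuration; all values are placeholders.
bootstrap.servers=<bootstrap-endpoint>:9092
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username='<api-key>' \
  password='<api-secret>';
```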
You can use Apache Kafka® clients to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even in the presence of network problems or machine failures. The Kafka client library provides functions, classes, and utilities that you can use to create Kafka [producer](../_glossary.md#term-producer) clients and [consumer](../_glossary.md#term-consumer) clients using your choice of programming languages. The primary way to build production-ready producers and consumers is by using a programming language and a Kafka client library. The official Confluent-supported clients are: * Java: The official Java client library supports the producer, consumer, Streams, and Connect APIs. * [librdkafka](https://docs.confluent.io/platform/current/clients/librdkafka/html/md_INTRODUCTION.html): librdkafka and the following derived client libraries support only the admin, producer, and consumer APIs. * C/C++ * Python * Go * .NET * JavaScript When you use the official Confluent-supported clients, you get the same enterprise-level support that you get with the rest of Confluent Platform: * Confluent-provided clients follow the Confluent Platform release cycle, as opposed to the Kafka release cycle. * Confluent Platform maintenance fixes are provided for 2-3 years (2 years with Standard Support and 3 years with Platinum Support) after the initial release of a minor version. Additional open-source and community-developed Kafka client libraries are available for other programming languages, including Scala, Ruby, Rust, PHP, and Elixir. The core APIs in the Kafka client library are: * Producer API: This API provides classes and methods for creating and sending messages to Kafka topics. It allows developers to specify message payloads, keys, and metadata and to control message delivery and acknowledgment. * Consumer API: This API provides classes and methods for consuming messages from Kafka topics. It allows developers to subscribe to one or more topics, receive messages in batches or individually, and process messages using custom logic. * Streams API: This API provides a high-level abstraction for building real-time data processing applications that consume, transform, and produce data streams from Kafka topics. * Connector API: This API provides a framework for building connectors that can transfer data between Kafka topics and external data systems, such as databases, message queues, and cloud storage services. * Admin API: This API provides functions for managing Kafka topics, partitions, and configurations. It allows developers to create, delete, and update topics and retrieve metadata about Kafka clusters and brokers. In addition to these core APIs, the Kafka client library includes various tools and utilities for configuring and monitoring Kafka clients and clusters, handling errors and exceptions, and optimizing client performance and scalability. 
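To make the producer and consumer APIs above concrete, the following is a minimal Java sketch that sends one record and reads it back; the broker address (`localhost:9092`), topic name, and group ID are illustrative assumptions rather than values taken from this documentation:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerConsumerSketch {
    public static void main(String[] args) throws Exception {
        // Producer configuration; "localhost:9092" is an assumed local broker.
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Producer API: send one record and wait for the broker acknowledgment.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>("example-topic", "key-1", "hello")).get();
        }

        // Consumer configuration for a hypothetical consumer group.
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "example-group");
        consumerProps.put("auto.offset.reset", "earliest");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        // Consumer API: subscribe to the topic and poll for records.
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of("example-topic"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("key=%s value=%s%n", record.key(), record.value());
            }
        }
    }
}
```

The Admin and Streams APIs follow the same general pattern of building a `Properties` configuration and passing it to the corresponding client object.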
# Configure Clients * [Consumer](consumer.md) * [A Quick Consumer Review](consumer.md#a-quick-consumer-review) * [Consumer groups](consumer.md#consumer-groups) * [Groups and rebalance protocols](consumer.md#groups-and-rebalance-protocols) * [Overview of the rebalance protocols](consumer.md#overview-of-the-rebalance-protocols) * [How a group’s rebalance protocol is determined](consumer.md#how-a-group-s-rebalance-protocol-is-determined) * [Upgrading or switching consumer protocols](consumer.md#upgrading-or-switching-consumer-protocols) * [Before migrating to the consumer rebalance protocol](consumer.md#before-migrating-to-the-consumer-rebalance-protocol) * [How to do a rolling deployment](consumer.md#how-to-do-a-rolling-deployment) * [How to do an empty group restart](consumer.md#how-to-do-an-empty-group-restart) * [Offset management](consumer.md#offset-management) * [Committing offsets and reset policy](consumer.md#committing-offsets-and-reset-policy) * [Auto-commit offsets](consumer.md#auto-commit-offsets) * [Manual commit API](consumer.md#manual-commit-api) * [Asynchronous commits](consumer.md#asynchronous-commits) * [Dealing with commit failures and rebalances](consumer.md#dealing-with-commit-failures-and-rebalances) * [Sync vs async: safety and performance tradeoffs](consumer.md#sync-vs-async-safety-and-performance-tradeoffs) * [Coordinating offset commits with external systems](consumer.md#coordinating-offset-commits-with-external-systems) * [Exactly-once processing and transactions](consumer.md#exactly-once-processing-and-transactions) * [Kafka consumer configuration](consumer.md#ak-consumer-configuration) * [Core configuration properties](consumer.md#core-configuration-properties) * [Consumer rebalance protocol configuration](consumer.md#consumer-rebalance-protocol-configuration) * [Configure a consumer for classic rebalance protocol](consumer.md#configure-a-consumer-for-classic-rebalance-protocol) * [Configure partition assignment](consumer.md#configure-partition-assignment) * [For the new consumer rebalance protocol](consumer.md#for-the-new-consumer-rebalance-protocol) * [For the classic rebalance protocol](consumer.md#for-the-classic-rebalance-protocol) * [Message handling](consumer.md#message-handling) * [Kafka consumer group tool](consumer.md#ak-consumer-group-tool) * [List consumer groups](consumer.md#list-consumer-groups) * [Describe groups](consumer.md#describe-groups) * [Reset offsets](consumer.md#reset-offsets) * [Consumer examples](consumer.md#consumer-examples) * [Related content](consumer.md#related-content) * [Share Consumers](sharegroups.md) * [Key capabilities and differences from consumer groups](sharegroups.md#key-capabilities-and-differences-from-consumer-groups) * [Configuration for client developers](sharegroups.md#configuration-for-client-developers) * [Manage share groups using the kafka-share-groups tool](sharegroups.md#manage-share-groups-using-the-kafka-share-groups-tool) * [Monitoring Share Groups](sharegroups.md#monitoring-share-groups) * [Using the KafkaShareConsumer for Share Groups](sharegroups.md#using-the-kafkashareconsumer-for-share-groups) * [Configuration](sharegroups.md#configuration) * [Subscribing to Topics](sharegroups.md#subscribing-to-topics) * [Polling for Records and Liveness](sharegroups.md#polling-for-records-and-liveness) * [Record Delivery and Acknowledgement](sharegroups.md#record-delivery-and-acknowledgement) * [Implicit Acknowledgement](sharegroups.md#implicit-acknowledgement) * [Explicit 
Acknowledgement](sharegroups.md#explicit-acknowledgement) * [Multithreaded Processing](sharegroups.md#multithreaded-processing) * [Transactional Records and Isolation Level](sharegroups.md#transactional-records-and-isolation-level) * [Producer](producer.md) * [Kafka Producer Configuration](producer.md#ak-producer-configuration) * [Core Configuration](producer.md#core-configuration) * [Message Durability](producer.md#message-durability) * [Message Ordering](producer.md#message-ordering) * [Batching and Compression](producer.md#batching-and-compression) * [Queuing Limit](producer.md#queuing-limit) * [Producer examples](producer.md#producer-examples) * [Learn More](producer.md#learn-more) * [Configuration Properties](client-configs.md) * [Recommendations](client-configs.md#recommendations) * [Transport Layer Security (TLS) and Connection Requirements](client-configs.md#transport-layer-security-tls-and-connection-requirements) * [TLS SNI extension requirement](client-configs.md#tls-sni-extension-requirement) * [Manage TLS certificates](client-configs.md#manage-tls-certificates) * [Client Prerequisites and Version Requirements](client-configs.md#client-prerequisites-and-version-requirements) * [JVM settings for Java clients](client-configs.md#jvm-settings-for-java-clients) * [Cluster upgrades and error handling](client-configs.md#cluster-upgrades-and-error-handling) * [Client configuration properties](client-configs.md#client-configuration-properties) * [Why tuning client configurations is important](client-configs.md#why-tuning-client-configurations-is-important) * [Configuration categories](client-configs.md#configuration-categories) * [Configuration properties](client-configs.md#configuration-properties) * [Before you modify properties](client-configs.md#before-you-modify-properties) * [Common properties](client-configs.md#common-properties) * [Producer properties](client-configs.md#producer-properties) * [Consumer properties](client-configs.md#consumer-properties) * [OpenId Connect (OIDC) and token retry behavior](client-configs.md#openid-connect-oidc-and-token-retry-behavior) * [Java Client](client-configs.md#java) * [Schema Registry Java Client](client-configs.md#sr-java) * [JavaScript Client for Kafka](client-configs.md#nodejs-for-ak) * [Schema Registry JavaScript Client](client-configs.md#sr-nodejs) * [librdkafka derived (non-Java) clients](client-configs.md#librdkafka-derived-non-java-clients) ### Confluent for Kubernetes For details about the supported Kubernetes environments, refer to [Confluent for Kubernetes Supported Environments](https://docs.confluent.io/operator/current/co-plan.html#supported-environments-and-prerequisites). The following table summarizes the Confluent Platform features supported with Confluent for Kubernetes. 
| Confluent Platform 8.1 Feature | Availability in CFK 3.00\* |
|---------------------------------------|--------------------------------------|
| Kafka Broker | Available, only via Confluent Server |
| Schema Registry | Available |
| REST Proxy | Available |
| ksqlDB | Available |
| Connect | Available |
| Control Center | Available |
| Replicator | Available |
| Security: Role-based Access Control | Available [2] |
| Security: Authentication | Available [3] |
| Security: Network Encryption | Available |
| Structured Audit Logs | Available [4] |
| MDS-based Access Control Lists (ACLs) | Available |
| Secrets Protection | Available [5] |
| Schema Validation | Available |
| FIPS | Available |
| Multi-region Clusters | Available |
| Tiered Storage | Available |
| Self-Balancing Clusters | Available |
| Auto Data Balancer | Use Self-Balancing Clusters |
| Confluent REST API | Available |
| Cluster Registry | Not Available |
| Cluster Linking | Available |
| Health+ | Available |

- [1] Confluent Control Center is a separate download. See [Installation](/control-center/current/installation/overview.html).
- [2] Only available for new installations.
- [3] Supports SASL/Plain and mTLS for Kafka authentication. Does not support Kerberos or SASL/Scram.
- [4] Supported through [Kafka configuration overrides](https://docs.confluent.io/operator/current/co-configure.html). See [Use Properties Files to Configure Audit Logs in Confluent Platform](../security/compliance/audit-logs/audit-logs-properties-config.md#audit-logs-properties-config) for the properties you need to set in config overrides. Does not support centrally managed Audit Logs.
- [5] Kubernetes Secrets are integrated. CFK does not enable you to use Confluent Secret Protection.

# Metadata Service Configuration Settings

To enable the [Metadata Service](../../security/authorization/rbac/overview.md#metadata-service) (also known as the [Confluent Server Authorizer](../../security/csa-introduction.md#confluent-server-authorizer)), the broker configuration in the `server.properties` file must set `authorizer.class.name` to `io.confluent.kafka.security.authorizer.ConfluentServerAuthorizer`. To retain ACLs (that have already been enabled) and enable RBAC, set `confluent.authorizer.access.rule.providers=ZK_ACL,CONFLUENT`. For more details about how to configure RBAC, refer to [Enable RBAC for Authorization on a Running Cluster in Confluent Platform](../../security/authorization/rbac/enable-rbac-running-cluster.md#enable-rbac-running-cluster). 
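As a sketch, the two broker settings described above appear in `server.properties` as follows; whether you keep `ZK_ACL` in the provider list depends on your deployment, so treat this only as an illustration of the property names:

```properties
# Enable the Confluent Server Authorizer, which backs the Metadata Service (MDS)
authorizer.class.name=io.confluent.kafka.security.authorizer.ConfluentServerAuthorizer
# Retain previously enabled ACLs while also enabling RBAC
confluent.authorizer.access.rule.providers=ZK_ACL,CONFLUENT
```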
RBAC supports the following Kafka configurations of the Metadata Service (MDS) back end, which you can override by using the prefixes specified below: * [Topic configurations](../../installation/configuration/topic-configs.md#cp-config-topics) used for creating the security metadata topics (`confluent.metadata.topic.`) * [Administration Client configurations](../../installation/configuration/admin-configs.md#cp-config-admin) used for creating administration clients (`confluent.metadata.admin.`) * [Consumer Coordinator configurations](../../installation/configuration/consumer-configs.md#cp-config-consumer) used for creating consumers (`confluent.metadata.coordinator.`) * [Producer configurations](../../installation/configuration/producer-configs.md#cp-config-producer) used for creating producers (`confluent.metadata.producer.`) * [HTTP configurations](#https-configs-for-ssl) used for connecting to MDS over HTTPS (`confluent.metadata.server.ssl.`) * [Centralized Audit Log configurations](../../security/compliance/audit-logs/mds-config-for-centralized-audit-logs.md#mds-config-for-centralized-audit-logs) used to provide API endpoints to register a list of the Kafka clusters in an organization and to centrally manage the audit log configurations of those clusters (`confluent.security.event.logger.destination.admin.`). ## ksqlDB videos See the latest videos on Confluent Platform ksqlDB and Confluent Cloud ksqlDB at the [Confluent YouTube channel](https://www.youtube.com/channel/UCmZz-Gj3caLLzEWBtbYUXaA). | Video | Description | |-----------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------| | [Flink vs Kafka Streams/ksqlDB](https://www.youtube.com/watch?v=Wqko7MunKZs) | Jeff Bean and Matthias Sax compare stream processing tools. | | [Build a Plant Monitoring System with ksqlDB](https://www.youtube.com/watch?v=yi-KOg2LSY4) | Robin Moffatt’s quick videos about ksqlDB, based on demo scripts that you can run for yourself. | | [Apache Kafka 101: ksqlDB](https://www.youtube.com/watch?v=Da6MwowCGHo) | Tim Berglund provides a gentle introduction to ksqlDB concepts and queries. | | [Confluent Cloud Quick Start, ksqlDB, and Project Reactor (Redux)](https://www.youtube.com/watch?v=xorcbmFDwYA) | Viktor Gamov provisions Kafka, Connect, and ksqlDB clusters in Confluent Cloud and accesses them with the ksqlDB Reactor client. | | [Demo: The Event Streaming Database in Action](https://www.youtube.com/watch?v=D5QMqapzX8o) | Tim Berglund builds a movie rating system with ksqlDB to write movie records into a Kafka topic. | | [Demo: Seamless Stream Processing with Kafka Connect & ksqlDB](https://www.youtube.com/watch?v=4odZGWl-yZo) | Set up and build ksqlDB applications using the AWS source, Azure sink, and MongoDB source connectors in Confluent Cloud. | | [Introduction to ksqlDB and stream processing](https://www.youtube.com/watch?v=-kFU6mCnOFw) | Vish Srinivasan talks Kafka stream processing fundamentals and discusses ksqlDB. | | [Ask Confluent #16: ksqlDB edition](https://www.youtube.com/watch?v=SHKjuN2iXyk) | Gwen Shapira hosts Vinoth Chandar in a wide-ranging talk on ksqlDB. | | [An introduction to ksqlDB](https://www.youtube.com/watch?v=7mGBxG2NhVQ) | Robin Moffatt describes how ksqlDB helps you build scalable and fault-tolerant stream processing systems. 
| | [ksqlDB and the Kafka Connect JDBC Sink](https://www.youtube.com/watch?v=ad02yDTAZx0) | Robin Moffatt demonstrates how to use ksqlDB with the Connect JDBC sink. | | [How to Transform a Stream of Events Using ksqlDB](https://www.youtube.com/watch?v=PaHv4fGq-9k) | Viktor Gamov demonstrates how to transform a stream of movie data. | | [ksqlDB Java Client and Confluent Cloud](https://www.youtube.com/watch?v=6mBY_GL_D5g) | Viktor Gamov takes the ksqlDB Java client for a spin and tests it against Confluent Cloud. | ## Known limitations and best practices - When deleting a cluster link, first check that all mirror topics are in the `STOPPED` state. If any are in the `PENDING_STOPPED` state, deleting a cluster link can cause irrecoverable errors on those mirror topics due to a temporary limitation. - In Confluent Platform 7.1 and later, REST API calls to list and get source-initiated cluster links will have their destination cluster IDs returned under the parameter `destination_cluster_id`, or with Confluent CLI v4 as `destination_cluster`. (This is a change from previous releases, where these were returned under `source_cluster_id`.) - For Confluent Platform in general, you should not use unauthenticated listeners. For Cluster Linking, this is even more important because Cluster Linking can access the listeners. As a best practice, always configure authentication on listeners. To learn more, see the [Enable Security for a KRaft-Based Cluster in Confluent Platform](../../security/security_tutorial.md#security-tutorial), the [Authentication in Confluent Platform](../../security/authentication/overview.md#authentication-overview), and the listener configuration examples in the brokers for the various protocols such as [SASL](../../security/authentication/overview.md#kafka-sasl-auth) and [Use TLS Authentication in Confluent Platform](../../security/authentication/mutual-tls/overview.md#kafka-ssl-authentication). See also, [Manage Security for Cluster Linking on Confluent Platform](security.md#cluster-link-security). - All TLS/SSL key stores, trust stores and Kerberos keytab files must be stored at the same location on each broker in a given cluster. If not, cluster links may fail. Alternatively, you can [configure a PEM certificate in-line](https://cwiki.apache.org/confluence/display/KAFKA/KIP-651+-+Support+PEM+format+for+SSL+certificates+and+private+key) on the cluster link configuration. - Cluster link configurations stored in files (TLS/SSL key stores, trust stores, Kerberos keytab files) should not be stored in `/tmp` because `/tmp` files may get deleted, leaving links and mirrors in a bad state on some brokers. - Confluent Control Center will only display mirror topics correctly if the Confluent Platform cluster and Control Center are connected to a [REST Proxy API v3](../../kafka-rest/api.md#rest-proxy-v3). If not connected to the v3 Confluent REST API, Control Center will display mirror topics as regular topics, which can lead to showing features that are not actually available on mirror topics; for example, producing messages or editing configurations. To learn how to configure these clusters for the v3 REST API, see [Required Configurations for Control Center](configs.md#cluster-linking-configs-c3). - Prerequisites are provided per tutorial or use case because these differ depending on the context. 
Tutorials are provided on [topic data sharing](topic-data-sharing.md#tutorial-topic-data-sharing) and [Tutorial: Link Confluent Platform and Confluent Cloud Clusters](hybrid-cp.md#cluster-link-hybrid-cp). Additional requirements for secure setups are provided in [Manage Security for Cluster Linking on Confluent Platform](security.md#cluster-link-security). - Cluster Linking has not yet been fully tested to mirror topics that contain records produced using the Kafka transactions feature. Therefore, using Cluster Linking to mirror such topics is not supported and not recommended. - [Cluster Linking for Confluent Platform](#cluster-linking) between a source cluster running Confluent Platform 7.0.x or earlier (non-KRaft) and a destination cluster running in KRaft mode is not supported. Link creation may succeed, but the connection will ultimately fail (with a `SOURCE_UNAVAILABLE` error message). To work around this issue, make sure the source cluster is running Confluent Platform version 7.1.0 or later. If you have links from a Confluent Platform source cluster to a Confluent Cloud destination cluster, you must upgrade your source clusters to Confluent Platform 7.1.0 or later to avoid this issue. - ACL migration (ACL sync), previously available in Confluent Platform 6.0.0 through 6.2.x, was removed in Confluent Platform 7.0.0 due to a security vulnerability, then re-introduced in Confluent Platform 7.1.0 with the vulnerability resolved. If you are using ACL migration in your pre-7.1.0 deployments, you should disable it or upgrade to 7.1.x. To learn more, see [Authorization (ACLs)](security.md#cluster-link-acls). - Any customer-owned firewall that allows the cluster link connection from source cluster brokers to destination cluster brokers must allow the TCP connection to persist in order for Cluster Linking to work. - Prefixing is not supported in 7.1.0. For more information, see the note at the top of this section: [Prefix Mirror Topics and Consumer Group Names](mirror-topics-cp.md#cluster-link-prefix-concepts). - Cluster Linking cannot replicate messages that use the v0 or v1 message format from the earliest versions of Kafka. Cluster Linking can replicate messages in the v2 format (introduced in Apache Kafka® v 0.11) and later. If Cluster Linking encounters a message with the v0 or v1 format, it will fail that mirror topic; that is, it will transition to a FAILED state and stop replication for that topic. To replicate a topic that contains messages in the v0 or v1 format, either begin replication for that topic after the last message in the v0 or v1 format, using the cluster link configuration `mirror.start.offset.spec`, or use [Confluent Replicator](../replicator/index.md#replicator-detail) to replicate topics and messages. - An issue exists where [consumer group offsets](mirror-topics-cp.md#mirror-topics-consumer-offsets) that are deleted on the destination cluster (especially auto-deleted) persist, instead of being removed as expected. (Under the hood, the offsets are being re-replicated to the destination before retention settings delete the offsets from source. This results in extended retention of inactive consumer group offsets.) To prevent this from happening, you can extend retention on the destination to make sure data is deleted on the source before it is deleted on the destination. To do this, increase `offsets.retention.minutes` on destination cluster by at least double `offsets.retention.check.interval.ms`. 
- Cluster Linking does not support the use of a proxy for authentication to the cluster. For supported security configurations, see [Manage Security for Cluster Linking on Confluent Platform](security.md#cluster-link-security). ### Other Kafka Clients The objective of this tutorial is to learn about Avro and Schema Registry centralized schema management and compatibility checks. To keep examples simple, this tutorial focuses on Java producers and consumers, but other Kafka clients work in similar ways. For examples of other Kafka clients interoperating with Avro and Schema Registry: * [Other client languages](/platform/current/clients/index.html#kafka-clients) * [Configure ksqlDB for Avro](/platform/current/ksqldb/operate-and-deploy/installation/avro-schema.html) * [Kafka Streams](/platform/current/streams/developer-guide/datatypes.html#streams-data-avro) * [Kafka Connect](/platform/current/schema-registry/connect.html#schemaregistry-kafka-connect) * [Confluent REST Proxy](/platform/current/kafka-rest/api.html#post-topic-string-avro) ## Features Confluent Schema Registry currently supports all Kafka security features, including: * Encryption * [TLS/SSL encryption](../../security/protect-data/encrypt-tls.md#encryption-ssl-schema-registry) with a secure Kafka cluster * [End-user REST API calls over HTTPS](#schema-registry-http-https) * Authentication * [Open Authentication (OAuth)](oauth-schema-registry.md#schemaregistry-oauth) for Schema Registry server * [TLS/SSL authentication](../../security/authentication/mutual-tls/overview.md#authentication-ssl-schema-registry) with a secure Kafka Cluster * [SASL authentication](../../security/authentication/overview.md#kafka-sasl-auth) with a secure Kafka Cluster * Jetty authentication as described in [Role-Based Access Control](rbac-schema-registry.md#schemaregistry-rbac) steps * Authorization (provided through the [Schema Registry Security Plugin for Confluent Platform](../../confluent-security-plugins/schema-registry/introduction.md#confluentsecurityplugins-schema-registry-security-plugin)) * [Role-Based Access Control](rbac-schema-registry.md#schemaregistry-rbac) * [Schema Registry ACL Authorizer for Confluent Platform](../../confluent-security-plugins/schema-registry/authorization/sracl_authorizer.md#confluentsecurityplugins-sracl-authorizer) * [Schema Registry Topic ACL Authorizer for Confluent Platform](../../confluent-security-plugins/schema-registry/authorization/topicacl_authorizer.md#confluentsecurityplugins-topicacl-authorizer) * [Schema Registry Authorization (reference of supported operations and resource URIs)](../../confluent-security-plugins/schema-registry/authorization/index.md#confluentsecurityplugins-schema-registry-authorization) For configuration details, check the [configuration options](../installation/config.md#schemaregistry-config). # Secure Deployment for Kafka Streams in Confluent Platform Kafka Streams natively integrates with the Apache Kafka® [security features](../../security/overview.md#security) and supports all of the client-side security features in Kafka. Kafka Streams leverages the [Java Producer and Consumer API](../../clients/overview.md#kafka-clients). To secure your Stream processing applications, configure the security settings in the corresponding Kafka producer and consumer clients, and then specify the corresponding configuration settings in your Kafka Streams application. 
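As an illustration of that pattern, here is a minimal Kafka Streams sketch that passes client-side TLS settings (mirroring the truststore values used earlier for Connect workers) through the Streams configuration; the application ID, topic names, and broker address are assumptions for the example only:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class SecureStreamsApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "secure-streams-app");   // hypothetical application ID
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka1:9093");       // assumed TLS listener
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // Security settings are forwarded to the embedded producer and consumer clients.
        props.put("security.protocol", "SSL");
        props.put("ssl.truststore.location", "/var/private/ssl/kafka.client.truststore.jks");
        props.put("ssl.truststore.password", "test1234");

        // Trivial topology: copy records from one hypothetical topic to another.
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Because Kafka Streams forwards these client-level `security.protocol` and `ssl.*` settings to its embedded producer and consumer, the same values used for other Java clients generally apply here unchanged.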
Kafka supports cluster encryption and authentication, including a mix of authenticated and unauthenticated, and encrypted and non-encrypted clients. Using security is optional. Here are a few relevant client-side security features: Encrypt data in transit between your applications and Kafka brokers: You can enable the encryption of the client-server communication between your applications and the Kafka brokers. For example, you can configure your applications to always use encryption when reading and writing data to and from Kafka. This is critical when reading and writing data across security domains such as internal network, public internet, and partner networks. Client authentication: You can enable client authentication for connections from your application to Kafka brokers. For example, you can define that only specific applications are allowed to connect to your Kafka cluster. Client authorization: You can enable client authorization of read and write operations by your applications. For example, you can define that only specific applications are allowed to read from a Kafka topic. You can also restrict write access to Kafka topics to prevent data pollution or fraudulent activities. For more information about the security features in Kafka, see [Kafka Security](../../security/overview.md#security) and the blog post [Apache Kafka Security 101](http://www.confluent.io/blog/apache-kafka-security-authorization-authentication-encryption). ## Quick Start This quick start uses the FTPS Sink connector to export data produced by the Avro console producer to an FTPS directory. 1. Start all the necessary services using the Confluent CLI. ```bash confluent local start ``` Every service will start in order, printing a message with its status: ```bash Starting Zookeeper Zookeeper is [UP] Starting Kafka Kafka is [UP] Starting Schema Registry Schema Registry is [UP] Starting Kafka REST Kafka REST is [UP] Starting Connect Connect is [UP] Starting KSQL Server KSQL Server is [UP] Starting Control Center Control Center is [UP] ``` 2. Next, start the Avro console producer to import a few records to Kafka: ```bash ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test_ftps_sink \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}' ``` 3. In the console producer, enter the following: ```bash {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} ``` The three records entered are published to the Kafka topic `test_ftps_sink` in Avro format. 4. Configure your connector by first creating a `.properties` file named `quickstart-ftps.properties` with the following properties. ```bash # substitute <> with your information name=FTPSConnector connector.class=io.confluent.connect.ftps.FtpsSinkConnector key.converter=io.confluent.connect.avro.AvroConverter key.converter.schema.registry.url=http://localhost:8081 value.converter=io.confluent.connect.avro.AvroConverter value.converter.schema.registry.url=http://localhost:8081 tasks.max=3 topics=test_ftps_sink confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 confluent.license= format.class=io.confluent.connect.ftps.sink.format.avro.AvroFormat flush.size=100 ftps.host= ftps.port= ftps.username= ftps.password= ftps.working.dir= ftps.ssl.key.password= ftps.ssl.keystore.location= ftps.ssl.keystore.password= ftps.ssl.truststore.location= ftps.ssl.truststore.password= ``` 5. 
Start the connector by loading its configuration: ```bash confluent local load ftps-sink --config etc/kafka-connect-ftps/quickstart-ftps.properties ``` 6. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status FTPSConnector ``` 7. After some time, check that the data is available in the FTPS working directory. You should see a file named `/test_ftps_sink/partition=0/test_ftps_sink+0+0000000000.avro`. The file name is encoded as `topic+kafkaPartition+startOffset+endOffset.format`. To extract the content of the file, you can use `avro-tools-1.8.2.jar` (available in the [Apache Archives](https://archive.apache.org/dist/avro/avro-1.8.2/java/)). 8. Move `avro-tools-1.8.2.jar` to the FTPS working directory and run the following command: ```bash java -jar avro-tools-1.8.2.jar tojson <ftps-working-directory>/test_ftps_sink/partition=0/test_ftps_sink+0+0000000000.avro ``` You should see the following output: ```bash {"f1":"value1"} {"f1":"value2"} {"f1":"value3"} ``` ## Quick start This quick start uses the HDFS connector to export data produced by the Avro console producer to HDFS and assumes the following: - You have started the required services with the default configurations; make any necessary changes according to the actual configurations used. - Security is not configured for HDFS and Hive metastore. To make the necessary security configurations, see [Secure HDFS and Hive metastore](#secure-hdfs-hive-metastore). Before you start Confluent Platform, make sure Hadoop is running locally or remotely and that you know the HDFS URL. For Hive integration, you need to have Hive installed and to know the metastore thrift URI. You also need to ensure the connector user has write access to the directories specified in `topics.dir` and `logs.dir`. The default value of `topics.dir` is `/topics` and the default value of `logs.dir` is `/logs`. If you don’t specify these two configurations, make sure that the connector user has write access to `/topics` and `/logs`. You may need to create `/topics` and `/logs` before running the connector because the connector usually doesn’t have write access to `/`. Complete the following steps: 1. Start all the necessary services using the Confluent CLI. If not already in your PATH, add Confluent’s `bin` directory by running: `export PATH=<path-to-confluent>/bin:$PATH` ```bash confluent local start ``` Every service will start in order, printing a message with its status: ```bash Starting Zookeeper Zookeeper is [UP] Starting Kafka Kafka is [UP] Starting Schema Registry Schema Registry is [UP] Starting Kafka REST Kafka REST is [UP] Starting Connect Connect is [UP] Starting KSQL Server KSQL Server is [UP] Starting Control Center Control Center is [UP] ``` 2. Start the Avro console producer to import a few records to Kafka: ```bash ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test_hdfs \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}' ``` 3. In the console producer, enter the following: ```bash {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} ``` The three records entered are published to the Kafka topic `test_hdfs` in Avro format. 4. Before starting the connector, ensure the configurations in `etc/kafka-connect-hdfs/quickstart-hdfs.properties` are properly set to your configurations of Hadoop (for example, ensure `hdfs.url` points to the proper HDFS and uses the FQDN of the host). Then start the connector by loading its configuration with the following command. 
Note that you must include a double dash (`--`) between the topic name and your flag. For more information, see [this post](https://unix.stackexchange.com/questions/11376/what-does-double-dash-mean-also-known-as-bare-double-dash). ```bash confluent local load hdfs-sink --config etc/kafka-connect-hdfs/quickstart-hdfs.properties { "name": "hdfs-sink", "config": { "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector", "tasks.max": "1", "topics": "test_hdfs", "hdfs.url": "hdfs://localhost:9000", "flush.size": "3", "name": "hdfs-sink" }, "tasks": [] } ``` 5. Verify the connector started successfully by viewing the Connect worker’s log: ```bash confluent local services connect log ``` Towards the end of the log you should see that the connector starts, logs a few messages, and then exports data from Kafka to HDFS. Once the connector finishes ingesting data to HDFS, check that the data is available in HDFS: ```bash hadoop fs -ls /topics/test_hdfs/partition=0 ``` You should see a file with the name `/topics/test_hdfs/partition=0/test_hdfs+0+0000000000+0000000002.avro`. The file name is encoded as `topic+kafkaPartition+startOffset+endOffset.format`. You can use `avro-tools-1.8.2.jar` (available in [Apache mirrors](https://archive.apache.org/dist/avro/avro-1.8.2/java/avro-tools-1.8.2.jar)) to extract the content of the file. Run `avro-tools` directly on Hadoop as: ```bash hadoop jar avro-tools-1.8.2.jar tojson \ hdfs://<namenode>/topics/test_hdfs/partition=0/test_hdfs+0+0000000000+0000000002.avro ``` where `<namenode>` is the HDFS name node hostname. Or, if you experience issues, first copy the Avro file from HDFS to the local filesystem and try again with Java: ```bash hadoop fs -copyToLocal /topics/test_hdfs/partition=0/test_hdfs+0+0000000000+0000000002.avro \ /tmp/test_hdfs+0+0000000000+0000000002.avro java -jar avro-tools-1.8.2.jar tojson /tmp/test_hdfs+0+0000000000+0000000002.avro ``` You should see the following output: ```bash {"f1":"value1"} {"f1":"value2"} {"f1":"value3"} ``` 6. 
Stop the Kafka Connect worker as well as all the rest of Confluent Platform by running: ```bash confluent local stop ``` Your output should resemble: ```none Stopping Control Center Control Center is [DOWN] Stopping KSQL Server KSQL Server is [DOWN] Stopping Connect Connect is [DOWN] Stopping Kafka REST Kafka REST is [DOWN] Stopping Schema Registry Schema Registry is [DOWN] Stopping Kafka Kafka is [DOWN] Stopping Zookeeper Zookeeper is [DOWN] ``` You may also stop all the services and wipe out any data generated during this quick start by running the following command: ```bash confluent local destroy ``` Your output should resemble: ```bash Stopping Control Center Control Center is [DOWN] Stopping KSQL Server KSQL Server is [DOWN] Stopping Connect Connect is [DOWN] Stopping Kafka REST Kafka REST is [DOWN] Stopping Schema Registry Schema Registry is [DOWN] Stopping Kafka Kafka is [DOWN] Stopping Zookeeper Zookeeper is [DOWN] Deleting: /var/folders/ty/rqbqmjv54rg_v10ykmrgd1_80000gp/T/confluent.PkQpsKfE ``` Note that if you want to run the quick start with Hive integration, before starting the connector, you need to add the following configurations to `etc/kafka-connect-hdfs/quickstart-hdfs.properties`: ```text hive.integration=true hive.metastore.uris=thrift uri to your Hive metastore schema.compatibility=BACKWARD ``` After the connector finishes ingesting data to HDFS, you can use Hive to check the data: ```text $hive>SELECT * FROM test_hdfs; ``` If you leave the `hive.metastore.uris` empty, an embedded Hive metastore will be created in the directory the connector is started. You need to start Hive in that specific directory to query the data. ## Quick start This quick start uses the HDFS 3 Sink connector to export data produced by the Avro console producer to HDFS. Before you start Confluent Platform, ensure the following: - Hadoop is running locally or remotely and that you know the HDFS URL. For Hive integration, you must have Hive installed and know the metastore thrift URI. - The connector user has write access to the directories specified in `topics.dir` and `logs.dir`. The default value of `topics.dir` is `/topics` and the default value of `logs.dir` is `/logs`. If you don’t specify the two configurations, ensure the connector user has write access to `/topics` and `/logs`. You may need to create `/topics` and `/logs` before running the connector, as the connector likely doesn’t have write access to `/`. This quick start assumes that you started the required services with the default configurations; you should make necessary changes according to the actual configurations used. This quick start also assumes that security is not configured for HDFS and Hive metastore. To make the necessary security configurations, see the [Secure HDFS and Hive Metastore](#hdfs3-connector) section. To get started, complete the following steps: 1. Install the connector using the following [CLI command](https://docs.confluent.io/confluent-cli/current/command-reference/connect/plugin/confluent_connect_plugin_install.html): ```bash # run from your Confluent Platform installation directory confluent connect plugin install confluentinc/kafka-connect-hdfs3:latest ``` 2. Start Confluent Platform. ```bash confluent local start ``` 3. [Produce](https://docs.confluent.io/current/cli/command-reference/confluent-produce.html) test Avro data to the `test_hdfs` topic in Kafka. 
```bash ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test_hdfs \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}' # paste each of these messages {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} ``` 4. Create an `hdfs3-sink.json` file with the following contents: ```json { "name": "hdfs3-sink", "config": { "connector.class": "io.confluent.connect.hdfs3.Hdfs3SinkConnector", "tasks.max": "1", "topics": "test_hdfs", "hdfs.url": "hdfs://localhost:9000", "flush.size": "3", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url":"http://localhost:8081", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` Note that the first few settings are common settings you’ll specify for all connectors. The `topics` parameter specifies the topics to export data from. In this case, `test_hdfs`. The HDFS connection URL, `hdfs.url`, specifies the HDFS to export data to. You should set this according to your configuration. `flush.size` specifies the number of records the connector needs to write before invoking file commits. For high availability HDFS deployments, set `hadoop.conf.dir` to a directory that includes `hdfs-site.xml` and `core-site.xml`. After `hdfs-site.xml` is in place and `hadoop.conf.dir` has been set, `hdfs.url` may be set to the namenode’s nameservice ID, such as `nameservice1`. 5. Load the HDFS 3 Sink connector. ```bash confluent local load hdfs3-sink --config hdfs3-sink.json ``` 6. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status hdfs3-sink ``` 7. Validate that the Avro data is in HDFS. ```bash # list files in partition 0 hadoop fs -ls /topics/test_hdfs/partition=0 # the following should appear in the list # /topics/test_hdfs/partition=0/test_hdfs+0+0000000000+0000000002.avro ``` The file name is encoded as `topic+kafkaPartition+startOffset+endOffset.format`. 8. Extract the contents of the file using the [avro-tools-1.8.2.jar](https://repo1.maven.org/maven2/org/apache/avro/avro-tools/1.8.2/avro-tools-1.8.2.jar). ```bash # substitute "<namenode>" for the HDFS name node hostname hadoop jar avro-tools-1.8.2.jar tojson \ hdfs://<namenode>/topics/test_hdfs/partition=0/test_hdfs+0+0000000000+0000000002.avro ``` 9. If you experience issues with the previous step, first copy the Avro file from HDFS to the local filesystem and try again with Java. ```bash hadoop fs -copyToLocal /topics/test_hdfs/partition=0/test_hdfs+0+0000000000+0000000002.avro \ /tmp/test_hdfs+0+0000000000+0000000002.avro java -jar avro-tools-1.8.2.jar tojson /tmp/test_hdfs+0+0000000000+0000000002.avro # expected output {"f1":"value1"} {"f1":"value2"} {"f1":"value3"} ``` If you want to run the quick start with Hive integration, add the following configurations to `hdfs3-sink.json`: ```text "hive.integration": "true", "hive.metastore.uris": "", "schema.compatibility": "BACKWARD" ``` After the connector finishes ingesting data to HDFS, you can use Hive to check the data: ```text beeline -e "SELECT * FROM test_hdfs;" ``` If the `hive.metastore.uris` setting is empty, an embedded Hive metastore is created in the directory the connector is started in. Start Hive in that specific directory to query the data. ## Quick Start The following uses the `S3SinkConnector` to write a file from the Kafka topic named `s3_topic` to S3. 
## Quick Start The following uses the `S3SinkConnector` to write a file from the Kafka topic named `s3_topic` to S3. Then, the `S3SourceConnector` loads that Avro file from S3 to the Kafka topic named `copy_of_s3_topic`. 1. Follow the instructions from [the S3 Sink connector quick start](https://docs.confluent.io/kafka-connect-s3-sink/current/overview.html#quick-start) to set up the data to use below. 2. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). ```bash # run from your Confluent Platform installation directory confluent connect plugin install confluentinc/kafka-connect-s3-source:latest ``` 3. Create a `quickstart-s3source.properties` file with the following contents: ```properties name=s3-source tasks.max=1 connector.class=io.confluent.connect.s3.source.S3SourceConnector s3.bucket.name=confluent-kafka-connect-s3-testing format.class=io.confluent.connect.s3.format.avro.AvroFormat confluent.license= confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 ``` 4. Edit the `quickstart-s3source.properties` file to add the following properties: ```properties transforms=AddPrefix transforms.AddPrefix.type=org.apache.kafka.connect.transforms.RegexRouter transforms.AddPrefix.regex=.* transforms.AddPrefix.replacement=copy_of_$0 ``` #### IMPORTANT Adding this transform renames the output topic of the messages to `copy_of_s3_topic`. This prevents a continuous feedback loop of messages. 5. Load the Backup and Restore S3 Source connector. ```bash confluent local load s3-source --config quickstart-s3source.properties ``` #### IMPORTANT Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) in production environments. 6. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status s3-source ``` 7. Confirm that the messages are being sent to Kafka. ```bash kafka-avro-console-consumer \ --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 \ --topic copy_of_s3_topic \ --from-beginning | jq '.' ``` 8. The response should be 18 records as follows. ```bash {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} {"f1": "value4"} {"f1": "value5"} {"f1": "value6"} {"f1": "value7"} {"f1": "value8"} {"f1": "value9"} {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} {"f1": "value4"} {"f1": "value5"} {"f1": "value6"} {"f1": "value7"} {"f1": "value8"} {"f1": "value9"} ``` ## Quick Start This quick start uses the SFTP Sink connector to export data produced by the Avro console producer to an SFTP directory. First, start all the necessary services using the Confluent CLI. ```bash confluent local start ``` Every service starts in order, printing a message with its status: ```bash Starting Zookeeper Zookeeper is [UP] Starting Kafka Kafka is [UP] Starting Schema Registry Schema Registry is [UP] Starting Kafka REST Kafka REST is [UP] Starting Connect Connect is [UP] Starting KSQL Server KSQL Server is [UP] Starting Control Center Control Center is [UP] ``` Next, start the Avro console producer to import a few records to Kafka: ```bash ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test_sftp_sink \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}' ``` In the console producer, enter the following: ```bash {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} ``` The three records entered are published to the Kafka topic `test_sftp_sink` in Avro format.
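The `value.schema` passed to the console producer above is a plain Avro record schema with a single string field, `f1`. As an illustration only, the following sketch serializes one of the quick start records to raw Avro bytes with the `fastavro` library (an external library, not something this quick start installs); note that the console producer additionally registers the schema with Schema Registry and prefixes each message with the Confluent wire-format header (magic byte plus schema ID):

```python
import io
import json
from fastavro import parse_schema, schemaless_writer  # external library, assumption

# The same value.schema string passed to kafka-avro-console-producer above.
schema = parse_schema(json.loads(
    '{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}'
))

# Serialize one of the quick start records to raw Avro bytes.
buf = io.BytesIO()
schemaless_writer(buf, schema, {"f1": "value1"})
print(len(buf.getvalue()), "Avro-encoded bytes")

# Note: the console producer also prepends the Confluent wire-format header
# (magic byte + schema ID) before the Avro payload shown here.
```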
Before starting the connector, ensure the configuration in `etc/kafka-connect-sftp/quickstart-sftp.properties` matches your SFTP setup (for example, `sftp.hostname` must point to the proper SFTP host). Then, start the connector by loading its configuration with the following command: ```bash confluent local load sftp-sink --config etc/kafka-connect-sftp/quickstart-sftp.properties { "name": "sftp-sink", "config": { "topics": "test_sftp_sink", "tasks.max": "1", "connector.class": "io.confluent.connect.sftp.SftpSinkConnector", "confluent.topic.bootstrap.servers": "localhost:9092", "partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner", "schema.generator.class": "io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator", "flush.size": "3", "schema.compatibility": "NONE", "format.class": "io.confluent.connect.sftp.sink.format.avro.AvroFormat", "storage.class": "io.confluent.connect.sftp.sink.storage.SftpSinkStorage", "sftp.host": "localhost", "sftp.port": "2222", "sftp.username": "foo", "sftp.password": "pass", "sftp.working.dir": "/share", "name": "sftpconnector" }, "tasks": [] } ``` To check that the connector started successfully, view the Connect worker’s log by entering: ```bash confluent local services connect log ``` Towards the end of the log you should see that the connector starts, logs a few messages, and then exports data from Kafka to SFTP. Once the connector finishes ingesting data to SFTP, check that the data is available in the SFTP working directory. You should see a file named `/topics/test_sftp_sink/partition=0/test_sftp_sink+0+0000000000.avro`. The file name is encoded as `topic+kafkaPartition+startOffset+endOffset.format`. To extract the contents of the file, use `avro-tools-1.8.2.jar` (available in the [Apache Archives](http://archive.apache.org/dist/avro/avro-1.8.2/java/avro-tools-1.8.2.jar)). Move `avro-tools-1.8.2.jar` to SFTP’s working directory and run the following command: ```bash java -jar avro-tools-1.8.2.jar tojson //topics/test_sftp_sink/partition=0/test_sftp_sink+0+0000000000.avro ``` You should see the following output: ```bash {"f1":"value1"} {"f1":"value2"} {"f1":"value3"} ``` Finally, stop the Connect worker as well as all the rest of Confluent Platform by running: ```bash confluent local stop ``` Your output should resemble: ```none Stopping Control Center Control Center is [DOWN] Stopping KSQL Server KSQL Server is [DOWN] Stopping Connect Connect is [DOWN] Stopping Kafka REST Kafka REST is [DOWN] Stopping Schema Registry Schema Registry is [DOWN] Stopping Kafka Kafka is [DOWN] Stopping Zookeeper Zookeeper is [DOWN] ``` *Or*, stop all the services and additionally wipe out any data generated during this quick start by running: ```bash confluent local destroy ``` Your output should resemble: ```bash Stopping Control Center Control Center is [DOWN] Stopping KSQL Server KSQL Server is [DOWN] Stopping Connect Connect is [DOWN] Stopping Kafka REST Kafka REST is [DOWN] Stopping Schema Registry Schema Registry is [DOWN] Stopping Kafka Kafka is [DOWN] Stopping Zookeeper Zookeeper is [DOWN] Deleting: /var/folders/ty/rqbqmjv54rg_v10ykmrgd1_80000gp/T/confluent.PkQpsKfE ```
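As an alternative to moving `avro-tools` onto the SFTP host in the SFTP quick start above, you can confirm the exported object directly over SFTP. The following is a minimal sketch using the `paramiko` library (an assumption, not part of Confluent Platform), with the host, port, credentials, and working directory taken from the quick start properties shown earlier:

```python
import paramiko  # external library, assumption; not installed by the quick start

# Connection details matching the quickstart-sftp.properties shown above.
transport = paramiko.Transport(("localhost", 2222))
transport.connect(username="foo", password="pass")
sftp = paramiko.SFTPClient.from_transport(transport)

# The connector writes under the SFTP working directory (/share in this quick start);
# adjust the path if your SFTP server chroots users into that directory.
partition_dir = "/share/topics/test_sftp_sink/partition=0"
for name in sftp.listdir(partition_dir):
    print(name)  # expect test_sftp_sink+0+0000000000.avro once the flush completes

sftp.close()
transport.close()
```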
### Property-based example 1. Create a `snmp-trap-source-quickstart.properties` file with the following contents. This configuration is typically used with [standalone workers](/platform/current/connect/concepts.html#standalone-workers): ```properties name=SnmpTrapSourceConnector tasks.max=1 connector.class=io.confluent.connect.snmp.SnmpTrapSourceConnector kafka.topic=snmp-kafka-topic snmp.batch.size=50 snmp.listen.address= snmp.listen.port= snmp.v3.enabled=true v3.security.context.users= v3.$username.auth.password= v3.$username.privacy.password= confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 confluent.license= ``` The license-related properties define the Confluent license stored in Kafka, so you need the Kafka bootstrap addresses. The `replication.factor` may not be larger than the number of Kafka brokers in the destination cluster, so it is set to `1` here for demonstration purposes. Always set it to a value of at least 3 in production configurations. 2. Load the SNMP Trap Source connector. ```bash confluent local load snmp-trap-source --config snmp-trap-source-quickstart.properties ``` It’s important that you don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) in production environments. 3. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status snmp-trap-source ``` 4. The SNMP device should be running and generating PDUs. The connector will listen and push PDUs of type trap to a Kafka topic. 5. Confirm that the messages are being sent to Kafka. ```bash kafka-avro-console-consumer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic snmp-kafka-topic --from-beginning ``` A sample SNMP PDU of type trap for the `sysDescr` OID (see [https://www.alvestrand.no/objectid/1.3.6.1.2.1.1.1.html](https://www.alvestrand.no/objectid/1.3.6.1.2.1.1.1.html)) might look like the following: ```bash TRAP[ { contextEngineID=80:00:00:59:03:78:d2:94:b8:9f:95, contextName= }, requestID=2058388122, errorStatus=0, errorIndex=0, VBS[ 1.3.6.1.2.1.1.1.0 = 24-Port Gigabit Smart Switch with PoE and 4 SFP uplinks ] ] ``` Data in the Kafka topic: ```bash { "peerAddress":"127.0.0.1/55159", "securityName":"admin", "variables":[ { "oid":"1.3.6.1.2.1.1.1.0", "type":"octetString", "counter32":null, "counter64":null, "gauge32":null, "integer":null, "ipaddress":null, "null":null, "objectIdentifier":null, "octetString":null, "opaque":null, "timeticks":null, "metadata":{ "string":"24-Port Gigabit Smart Switch with PoE and 4 SFP uplinks" } }] } ```
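Each record the SNMP Trap Source connector writes carries its variable bindings in the `variables` array, as shown above. The following is a minimal sketch, assuming the record shape shown above with hypothetical values, that extracts the OID/value pairs from one such record:

```python
import json

# A record from snmp-kafka-topic, shaped like the sample above (hypothetical values).
record = json.loads("""
{
  "peerAddress": "127.0.0.1/55159",
  "securityName": "admin",
  "variables": [
    {
      "oid": "1.3.6.1.2.1.1.1.0",
      "type": "octetString",
      "octetString": null,
      "metadata": {"string": "24-Port Gigabit Smart Switch with PoE and 4 SFP uplinks"}
    }
  ]
}
""")

# Print each variable binding as an (OID, value) pair. The populated field matches
# the "type" attribute; in the sample above the decoded string is carried in metadata.
for binding in record["variables"]:
    value = binding.get(binding["type"]) or binding.get("metadata", {}).get("string")
    print(binding["oid"], "=", value)
```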
## Enable monitoring You must enable monitoring explicitly on each ksqlDB server. To enable it in a Docker-based deployment, export an environment variable named `KSQL_JMX_OPTS` with your JMX configuration and expose the port that JMX will communicate over. The following Docker Compose example shows how you can configure monitoring for a ksqlDB server. The surrounding components, like the broker and CLI, are omitted for brevity. You can see an example of a complete setup in the [ksqlDB Quick Start](../quickstart.md#ksqldb-quick-start). ```yaml ksqldb-server: image: confluentinc/cp-ksqldb-server:8.1.0 hostname: ksqldb-server container_name: ksqldb-server depends_on: - broker - schema-registry ports: - "8088:8088" - "1099:1099" environment: KSQL_LISTENERS: "http://0.0.0.0:8088" KSQL_BOOTSTRAP_SERVERS: "broker:9092" KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true" KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true" KSQL_KSQL_QUERY_PULL_METRICS_ENABLED: "true" KSQL_JMX_OPTS: > -Djava.rmi.server.hostname=localhost -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.rmi.port=1099 ``` With respect to monitoring, here is what this configuration does: - The environment variable `KSQL_JMX_OPTS` is supplied to the server with various arguments. The `>` character lets you write a multi-line string in YAML, which makes this long argument easier to read. The advertised hostname, port, and security settings are configured. JMX has a wide range of [configuration options](https://docs.oracle.com/javase/8/docs/technotes/guides/management/agent.html), and you can set these however you like. - Port `1099` is exposed, which corresponds to the JMX port set in the `KSQL_JMX_OPTS` configuration. This enables remote monitoring tools to communicate with ksqlDB’s process. ## Pre-flight checks Before going through the tutorial, check that the environment has started correctly. If any of these pre-flight checks fails, consult the [Troubleshooting the scripted demo](teardown.md#cp-demo-troubleshooting) section. 1. Verify that the Docker containers show an `Up` state. ```bash docker compose ps ``` Your output should resemble: ```text NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS alertmanager confluentinc/cp-enterprise-alertmanager:2.2.0 "alertmanager-start" alertmanager 2 hours ago Up 2 hours 0.0.0.0:9093->9093/tcp, [::]:9093->9093/tcp connect localbuild/connect:8.0.0-8.0.0 "/etc/confluent/dock…" connect 2 hours ago Up 2 hours (healthy) 0.0.0.0:8083->8083/tcp, [::]:8083->8083/tcp control-center confluentinc/cp-enterprise-control-center-next-gen:2.2.0 "/etc/confluent/dock…" control-center 2 hours ago Up 2 hours (healthy) 0.0.0.0:9021-9022->9021-9022/tcp, [::]:9021-9022->9021-9022/tcp elasticsearch docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.0 "/tini -- /usr/local…" elasticsearch 2 hours ago Up 2 hours (healthy) 0.0.0.0:9200->9200/tcp, [::]:9200->9200/tcp, 0.0.0.0:9300->9300/tcp, [::]:9300->9300/tcp kafka1 confluentinc/cp-server:8.0.0 "bash -c 'if [ ! -f …" kafka1 2 hours ago Up 2 hours (healthy) 0.0.0.0:8091->8091/tcp, [::]:8091->8091/tcp, 0.0.0.0:9091->9091/tcp, [::]:9091->9091/tcp, 0.0.0.0:10091->10091/tcp, [::]:10091->10091/tcp, 0.0.0.0:11091->11091/tcp, [::]:11091->11091/tcp, 0.0.0.0:12091->12091/tcp, [::]:12091->12091/tcp kafka2 confluentinc/cp-server:8.0.0 "bash -c 'if [ !
-f …" kafka2 2 hours ago Up 2 hours (healthy) 0.0.0.0:8092->8092/tcp, [::]:8092->8092/tcp, 0.0.0.0:9092->9092/tcp, [::]:9092->9092/tcp, 0.0.0.0:10092->10092/tcp, [::]:10092->10092/tcp, 0.0.0.0:11092->11092/tcp, [::]:11092->11092/tcp, 0.0.0.0:12092->12092/tcp, [::]:12092->12092/tcp kibana docker.elastic.co/kibana/kibana-oss:7.10.0 "/usr/local/bin/dumb…" kibana 2 hours ago Up 2 hours (healthy) 0.0.0.0:5601->5601/tcp, [::]:5601->5601/tcp ksqldb-cli confluentinc/cp-ksqldb-cli:8.0.0 "/bin/sh" ksqldb-cli 2 hours ago Up 2 hours ksqldb-server confluentinc/cp-ksqldb-server:8.0.0 "/etc/confluent/dock…" ksqldb-server 2 hours ago Up 2 hours (healthy) 0.0.0.0:8088-8089->8088-8089/tcp, [::]:8088-8089->8088-8089/tcp openldap osixia/openldap:1.3.0 "/container/tool/run…" openldap 2 hours ago Up 2 hours 389/tcp, 636/tcp prometheus confluentinc/cp-enterprise-prometheus:2.2.0 "prometheus-start" prometheus 2 hours ago Up 2 hours 0.0.0.0:9090->9090/tcp, [::]:9090->9090/tcp restproxy confluentinc/cp-kafka-rest:8.0.0 "/etc/confluent/dock…" restproxy 2 hours ago Up 2 hours 0.0.0.0:8086->8086/tcp, [::]:8086->8086/tcp schemaregistry confluentinc/cp-schema-registry:8.0.0 "/etc/confluent/dock…" schemaregistry 2 hours ago Up 2 hours (healthy) 0.0.0.0:8085->8085/tcp, [::]:8085->8085/tcp streams-demo cnfldemos/cp-demo-kstreams:0.0.12 "/app/start.sh" streams-demo 2 hours ago Up 2 hours 9092/tcp tools cnfldemos/tools:0.3 "/bin/bash" tools 2 hours ago Up 2 hours ``` 2. Jump to the end of the entire `cp-demo` pipeline and view the Kibana dashboard at [http://localhost:5601/app/dashboards#/view/Overview](http://localhost:5601/app/dashboards#/view/Overview) . This is a cool view and validates that the `cp-demo` start script completed successfully. ![image](tutorials/cp-demo/images/kibana-dashboard.png) 3. View the full Confluent Platform configuration in the [docker-compose.yml](https://github.com/confluentinc/cp-demo/tree/latest/docker-compose.yml) file. 4. View the Kafka Streams application configuration in the [client configuration](https://github.com/confluentinc/cp-demo/tree/latest/env_files/streams-demo.env) file, set with security parameters to the Kafka cluster and Schema Registry. ### Authorization with RBAC 1. Verify which users are configured to be super users. ```bash docker compose logs kafka1 | grep "super.users =" ``` Your output should resemble the following. Notice this authorizes each service name which authenticates as itself, as well as the unauthenticated `PLAINTEXT` which authenticates as `ANONYMOUS` (for demo purposes only): ```bash kafka1 | super.users = User:admin;User:mds;User:superUser;User:ANONYMOUS ``` 2. From the Confluent Control Center UI, in the Administration menu, click the **Manage role assignments** option. Click **Assignments**, and then the Kafka cluster ID. 3. From the **Topic** list, verify that the LDAP user `appSA` is allowed to access a few topics, including any topic whose name starts with **wikipedia**. This role assignment was done during `cp-demo` startup in the [create-role-bindings.sh script](https://github.com/confluentinc/cp-demo/tree/latest/scripts/helper/create-role-bindings.sh). 4. Verify that LDAP user `appSA` (which is not a super user) can consume messages from topic `wikipedia.parsed`. Notice that it is configured to authenticate to brokers with mTLS and authenticate to Schema Registry with LDAP. 
```bash docker compose exec connect kafka-avro-console-consumer \ --bootstrap-server kafka1:11091,kafka2:11092 \ --consumer-property security.protocol=SSL \ --consumer-property ssl.truststore.location=/etc/kafka/secrets/kafka.appSA.truststore.jks \ --consumer-property ssl.truststore.password=confluent \ --consumer-property ssl.keystore.location=/etc/kafka/secrets/kafka.appSA.keystore.jks \ --consumer-property ssl.keystore.password=confluent \ --consumer-property ssl.key.password=confluent \ --property schema.registry.url=https://schemaregistry:8085 \ --property schema.registry.ssl.truststore.location=/etc/kafka/secrets/kafka.appSA.truststore.jks \ --property schema.registry.ssl.truststore.password=confluent \ --property basic.auth.credentials.source=USER_INFO \ --property basic.auth.user.info=appSA:appSA \ --group wikipedia.test \ --topic wikipedia.parsed \ --max-messages 5 ``` 5. Verify that LDAP user `badapp` cannot consume messages from topic `wikipedia.parsed`. ```bash docker compose exec connect kafka-avro-console-consumer \ --bootstrap-server kafka1:11091,kafka2:11092 \ --consumer-property security.protocol=SSL \ --consumer-property ssl.truststore.location=/etc/kafka/secrets/kafka.badapp.truststore.jks \ --consumer-property ssl.truststore.password=confluent \ --consumer-property ssl.keystore.location=/etc/kafka/secrets/kafka.badapp.keystore.jks \ --consumer-property ssl.keystore.password=confluent \ --consumer-property ssl.key.password=confluent \ --property schema.registry.url=https://schemaregistry:8085 \ --property schema.registry.ssl.truststore.location=/etc/kafka/secrets/kafka.badapp.truststore.jks \ --property schema.registry.ssl.truststore.password=confluent \ --property basic.auth.credentials.source=USER_INFO \ --property basic.auth.user.info=badapp:badapp \ --group wikipedia.test \ --topic wikipedia.parsed \ --max-messages 5 ``` Your output should resemble: ```bash ERROR [Consumer clientId=consumer-wikipedia.test-1, groupId=wikipedia.test] Topic authorization failed for topics [wikipedia.parsed] org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to access topics: [wikipedia.parsed] ``` 6. Create role bindings to permit `badapp` client to consume from topic `wikipedia.parsed` and its related subject in Schema Registry. Get the Kafka cluster ID: ```none KAFKA_CLUSTER_ID=$(curl -s https://localhost:8091/v1/metadata/id --tlsv1.2 --cacert scripts/security/snakeoil-ca-1.crt | jq -r ".id") ``` Create the role bindings: ```text # Create the role binding for the topic ``wikipedia.parsed`` docker compose exec tools bash -c "confluent iam rbac role-binding create \ --principal User:badapp \ --role ResourceOwner \ --resource Topic:wikipedia.parsed \ --kafka-cluster-id $KAFKA_CLUSTER_ID" # Create the role binding for the group ``wikipedia.test`` docker compose exec tools bash -c "confluent iam rbac role-binding create \ --principal User:badapp \ --role ResourceOwner \ --resource Group:wikipedia.test \ --kafka-cluster-id $KAFKA_CLUSTER_ID" # Create the role binding for the subject ``wikipedia.parsed-value``, i.e., the topic-value (versus the topic-key) docker compose exec tools bash -c "confluent iam rbac role-binding create \ --principal User:badapp \ --role ResourceOwner \ --resource Subject:wikipedia.parsed-value \ --kafka-cluster-id $KAFKA_CLUSTER_ID \ --schema-registry-cluster schema-registry" ``` 7. Verify that LDAP user `badapp` now can consume messages from topic `wikipedia.parsed`. 
```bash docker compose exec connect kafka-avro-console-consumer \ --bootstrap-server kafka1:11091,kafka2:11092 \ --consumer-property security.protocol=SSL \ --consumer-property ssl.truststore.location=/etc/kafka/secrets/kafka.badapp.truststore.jks \ --consumer-property ssl.truststore.password=confluent \ --consumer-property ssl.keystore.location=/etc/kafka/secrets/kafka.badapp.keystore.jks \ --consumer-property ssl.keystore.password=confluent \ --consumer-property ssl.key.password=confluent \ --property schema.registry.url=https://schemaregistry:8085 \ --property schema.registry.ssl.truststore.location=/etc/kafka/secrets/kafka.badapp.truststore.jks \ --property schema.registry.ssl.truststore.password=confluent \ --property basic.auth.credentials.source=USER_INFO \ --property basic.auth.user.info=badapp:badapp \ --group wikipedia.test \ --topic wikipedia.parsed \ --max-messages 5 ``` 8. View all the role bindings that were configured for RBAC in this cluster. ```bash ./scripts/validate/validate_bindings.sh ``` 9. Because the Kafka cluster is configured for [SASL](../../security/authentication/sasl/plain/overview.md#kafka-sasl-auth-plain), any administrative commands must authenticate directly with the Kafka brokers. This authentication is provided via a client properties file specified with the `--command-config` flag on the command-line tool itself. For example, to run a command like the [consumer throttle script](https://github.com/confluentinc/cp-demo/tree/latest/scripts/app/throttle_consumer.sh), you must include this flag pointing to a file with the correct security credentials. This replaces the previous method of relying on a pre-configured `KAFKA_OPTS` environment variable on a broker container. Consequently, the command is no longer restricted to running on a specific container like `kafka1` or `kafka2` and can be executed from any machine that has the configuration file and network access to the brokers. 10. Next step: Learn more about security with the [Security Tutorial](../../security/security_tutorial.md#security-tutorial). ### Configure clients from the Confluent CLI For [Confluent CLI](https://docs.confluent.io/confluent-cli/current/overview.html) frequent users, once you have set up context in the CLI, you can use one-line command [confluent kafka client-config create](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/client-config/index.html#confluent-kafka-client-config) to create a configuration file for connecting your client apps to Confluent Cloud. The following table lists client languages, corresponding language ID, and whether the language supports Confluent Cloud Schema Registry configuration. For languages that support Confluent Cloud Schema Registry configuration, you can optionally configure it for your client apps by passing Schema Registry information via the flags to the command. 
| Language | Language ID | Support for Confluent Cloud Schema Registry | Notes | |-------------|---------------|-----------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------| | Clojure | `clojure` | No | | | C/C++ | `cpp` | No | See examples: [C/C++ examples (librdkafka)](https://github.com/edenhill/librdkafka/tree/master/examples) | | C# | `csharp` | No | | | Go | `go` | Yes | See examples: [confluent-kafka-go/examples](https://github.com/confluentinc/confluent-kafka-go/tree/master/examples) | | Groovy | `groovy` | No | | | Java | `java` | Yes | | | Kotlin | `kotlin` | No | | | Ktor | `ktor` | Yes | | | JavaScript | `javascript` | Yes | See examples: [confluent-kafka-javascript/examples](https://github.com/confluentinc/confluent-kafka-javascript/tree/master/examples) | | Python | `python` | Yes | See examples: [confluent-kafka-python/examples](https://github.com/confluentinc/confluent-kafka-python/tree/master/examples) | | REST API | `restapi` | Yes | | | Ruby | `ruby` | No | | | Rust | `rust` | No | | | Scala | `scala` | No | | | Spring Boot | `springboot` | Yes | | Prerequisites: : - [Access to Confluent Cloud](https://www.confluent.io/confluent-cloud/) with an active cluster. - [Install the Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html). 1. Log in to your cluster using the [confluent login](https://docs.confluent.io/confluent-cli/current/command-reference/confluent_login.html) command with the cluster URL specified. ```none confluent login ``` ```none Enter your Confluent Cloud credentials: Email: susan@myemail.com Password: ``` 2. Set the Confluent Cloud [environment](../security/access-control/hierarchy/cloud-environments.md#cloud-environments). 1. Get the environment ID. ```none confluent environment list ``` Your output should resemble: ```none Id | Name +--------------+--------------------+ * t2703 | default env-abc123 | demo-env-102893 env-xyz123 | ccloud-demo env-wxy123 | data-lineage-demo env-abc12d | my-new-environment ``` 2. Set the environment using the ID (``). ```none confluent environment use ``` Your output should resemble: ```none Now using "env-xyz123" as the default (active) environment. ``` 3. Set the cluster to use. 1. Get the cluster ID. ```none confluent kafka cluster list ``` Your output should resemble: ```none Id | Name | Type | Cloud | Region | Availability | Status +-------------+-----------+-------+----------+----------+--------------+--------+ lkc-oymmj | cluster_1 | BASIC | gcp | us-east4 | single-zone | UP * lkc-7k6kj | cluster_0 | BASIC | gcp | us-east1 | single-zone | UP ``` 2. Set the cluster using the ID (``). This is the cluster where the commands are run. ```none confluent kafka cluster use ``` To verify the selected cluster after setting it, type `confluent kafka cluster list` again. The selected cluster will have an asterisk (`*`) next to it. 4. Create an API key and secret, and save them. You can generate the API key on the Confluent CLI or from the Confluent Cloud Console. Be sure to save the API key and secret. ### Confluent CLI 1. Run the following command to create the API key and secret, using the ID (``). > ```bash > confluent api-key create --resource > ``` > Your output should resemble: > ```none > It may take a couple of minutes for the API key to be ready. > Save the API key and secret. The secret is not retrievable later. 
> +---------+------------------------------------------------------------------+ > | API Key | ABC123xyz | > | Secret | 123xyzABC123xyzABC123xyzABC123xyzABC123xyzABC123xyzABC123xyzABCx | > +---------+------------------------------------------------------------------+ > ``` For more information, see [Use API Keys to Authenticate to Confluent Cloud](../security/authenticate/workload-identities/service-accounts/api-keys/overview.md#cloud-api-key-resource). ### Confluent Cloud Console > 1. In the console, click the **Kafka API keys** tab and click **Create key**. > Save the key and secret, then click the checkbox next to **I have saved my API key and secret > and am ready to continue.** > ![image](images/cloud-api-key-confirm.png) > 2. Add the API secret with `confluent api-key store `. When you create an API > key with the CLI, it is automatically stored locally. However, when you create an API key using > the console, API, or with the CLI on another machine, the secret is not available for CLI use until > you store it. This is required because secrets cannot be retrieved after creation. > ```bash > confluent api-key store --resource > ``` > For more information, see [Use API Keys to Authenticate to Confluent Cloud](../security/authenticate/workload-identities/service-accounts/api-keys/overview.md#cloud-api-key-resource). 5. Set the API key to use for Confluent CLI commands, using the ID (``). 1. Create a client configuration file for the language of your choice, using language ID (``). Then, copy and paste the displayed configuration into your client application source code. See [Client Language Table](#client-language-table) for a list of language IDs and whether the language supports Schema Registry configuration. - For languages that do NOT support Schema Registry configuration, run the following command: ```bash confluent kafka client-config create ``` - For languages that support Schema Registry configuration, run the following command: ```bash confluent kafka client-config create \ --schema-registry-api-key \ --schema-registry-api-secret ``` * For tips and recommendations for configuring resilient clients, see [Client Configuration Settings for Confluent Cloud](client-configs.md#client-producer-consumer-config-recs-cc). * For more information about using the CLI, see [confluent kafka client-config create](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/client-config/index.html#confluent-kafka-client-config). ## Features The Amazon DynamoDB CDC Source connector includes the following features: * **IAM user authentication**: The connector supports authenticating to DynamoDB using IAM user access credentials. * **Provider integration support**: The connector supports IAM role-based authorization using Confluent Provider Integration. For more information about provider integration setup, see the [IAM roles authentication](#cc-amazon-dynamodb-cdc-source-setup-connection). * **Customizable API endpoints**: The connector allows you to specify an AWS DynamoDB API and Resource Group Tag API endpoint. * **Kafka cluster authentication customization**: The connector supports authenticating a Kafka cluster using API keys and/or service accounts. * **Snapshot mode customization**: The connector allows you to configure either of the following modes for snapshots: - **SNAPSHOT**: Only allows a one-time scan (Snapshot) of the existing data in the source tables simultaneously. 
- **CDC**: Only allows CDC with DynamoDB streams, without an initial snapshot, for all streams simultaneously. - **SNAPSHOT_CDC** (Default): Allows an initial snapshot of all configured tables and, once the snapshot is complete, starts CDC streaming using DynamoDB streams. * **Seamless table streaming**: The connector supports the following two modes to provide seamless table streaming: - **TAG_MODE**: Auto-discover multiple DynamoDB tables and stream them simultaneously (that is, `dynamodb.table.discovery.mode` is set to `TAG`). - **INCLUDELIST_MODE**: Explicitly specify multiple DynamoDB table names and stream them simultaneously (that is, `dynamodb.table.discovery.mode` is set to `INCLUDELIST`). * **Automatic topic creation**: The connector supports the auto-creation of topics with the name of the table, with a customer-provided prefix and suffix using [TopicRegexRouter Single Message Transformation (SMT)](/platform/current/connect/transforms/topicregexrouter.html). * **Supported data formats**: The connector supports Avro, Protobuf, and JSON Schema output formats. [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro, JSON Schema, or Protobuf). For more information, see [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits). * **Schema management**: The connector supports Schema Registry, Schema Context and Reference Subject Naming Strategy. * **AWS DynamoDB scanning capabilities**: The connector includes the following scanning capabilities: - **Parallel Scans**: A DynamoDB table can be logically divided into multiple segments. The connector will divide the table into five logical segments and scan these segments in parallel. - **Pagination**: Tables are scanned sequentially, and a scan result’s response can fetch no more than 1 MB of data. Since tables can be large, scan request responses are paginated. With each response, a `LastEvaluatedKey` is returned. This `LastEvaluatedKey` from a scan response should be used as the `ExclusiveStartKey` for the next scan request. If no `LastEvaluatedKey` is returned, it indicates that the end of the result set has been reached. - **Non-isolated scans**: To scan an entire table, the task continues making multiple subsequent scan requests by submitting the appropriate `exclusiveStartKey` with each request. For a large table, this process may take hours, and the snapshot will capture items as they are at the time a scan request is made, and not from when the snapshot operation was started. - **Eventual Consistency**: By default, a scan uses eventually consistent reads when accessing items in a table. Therefore, the results of a scan may not reflect the latest item changes at the time the scan iterates through each item in the table. A snapshot, on the other hand, only captures data from committed transactions. As a result, a scan will not include data from ongoing uncommitted transactions. Additionally, a snapshot does not need to manage or track any ongoing transactions on a DynamoDB table. * **Custom offset support**: The connector allows you to configure [custom offsets](offsets.md#connect-custom-offsets) using the Confluent Cloud user interface to prevent data loss and data duplication. * **Tombstone event and record deletion management**: The connector allows you to manage tombstone events and deleted records.
Note that when the connector detects a delete event, it creates two event messages: - A delete event message with `op` type `d` and a `document` field containing the table primary key: ```json { "op": "d", "key": { "id": "5028" }, "value": { "document": "5028" } } ``` - A tombstone record with the Kafka record key set to the table primary key value and the Kafka record value set to `null`: ```json { "key": { "id": "5028" }, "value": null } ``` Note that Kafka log compaction uses this to know that it can delete all messages for this key. **Tombstone message sample** ```json { "topic": "table1", "key": { "id": "5028" }, "value": null, "partition": 0, "offset": 1 } ``` * **Lease table prefix customization**: The connector supports naming lease tables with a prefix. ### Using HTTPS Requests You can communicate with your hosted ksqlDB cluster by using the [ksqlDB REST API](/platform/current/ksqldb/developer-guide/ksqldb-rest-api/index.html). Run the following `curl` command to send a POST request to the `ksql` endpoint. In this example, the request runs the LIST STREAMS statement and the response contains details about the streams in the ksqlDB cluster. - Use the `--basic` flag to specify HTTP basic authentication, and set the **Accept** header of your request to `application/vnd.ksql.v1+json`. - Send your ksqlDB-specific API key and secret, separated by a colon, as the `--user` credentials. ```bash curl --http1.1 \ -X "POST" "https:///ksql" \ -H "Accept: application/vnd.ksql.v1+json" \ -H "Content-Type: application/json" \ --basic --user ":" \ -d $'{ "ksql": "LIST STREAMS;", "streamsProperties": {} }' ``` Your output should resemble: ```json [ { "@type": "streams", "statementText": "LIST STREAMS;", "streams": [ { "type": "STREAM", "name": "KSQL_PROCESSING_LOG", "topic": "pksqlc-zz321-processing-log", "keyFormat": "KAFKA", "valueFormat": "JSON", "isWindowed": false } ], "warnings": [] } ] ``` For more information, see [ksqlDB API](/platform/current/ksqldb/developer-guide/ksqldb-rest-api/index.html). For an example that shows fully-managed Confluent Cloud connectors in action with Confluent Cloud for Apache Flink, see the [Cloud ETL Demo](/platform/current/tutorials/examples/cloud-etl/docs/index.html). This example also shows how to use the Confluent CLI to manage your resources in Confluent Cloud. [![image](images/topology.png)](https://docs.confluent.io/platform/current/tutorials/examples/cloud-etl/docs/index.html) ## Flags ```none --bootstrap string Kafka cluster endpoint (Confluent Cloud) or a comma-separated list of broker hosts, each formatted as "host" or "host:port" (Confluent Platform). --group string Consumer group ID. (default "confluent_cli_consumer_") -b, --from-beginning Consume from beginning of the topic. --offset int The offset from the beginning to consume from. --partition int32 The partition to consume from. (default -1) --key-format string Format of message key as "string", "avro", "double", "integer", "jsonschema", or "protobuf". Note that schema references are not supported for Avro. (default "string") --value-format string Format message value as "string", "avro", "double", "integer", "jsonschema", or "protobuf". Note that schema references are not supported for Avro. (default "string") --print-key Print key of the message. --print-offset Print partition number and offset of the message. --full-header Print complete content of message headers. --delimiter string The delimiter separating each key and value. (default "\t") --timestamp Print message timestamp in milliseconds.
--config strings A comma-separated list of configuration overrides ("key=value") for the consumer client. For a full list, see https://docs.confluent.io/platform/current/clients/librdkafka/html/md_CONFIGURATION.html --config-file string The path to the configuration file for the consumer client, in JSON or Avro format. --schema-registry-endpoint string Endpoint for Schema Registry cluster. --api-key string API key. --api-secret string API secret. --schema-registry-context string The Schema Registry context under which to look up schema ID. --schema-registry-api-key string Schema registry API key. --schema-registry-api-secret string Schema registry API secret. --cluster string Kafka cluster ID. --context string CLI context name. --environment string Environment ID. --certificate-authority-path string File or directory path to one or more Certificate Authority certificates for verifying the broker's key with SSL. --username string SASL_SSL username for use with PLAIN mechanism. --password string SASL_SSL password for use with PLAIN mechanism. --cert-location string Path to client's public key (PEM) used for SSL authentication. --key-location string Path to client's private key (PEM) used for SSL authentication. --key-password string Private key passphrase for SSL authentication. --protocol string Specify the broker communication protocol as "PLAINTEXT", "SASL_SSL", or "SSL". (default "SSL") --sasl-mechanism string SASL_SSL mechanism used for authentication. (default "PLAIN") --client-cert-path string File or directory path to client certificate to authenticate the Schema Registry client. --client-key-path string File or directory path to client key to authenticate the Schema Registry client. ``` ## Control Center features Control Center includes the following pages where you can drill down to view data and configure features in your Kafka environment. The following table lists Control Center pages and what they display depending on the mode for Confluent Control Center. | Control Center feature | Normal mode | Reduced infrastructure mode | |--------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | [Clusters overview](clusters.md#controlcenter-userguide-clusters) | View healthy and unhealthy clusters at a glance and search for a cluster being managed by Control Center. Click on a cluster tile to drill into views of critical metrics and connected services for that cluster. | View healthy and unhealthy clusters, the number of topics, and connected services. | | [Brokers overview](brokers.md#controlcenter-userguide-brokers) | View broker partitioning and replication status, which broker is the active controller, and broker metrics like throughput and more. | Same as Normal mode. To access the Brokers page in Reduced infrastructure mode, Use the **Brokers** navigation menu entry. | | [Topics](topics/overview.md#c3-all-topics) | Add and edit topics, view production and consumption metrics for a topic. Browse, create, and download messages, and manage Schema Registry for topics. 
| Add and edit topics. Browse, create, download messages, and manage Schema Registry for topics. Note that internal topics are not created in Reduced infrastructure mode. | | [Connect](connect.md#controlcenter-userguide-connect) | Manage, monitor, and configure connectors with [Kafka Connect](/platform/current/connect/index.html#kafka-connect), the toolkit for connecting external systems to Kafka. | Same as Normal mode. | | [ksqlDB](ksql.md#controlcenter-userguide-ksql) | Develop applications against ksqlDB, the streaming SQL engine for Kafka. Use the ksqlDB page in Control Center to: run, view, and terminate SQL queries; browse and download messages from query results; add, describe, and drop streams and tables; and view schemas of available streams and tables in a cluster. | Same as Normal mode. | | [Consumers](clients/consumers.md#controlcenter-userguide-consumers) | View the consumer groups associated with a selected Kafka cluster, including the number of consumers per group, the number of topics being consumed, and consumer lag across all relevant topics. The Consumers feature also contains the redesigned streams monitoring page. | Same as Normal mode. | | [Replicators](replicators.md#controlcenter-userguide-replicators) | Monitor and configure replicated topics and create replica topics that preserve topic configuration in the source cluster. | Configure replicated topics and create replica topics that preserve topic configuration in the source cluster. | | [Cluster Settings](clusters.md#controlcenter-userguide-cluster-settings) | View and edit cluster properties and broker configurations. | Same as Normal mode. | | [Alerts](alerts/concepts.md#concepts-alerts) | Use Alerts to define the trigger criteria for anomalous events that occur during data monitoring and to trigger an alert when those events occur. Set triggers, actions, and view alert history across all of your Control Center clusters. | Use Alerts to define a limited set of triggers that do not rely on monitoring and/or metrics data (Cluster down, consumer lag and consumer lead). | ### Initialization The Consumer is configured using a dictionary in the examples below. If you are running Kafka locally, you can initialize the Consumer as shown below. ```python from confluent_kafka import Consumer conf = {'bootstrap.servers': 'host1:9092,host2:9092', 'group.id': 'foo', 'auto.offset.reset': 'smallest'} consumer = Consumer(conf) ``` If you are connecting to a Kafka cluster in Confluent Cloud, you need to provide credentials for access. The example below shows using a cluster API key and secret. ```python from confluent_kafka import Consumer conf = {'bootstrap.servers': 'pkc-abcd85.us-west-2.aws.confluent.cloud:9092', 'security.protocol': 'SASL_SSL', 'sasl.mechanism': 'PLAIN', 'sasl.username': '', 'sasl.password': '', 'group.id': 'foo', 'auto.offset.reset': 'smallest'} consumer = Consumer(conf) ``` The `group.id` property is mandatory and specifies which consumer group the consumer is a member of. The `auto.offset.reset` property specifies what offset the consumer should start reading from in the event there are no committed offsets for a partition, or the committed offset is invalid (perhaps due to log truncation).
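Once a consumer is initialized, it is typically driven by a poll loop. The following is a minimal sketch, not taken from the client reference, that reuses the local configuration from the examples above and a hypothetical topic name; it polls for records, prints them, and closes the consumer cleanly:

```python
from confluent_kafka import Consumer

# Same local configuration as the first example above.
conf = {'bootstrap.servers': 'host1:9092,host2:9092',
        'group.id': 'foo',
        'auto.offset.reset': 'smallest'}

consumer = Consumer(conf)
consumer.subscribe(["my_topic"])  # hypothetical topic name

try:
    while True:
        msg = consumer.poll(timeout=1.0)  # wait up to 1 second for a record
        if msg is None:
            continue
        if msg.error():
            # Report delivery/consumption errors; handle or log as appropriate.
            print("Consumer error:", msg.error())
            continue
        print(f"{msg.topic()}[{msg.partition()}]@{msg.offset()}: {msg.value()}")
finally:
    consumer.close()  # leave the consumer group and release resources
```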
The local example below shows `enable.auto.commit` configured to `false` in the consumer. The default value is `True`. ```python from confluent_kafka import Consumer conf = {'bootstrap.servers': 'host1:9092,host2:9092', 'group.id': 'foo', 'enable.auto.commit': 'false', 'auto.offset.reset': 'earliest'} consumer = Consumer(conf) ``` * For information on the available configuration properties, refer to the [API Documentation](/platform/current/clients/confluent-kafka-python/html/index.html). * For a step-by-step tutorial using the Python client, including code samples for the producer and consumer, see [this guide](https://developer.confluent.io/get-started/python/). #### OAuth2 authentication example It is important to note that the connector’s OAuth2 configuration only allows for use of the Client Credentials grant type. 1. Run the demo app with the `oauth2` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=oauth2 ``` 2. Create a `http-sink.properties` file with the following contents: ```text name=HttpSinkOAuth2 topics=http-messages tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # connect reporter required bootstrap server reporter.bootstrap.servers=localhost:9092 reporter.result.topic.name=success-responses reporter.result.topic.replication.factor=1 reporter.error.topic.name=error-responses reporter.error.topic.replication.factor=1 # http sink connector configs http.api.url=http://localhost:8080/api/messages auth.type=OAUTH2 oauth2.token.url=http://localhost:8080/oauth/token oauth2.client.id=kc-client oauth2.client.secret=kc-secret ``` For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). 3. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). ## Quick Start In this quick start guide, the Marketo Source connector is used to consume records from the Marketo entities `leads`, `campaigns`, and `activities` (activity types `activities_add_to_nurture` and `activities_add_to_opportunity`), and send the records to the respective Kafka topics named `marketo_leads`, `marketo_campaigns`, and `marketo_activities`. 1. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). ```bash # run from your confluent platform installation directory confluent connect plugin install confluentinc/kafka-connect-marketo:latest ``` 2. Start Confluent Platform. ```bash confluent local start ``` 3. Check the status of all services. ```bash confluent local status ``` 4. Configure your connector by first creating a JSON file named `marketo-configs.json` with the following properties. Find the REST API endpoint URL from the process described in [Marketo REST API Quickstart](https://developers.marketo.com/blog/quick-start-guide-for-marketo-rest-api/). This endpoint URL will be used in the `marketo.url` configuration key (as shown in the following example) of the connector, but ensure you remove the path `rest` from the endpoint URL before using it in connector configurations. To determine the OAuth client ID and OAuth client secret, see [Marketo REST API Quickstart](https://developers.marketo.com/blog/quick-start-guide-for-marketo-rest-api/).
`tasks.max` should be 3 here as there are three entity types: `leads`, `campaigns` and `activities`. ```bash // substitute <> with your config { "name": "marketo-connector", "config": { "connector.class": "io.confluent.connect.marketo.MarketoSourceConnector", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "org.apache.kafka.connect.json.JsonConverter", "value.converter.schemas.enable": "false", "confluent.topic.bootstrap.servers": "127.0.0.1:9092", "confluent.topic.replication.factor": 1, "confluent.license": "", // leave it empty for evaluation license "tasks.max": 3, "poll.interval.ms": 1000, "topic.name.pattern": "marketo_${entityName}", "marketo.url": "https://.mktorest.com/", "marketo.since": "2020-07-01T00:00:00+00:00", "entity.names": "activities_add_to_nurture,activities_add_to_opportunity,campaigns,leads", "oauth2.client.id": "", "oauth2.client.secret": "" } } ``` 5. Start the Marketo Source connector by loading the connector’s configuration with the following command: ```bash confluent local load marketo-connector -- -d marketo-configs.json ``` 6. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status marketo-connector ``` 7. Create some `leads`, `activities` and `campaigns` records using [Marketo APIs](https://developers.marketo.com/rest-api/endpoint-reference/). Use POST or Bulk Import APIs of appropriate entities to inject some sample records. 8. Confirm the messages from entities `leads`, `activities`, and `campaigns` were delivered to the `marketo_leads`, `marketo_activities` and `marketo_campaigns` topics respectively, in Kafka. Note, it may take about a minute for assets (`campaigns`) and about 5 minutes or more (depending upon the time Marketo server instance takes to prepare the export file) for export entities (`leads` and `activities`). ```bash confluent local consume marketo_leads -- --from-beginning ``` #### Connector configuration 1. 
Create your `oracle-cdc-confluent-cloud.json` file based on the following example: ```json { "name": "OracleCDC_Confluent_Cloud", "config":{ "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector", "name": "OracleCDC_Confluent_Cloud", "tasks.max":3, "oracle.server": "", "oracle.sid":"", "oracle.pdb.name":"", "oracle.username": "", "oracle.password": "", "start.from":"snapshot", "redo.log.topic.name": "oracle-redo-log-topic", "redo.log.consumer.bootstrap.servers":"", "redo.log.consumer.sasl.jaas.config":"org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "redo.log.consumer.security.protocol":"SASL_SSL", "redo.log.consumer.sasl.mechanism":"PLAIN", "table.inclusion.regex":"", "_table.topic.name.template_":"Using template vars to set change event topic for each table", "table.topic.name.template": "${databaseName}.${schemaName}.${tableName}", "connection.pool.max.size": 20, "confluent.topic.replication.factor":3, "topic.creation.groups":"redo", "topic.creation.redo.include":"oracle-redo-log-topic", "topic.creation.redo.replication.factor":3, "topic.creation.redo.partitions":1, "topic.creation.redo.cleanup.policy":"delete", "topic.creation.redo.retention.ms":1209600000, "topic.creation.default.replication.factor":3, "topic.creation.default.partitions":5, "topic.creation.default.cleanup.policy":"compact", "confluent.topic.bootstrap.servers":"", "confluent.topic.sasl.jaas.config":"org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "confluent.topic.security.protocol":"SASL_SSL", "confluent.topic.sasl.mechanism":"PLAIN", "value.converter":"io.confluent.connect.avro.AvroConverter", "value.converter.basic.auth.credentials.source":"USER_INFO", "value.converter.schema.registry.basic.auth.user.info":":", "value.converter.schema.registry.url":"" } } ``` 2. Create `oracle-redo-log-topic`. Make sure the topic name matches the value you put for `"redo.log.topic.name"`. **Confluent Platform CLI** ```text bin/kafka-topics --create --topic oracle-redo-log-topic \ --bootstrap-server broker:9092 --replication-factor 1 \ --partitions 1 --config cleanup.policy=delete \ --config retention.ms=120960000 ``` **Confluent Cloud CLI** ```text confluent kafka topic create oracle-redo-log-topic \ --partitions 1 --config cleanup.policy=delete \ --config retention.ms=120960000 ``` 3. Start the Oracle CDC Source connector with the following command: ```text curl -s -H "Content-Type: application/json" -X POST -d @oracle-cdc-confluent-cloud.json http://localhost:8083/connectors/ | jq ``` ## Quick Start The quick start guide uses ServiceNow Sink connector to consume records from Kafka and send them to a ServiceNow table. This guide assumes multi-tenant environment is used. For local testing, refer to [Running Connect in standalone mode](/kafka-connectors/self-managed/userguide.html#configuring-and-running-workers). 1. Create a table called `test_table` in ServiceNow. ![image](images/servicenow_create_table.png) 2. Define three columns in the table. ![image](images/servicenow_define_columns.png) 3. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). ```bash # run from your confluent platform installation directory confluent-hub install confluentinc/kafka-connect-servicenow:latest ``` 4. Start the Confluent Platform. ```bash confluent local start ``` 5. Check the status of all services. ```bash confluent local services status ``` 6. 
Create a `servicenow-sink.json` file with the following contents: #### NOTE All user-defined tables in ServiceNow start with `u_`, ```bash // substitute <> with your config { "name": "ServiceNowSinkConnector", "config": { "connector.class": "io.confluent.connect.servicenow.ServiceNowSinkConnector", "topics": "test_table", "servicenow.url": "https://.service-now.com/", "tasks.max": "1", "servicenow.table": "u_test_table", "servicenow.user": "", "servicenow.password": "", "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.license": "", // leave it empty for evaluation license "confluent.topic.replication.factor": "1", "reporter.bootstrap.servers": "localhost:9092", "reporter.error.topic.name": "test-error", "reporter.error.topic.replication.factor": 1, "reporter.error.topic.key.format": "string", "reporter.error.topic.value.format": "string", "reporter.result.topic.name": "test-result", "reporter.result.topic.key.format": "string", "reporter.result.topic.value.format": "string", "reporter.result.topic.replication.factor": 1 } } ``` #### NOTE For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). 7. Load the ServiceNow Sink connector by posting configuration to Connect REST server. ```bash confluent local load ServiceNowSinkConnector --config servicenow-sink.json ``` 8. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status ServiceNowSinkConnector ``` 9. To produce some records into the `test_table` topic, first start a Kafka producer. #### NOTE All user-defined columns in ServiceNow start with `u_` ```bash kafka-avro-console-producer \ --broker-list localhost:9092 --topic test_table \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"u_name","type":"string"}, {"name":"u_price", "type": "float"}, {"name":"u_quantity", "type": "int"}]}' ``` 10. The console producer is now waiting for input, so you can go ahead and insert some records into the topic. ```json {"u_name": "scissors", "u_price": 2.75, "u_quantity": 3} {"u_name": "tape", "u_price": 0.99, "u_quantity": 10} {"u_name": "notebooks", "u_price": 1.99, "u_quantity": 5} ``` 11. Confirm the messages were delivered to the ServiceNow table by using the ServiceNow user interface. ![image](images/servicenow_result.png) ## Quick Start This quick start uses the Solace Source connector to consume records from a Solace PubSub+ Standard broker and send them to Kafka. 1. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). ```bash # run from your Confluent Platform installation directory confluent connect plugin install confluentinc/kafka-connect-solace-source:latest ``` 2. [Install the Solace JMS Client Library](#install-solace-client-jar). 3. Start the Confluent Platform. ```bash confluent local start ``` 4. Start a Solace PubSub+ Standard docker container. ```bash docker run -d --name "solace" \ -p 8080:8080 -p 55555:55555 -p 9000:9000 \ --shm-size=1000000000 \ --tmpfs /dev/shm \ --ulimit nofile=2448:38048 \ -e username_admin_globalaccesslevel=admin \ -e username_admin_password=admin \ solace/solace-pubsub-standard:9.1.0.77 ``` 5. 
Once the Solace Docker container has started, navigate to the [Solace UI](http://localhost:8080) and configure a `connector-quickstart` queue in the `Default` message VPN. 6. Publish messages to the Solace queue using the REST endpoint. ```bash curl -X POST -d "m1" http://localhost:9000/Queue/connector-quickstart -H "Content-Type: text/plain" -H "Solace-Message-ID: 1000" # repeat the above command to send additional messages (change the Solace-Message-ID header on each message) ``` 7. Create a `solace-source.json` file with the following contents: ```json { "name": "SolaceSourceConnector", "config": { "connector.class": "io.confluent.connect.solace.SolaceSourceConnector", "tasks.max": "1", "kafka.topic": "from-solace-messages", "solace.host": "smf://localhost:55555", "solace.username": "admin", "solace.password": "admin", "jms.destination.type": "queue", "jms.destination.name": "connector-quickstart", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` 8. Load the Solace Source connector. ```bash confluent local load solace --config solace-source.json ``` 9. Confirm the connector is in a `RUNNING` state. ```bash confluent local status SolaceSourceConnector ``` 10. Confirm the messages were delivered to the `from-solace-messages` topic in Kafka. ```bash kafka-avro-console-consumer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic from-solace-messages --from-beginning ``` ## Quick Start The following steps show the `SpoolDirCsvSourceConnector` loading a mock CSV file to a Kafka topic named `spooldir-testing-topic`. The other connectors are similar but load from different file types. Prerequisites : - [Confluent Platform](/platform/current/installation/index.html) - [Confluent CLI](https://docs.confluent.io/confluent-cli/current/installing.html) (requires separate installation) 1. Install the connector through the [Confluent Hub Client](/kafka-connectors/self-managed/confluent-hub/client.html). ```bash # run from your Confluent Platform installation directory confluent connect plugin install jcustenborder/kafka-connect-spooldir:latest ``` 2. Start Confluent Platform using the Confluent CLI [confluent local](https://docs.confluent.io/confluent-cli/current/command-reference/local/index.html) commands. ```bash confluent local start ``` 3. Create a data directory and generate test data. ```bash mkdir data && curl "https://api.mockaroo.com/api/58605010?count=1000&key=25fd9c80" > "data/csv-spooldir-source.csv" ``` 4. Set up directories for files with errors and files that finished successfully. ```bash mkdir error && mkdir finished ``` 5. Create a `spooldir.json` file with the following contents: ```json { "name": "CsvSpoolDir", "config": { "tasks.max": "1", "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector", "input.path": "/path/to/data", "input.file.pattern": "csv-spooldir-source.csv", "error.path": "/path/to/error", "finished.path": "/path/to/finished", "halt.on.error": "false", "topic": "spooldir-testing-topic", "csv.first.row.as.header": "true", "schema.generation.enabled": "true" } } ``` 6. Load the SpoolDir CSV Source connector.
```bash confluent local load spooldir --config spooldir.json ``` #### IMPORTANT Don’t use the [confluent local](https://docs.confluent.io/confluent-cli/current/command-reference/local/index.html) commands in production environments. 7. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status spooldir ``` 8. Confirm that the messages are being sent to Kafka. ```bash kafka-avro-console-consumer \ --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 \ --topic spooldir-testing-topic \ --from-beginning | jq '.' ``` 9. Confirm that the source CSV file has been moved to the `finished` directory. #### NOTE Mounting custom volumes does not support multiple PersistentVolumes for ZooKeeper and Kafka data. CFK configures and manages one PersistentVolume for ZooKeeper and Kafka data. The following are a few of the common use cases for custom volume mounts: * Third-party secrets providers As an alternative to using Kubernetes secrets to secure sensitive information, you can use a vault product like HashiCorp Vault, AWS Secrets Manager, and Azure KeyVault. You integrate a third-party secrets provider by configuring an ephemeral volume mount for the Confluent component pod that takes the credentials from the secrets provider. * Kafka connectors Some Kafka connectors require JARs, that are outside of the Connect plugin but need to be available to the Connect pods. You can create persistent volumes with the connector JARs and mount them on the Connect worker pods. * Multiple custom partitions For example, you could write logs to a separate persistent volume of your choice. In CFK, you mount custom volumes to Confluent component pods by defining custom volume mounts in the component custom resources (CRs), such as for Kafka, ZooKeeper, Control Center, Schema Registry, ksqlDB, Connect, and Kafka REST Proxy. The same volume will be mounted on all the pods in the component cluster in the specified paths. To mount custom volumes to a Confluent Platform component: 1. Configure the volumes according to the driver specification. 2. Add the following to the Confluent Platform component CR: ```yaml spec: mountedVolumes: --- [1] volumes: --- [2] volumeMounts: --- [3] ``` * [1] `mountedVolumes` is an array of the `volumes` and `volumeMounts` that are requested for this component. * [2] Required. `volumes` is an array of named volumes in a pod that may be accessed by any container in the pod. For the supported volume types and the specific configuration properties required for each volume type, see [Kubernetes Volume Types](https://kubernetes.io/docs/concepts/storage/volumes/#volume-types) for the supported volume types. * [3] Required. Describes mounting paths of the `volumes` within this container. For the configuration properties for volume mount, see [Kubernetes Pod volumeMounts](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#volumes-1). 3. Apply the CR using the `kubectl apply` command. Before the volumes and volume mounts are added to the component pod template, CFK performs a validation to ensure that there is no conflict with internal volume mounts. Reconcile will fail, and the error will be added to the CFK logs in the following cases: * A custom volume’s mount path conflicts with an internal mount path. 
These are the internal mounts used by Confluent Platform components:

  * `/mnt/config`
  * `/mnt/config/init`
  * `/mnt/config/shared`
  * `/mnt/data/data0`
  * `/mnt/plugins`
  * `/opt/confluentinc`

* A custom volume’s mount path conflicts with a custom-mounted secret.
* There is a conflict between the custom volume names or custom volume mount paths.

The following example mounts an Azure file volume and a HashiCorp Vault volume using a SecretProviderClass and a CSI driver:

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
spec:
  mountedVolumes:
    volumes:
      - name: azure
        azureFile:
          secretName: azure-secret
          shareName: aksshare
          readOnly: true
      - name: secrets-store-inline
        csi:
          driver: secrets-store.csi.k8s.io
          readOnly: true
          volumeAttributes:
            secretProviderClass: "vault-database"
    volumeMounts:
      - name: azure
        mountPath: /mnt/azurePath1
      - name: azure
        mountPath: /mnt/azurePath2
      - name: secrets-store-inline
        mountPath: "/mnt/secrets-store"
        readOnly: true
```

## External access to other Confluent Platform components using load balancers

External clients can connect to other Confluent Platform components using load balancers. The access endpoint of each Confluent Platform component is: `.`

For example, in the `example.com` domain with TLS enabled, you access the Confluent Platform components at the following endpoints:

* `https://connect.example.com`
* `https://replicator.example.com`
* `https://schemaregistry.example.com`
* `https://ksql.example.com`
* `https://controlcenter.example.com`
* `https://kafkarestproxy.example.com`

**To allow external access to these Confluent Platform components using load balancers:**

1. Set the following in the component CR and apply the configuration:

   ```yaml
   spec:
     externalAccess:
       type: loadBalancer
       loadBalancer:
         domain:                 --- [1]
         prefix:                 --- [2]
         sessionAffinity:        --- [3]
         sessionAffinityConfig:  --- [4]
           clientIP:
             timeoutSeconds:     --- [5]
   ```

   * [1] Required. Set `domain` to the domain name of your Kubernetes cluster. If you change this value on a running cluster, you must roll the cluster.
   * [2] Optional. Set `prefix` to change the default load balancer prefixes. The default is the component name, such as `controlcenter`, `connect`, `replicator`, `schemaregistry`, `ksql`. The value is used for the DNS entry. The component DNS name becomes `.`. If not set, the default DNS name is `.`, for example, `controlcenter.example.com`. You may want to change the default prefixes for each component to avoid DNS conflicts when running multiple Kafka clusters. If you change this value on a running cluster, you must roll the cluster.
   * [3] Required for REST Proxy when it is used for Kafka consumers, to enable client IP-based session affinity. In that case, set it to `ClientIP`. See [Kubernetes Service](https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies) for more information about session affinity.
   * [4] Contains the session affinity configuration when `sessionAffinity: ClientIP` is set in [3].
   * [5] Specifies the session sticky time, in seconds, for the `ClientIP` type. The value must be greater than `0` and less than or equal to `86400` (1 day). The default value is `10800` (3 hours).

2. Add a DNS entry for each Confluent Platform component that you added a load balancer to. Once the external load balancers are created, you add a DNS entry associated with the component load balancers to your DNS table (or whatever method you use to get DNS entries recognized by your provider environment).
You need the following to derive Confluent Platform component DNS entries: * Domain name of your Kubernetes cluster as set in Step #1 * The external IP of the component load balancers You can retrieve the external IP using the following command: ```bash kubectl get services -n -ojson ``` * The component `prefix` if set in Step #1 above. Otherwise, the default component name. A DNS name is made up of the `prefix` and the `domain` name. For example, `controlcenter.example.com`. For a tutorial scenario on configuring external access using load balancers, see the [quickstart tutorial for using load balancer](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/networking/external-access-load-balancer-deploy). ## Create a rolebinding 1. Create a ConfluentRoleBinding CR. The following is the structure of the CR: ```yaml kind: ConfluentRolebinding metadata: name: namespace: spec: principal: --- [1] type: --- [2] name: --- [3] role: --- [4] resourcePatterns: --- [5] - name: --- [6] resourceType: --- [7] patternType: --- [8] clustersScopeByIds: --- [9] kafkaClusterId: --- [9a] schemaRegistryClusterId: --- [9b] connectClusterId: --- [9c] ksqlClusterId: --- [9d] clustersScopeByRegistryName:--- [10] kafkaRestClassRef: --- [11] name: --- [12] namespace: --- [13] ``` * [1] Required. The identity of a user or group this rolebinding is created for. * [2] Required. The type of the principal. Set it to `user` or `group`. * [3] Required. The name of the principal. * [4] Required. Predefined role name. For the predefined roles you can use, refer to [Confluent RBAC Predefined Roles](https://docs.confluent.io/platform/current/security/rbac/rbac-predefined-roles.html). * [5] Optional. Qualified resources associated with this rolebinding. * [6] Required. The name of the resource associated with this rolebinding. This setting cannot be updated. When you update this resource name, a new rolebinding is created. * [7] Required. The type of resource the role binding is applied to. Valid options are `Topic`, `Group`, `Subject`, `KsqlCluster`, `Cluster`, and `TransactionalId`. For more information about the RBAC resource, see [Authorization using Role-Based Access Control](https://docs.confluent.io/platform/current/security/rbac/index.html#terminology). * [8] Optional. Specify whether the pattern of resource is `PREFIXED` or `LITERAL`. The default is `LITERAL` if not set. * [9] Optional. The scope of the cluster id. You can specify a cluster name ([10]) or one scope id among `kafkaClusterId`, `schemaRegistryClusterId`, `ksqlClusterId`, and `connectClusterId`. * [9a] Get the Kafka cluster ID using one of the following commands: ```bash confluent cluster describe --url https:// ``` ```bash curl -ik https:///v1/metadata/id ``` * [9b] Schema Registry cluster id is in the following pattern: `id__` * [9c] Connect cluster id is in the following pattern: `.` * [9d] ksqlDB cluster id is in the following pattern: `.` * [10] Optional. The cluster name registered in the [cluster registry](https://docs.confluent.io/platform/current/security/cluster-registry.html#cluster-registry-and-mds), which uniquely identifies the cluster for this rolebinding. * [11] Optional. The KafkaRestClass CR that defines configurations for the Confluent REST Class. If it is not configured, the default KafkaRestClass is used. * [12] Required under `kafkaRestClassRef` ([11]). The name of the KafkaRestClass CR. * [13] Optional. If omitted, the same namespace of this ConfluentRoleBinding CR is used. 2. 
Apply the ConfluentRolebinding CR:

```bash
kubectl apply -f 
```

The following example shows how a Confluent CLI command to create a role binding is translated to a ConfluentRolebinding CR:

```bash
confluent iam rbac role-binding create --principal User: \
  --role DeveloperRead --resource Subject:* \
  --kafka-cluster-id \
  --schema-registry-cluster-id 
```

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: ConfluentRolebinding
metadata:
  name: internal-schemaregistry-schema-validation
  namespace:
spec:
  principal:
    name:
    type: user
  clustersScopeByIds:
    schemaRegistryClusterId:
    kafkaClusterId:
  resourcePatterns:
    - name: "*"
      patternType: LITERAL
      resourceType: Subject
  role: DeveloperRead
```

### Migrate RBAC from using LDAP to using both LDAP and OAuth

This section describes the steps to upgrade a Confluent Platform deployment configured with LDAP-based RBAC to LDAP and OAuth-based RBAC.

To migrate your Confluent Platform deployment to use OAuth, the Confluent Platform version must be 7.7. Upgrading the Confluent Platform version and migrating to OAuth at the same time is not supported.

Even though this upgrade can be done in one step, as described in this section, we recommend the two-step migration (MDS first, then the rest of the components) to reduce failed restarts of components.

To migrate an existing Confluent Platform deployment from LDAP to LDAP and OAuth:

1. Upgrade the MDS with the required OAuth settings as described in [Enable RBAC for Kafka](#co-rbac-kafka) and apply the CR with the `kubectl apply` command.

   Following is a sample snippet of a Kafka CR with LDAP and OAuth:

   ```yaml
   kind: Kafka
   spec:
     services:
       mds:
         provider:
           ldap:
             address: ldaps://ldap.operator.svc.cluster.local:636
             authentication:
               type: simple
               simple:
                 secretRef: credential
             tls:
               enabled: true
             configurations:
           oauth:
             configurations:
     dependencies:
       kafkaRest:
         authentication:
           type: oauth
           jaasConfig:
             secretRef: oauth-secret
           oauthSettings:
   ```

2. After Kafka successfully restarts, upgrade the rest of the Confluent Platform components.

   1. Add the following annotation to the Schema Registry, Connect, and Control Center CRs:

      ```yaml
      kind:
      metadata:
        annotations:
          platform.confluent.io/disable-internal-rolebindings-creation: "true"
      ```

   2. Add the OAuth settings to the rest of the Confluent Platform components as described in [Enable RBAC for KRaft controller](#co-rbac-kraft) and [Enable RBAC for other Confluent Platform components](#co-rbac-cp) and apply the CRs with the `kubectl apply` command.

      The following are sample snippets of the relevant settings in the component CRs.

      ```yaml
      kind: KRaftController
      spec:
        dependencies:
          mdsKafkaCluster:
            bootstrapEndpoint:
            authentication:
              type: oauth
              jaasConfig:
                secretRef:
              oauthSettings:
                tokenEndpointUri:
      ```

      ```yaml
      kind: KafkaRestClass
      spec:
        kafkaRest:
          authentication:
            type: oauth
            oauth:
              secretRef:
              configuration:
      ```

      ```yaml
      kind: SchemaRegistry
      spec:
        dependencies:
          mds:
            authentication:
              type: oauth
              oauth:
                secretRef:
                configuration:
      ```

   3. If you have existing connectors, add the following to the Connect CR to avoid possible downtime.

      ```none
      kind: Connect
      spec:
        configOverrides:
          server:
            - producer.sasl.login.callback.handler.class=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginCallbackHandler
            - consumer.sasl.login.callback.handler.class=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginCallbackHandler
            - admin.sasl.login.callback.handler.class=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginCallbackHandler
      ```

3.
Log into Control Center and check if you can see Kafka, Schema Registry, and Connect.

4. [Create internal role binding](co-manage-rbac.md#co-create-rolebinding).

## REST Proxy

The REST Proxy should be used for any language that does not have native clients with serializers compatible with Schema Registry. It is a convenient, language-agnostic method for interacting with Kafka. Almost all standard libraries have good support for HTTP and JSON, so even if a wrapper of the API does not exist for your language, it should still be easy to use the API. It also automatically translates between Avro and JSON. This simplifies writing applications in languages that do not have good Avro support.

The [REST Proxy API Reference](../kafka-rest/api.md#kafkarest-api) describes the complete API in detail, but we will highlight some key interactions here.

First, you will want to produce data to Kafka. To do so, construct a `POST` request to the `/topics/{topicName}` resource including the schema for the data (plain integers in this example) and a list of records, optionally including the partition for each record.

```http
POST /topics/test HTTP/1.1
Host: kafkaproxy.example.com
Content-Type: application/vnd.kafka.avro.v1+json
Accept: application/vnd.kafka.v1+json, application/vnd.kafka+json, application/json

{
  "value_schema": "{\"name\":\"int\",\"type\": \"int\"}",
  "records": [
    { "value": 12 },
    { "value": 24, "partition": 1 }
  ]
}
```

Note that REST Proxy relies on content type information to properly convert data to Avro, so you *must* specify the `Content-Type` header.

The response includes the same information you would receive from the Java clients API about the partition and offset of the published data (or errors in case of failure). Additionally, it includes the schema IDs it registered or looked up in Schema Registry.

```http
HTTP/1.1 200 OK
Content-Type: application/vnd.kafka.v1+json

{
  "key_schema_id": null,
  "value_schema_id": 32,
  "offsets": [
    { "partition": 2, "offset": 103 },
    { "partition": 1, "offset": 104 }
  ]
}
```

In future requests, you can use this schema ID instead of the full schema, reducing the overhead for each request. You can also produce data to specific partitions using a similar request format with the `/topics/{topicName}/partitions/{partition}` endpoint.

To achieve good throughput, it is important to batch your produce requests so that each HTTP request contains many records. Depending on durability and latency requirements, this can be as simple as maintaining a queue of records and only sending a request when the queue has reached a certain size or a timeout is triggered.

Consuming data is a bit more complex because consumers are stateful. However, it still only requires two API calls to get started. See the [API Reference](../kafka-rest/api.md#kafkarest-api) for complete details and examples.

Finally, the API also provides metadata about the cluster, such as the set of brokers, the list of topics, and per-partition information. However, most applications will not need to use these endpoints.

Note that it is also possible to use [non-Java clients](https://cwiki.apache.org/confluence/display/KAFKA/Clients) developed by the community and manage registration and schema validation manually using the [Schema Registry API](../schema-registry/develop/api.md#schemaregistry-api). However, as this is error-prone and must be duplicated across every application, we recommend using the REST Proxy unless you need features that are not exposed via the REST Proxy.
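As a rough sketch of that consumer flow, the following commands use the v2 consumer API against a REST Proxy assumed to be listening on `localhost:8082`; the group name `avro_group` and instance name `ci1` are arbitrary placeholders, and the v2 API adds an explicit subscription step compared with the v1 produce example above. Refer to the API Reference for the authoritative request and response formats.

```bash
# Create a consumer instance in group "avro_group" that reads Avro data from the beginning
curl -X POST -H "Content-Type: application/vnd.kafka.v2+json" \
     --data '{"name": "ci1", "format": "avro", "auto.offset.reset": "earliest"}' \
     http://localhost:8082/consumers/avro_group

# Subscribe the instance to the "test" topic used in the produce example
curl -X POST -H "Content-Type: application/vnd.kafka.v2+json" \
     --data '{"topics": ["test"]}' \
     http://localhost:8082/consumers/avro_group/instances/ci1/subscription

# Fetch records; repeat this call to poll for more data
curl -X GET -H "Accept: application/vnd.kafka.avro.v2+json" \
     http://localhost:8082/consumers/avro_group/instances/ci1/records

# Delete the consumer instance when finished to free its resources on the proxy
curl -X DELETE -H "Content-Type: application/vnd.kafka.v2+json" \
     http://localhost:8082/consumers/avro_group/instances/ci1
```

The create call returns a `base_uri` for the new instance, which you can use for the subsequent instance-specific requests instead of constructing the URLs by hand.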
#### IMPORTANT

* If you do not specify a license, CMF will generate a trial license.
* You can use some Confluent Platform components with Confluent Cloud brokers with a valid license, including CMF version 2.1 and later, Flink version 2.0 and later, and Kafka source and sink connectors for Flink version 4.0 and later. For support with these self-managed Confluent Platform components, you must have a valid Confluent Enterprise License for Customer-managed Confluent Platform for Confluent Cloud subscription license. For more details, see [Confluent Platform Enterprise Licensing Requirements](../../installation/license.md#customer-managed-cp-cc-license).

1. (Optional) Store your Confluent license in a Kubernetes secret.

   ```none
   kubectl create secret generic --from-file=license.txt
   ```

2. (Optional) Store a CMF database encryption key in a Kubernetes secret.

   CMF stores sensitive data, such as secrets, in its internal database. The following instructions set up the encryption key for the CMF database. CMF has a `cmf.sql.production` property. When the property is set to `false`, encryption is disabled. Otherwise, an encryption key is required.

   ```none
   # Generate a 256-bit key (recommended for production)
   openssl rand -out cmf.key 32

   # Create a Kubernetes secret with the encryption key
   kubectl create secret generic \
     --from-file==cmf.key -n
   ```

   During the CMF installation, pass the following Helm parameters to use the encryption key:

   ```none
   --set encryption.key.kubernetesSecretName= \
   --set encryption.key.kubernetesSecretProperty=
   ```

   **Example**

   ```none
   openssl rand -out cmf.key 32

   kubectl create secret generic cmf-encryption-key \
     --from-file=encryption-key=cmf.key \
     -n confluent

   helm upgrade --install cmf --version "~2.1.0" \
     confluentinc/confluent-manager-for-apache-flink \
     --namespace confluent \
     --set encryption.key.kubernetesSecretName=cmf-encryption-key \
     --set encryption.key.kubernetesSecretProperty=encryption-key
   ```

   #### WARNING

   You must back up the encryption key; CMF does not keep a backup of it. If the key is lost, you will no longer be able to access the encrypted data stored in the database.

3. Install CMF using the default configuration:

   For deployment on OpenShift, you must also pass `--set podSecurity.securityContext.fsGroup=null --set podSecurity.securityContext.runAsUser=null` to the Helm command below.

   ```none
   helm upgrade --install cmf confluentinc/confluent-manager-for-apache-flink \
     --version "~2.1.0" \
     --namespace \
     --set license.secretRef= \
     --set cmf.sql.production=false # or pass --set encryption.key.kubernetesSecretName ...
   ```

   #### NOTE

   CMF will create a `PersistentVolumeClaim` (PVC) in Kubernetes. If the PVC remains in status `Pending`, check your Kubernetes cluster configuration and make sure a Container Storage Interface (CSI) driver is installed and configured correctly. Alternatively, if you want to run CMF without persistent storage, you can disable the PVC by setting the `persistence.create` property to `false`. Note that in this case, a restart of the CMF pod will lead to data loss.

4. Configure the Chart.

   Helm provides [several options](https://helm.sh/docs/intro/using_helm/#customizing-the-chart-before-installing) for setting and overriding values in a chart. For CMF, you should customize the chart by passing a values file with the `--values` flag. First, use Helm to show the default `values.yaml` file for CMF.
```bash helm inspect values confluentinc/confluent-manager-for-apache-flink --version "~2.1.0" ``` You should see output similar to the following: ```bash ## Image pull secret imagePullSecretRef: ## confluent-manager-for-apache-flink image image: repository: confluentinc name: cp-cmf pullPolicy: IfNotPresent tag: 1.0.1 ## CMF Pod Resources resources: limits: cpu: 2 memory: 1024Mi requests: cpu: 1 memory: 1024Mi ## Load license either from K8s secret license: ## ## The license secret reference name is injected through ## CONFLUENT_LICENSE environment variable. ## The expected key: license.txt. license.txt contains raw license data. ## Example: ## secretRef: confluent-license-for-cmf secretRef: "" ## Pod Security Context podSecurity: enabled: true securityContext: fsGroup: 1001 runAsUser: 1001 runAsNonRoot: true ## Persistence for CMF persistence: # if set to false, the database will be on the pod ephemeral storage, e.g. gone when the pod stops create: true dataVolumeCapacity: 10Gi ## storageClassName: # Without the storage class, the default storage class is used. ## Volumes to mount for the CMF pod. ## ## Example with a PVC. ## mountedVolumes: ## volumes: ## - name: custom-volume ## persistentVolumeClaim: ## claimName: pvc-test ## volumeMounts: ## - name: custom-volume ## mountPath: /mnt/ ## mountedVolumes: volumes: volumeMounts: ## Configure the CMF service for example Authn/Authz cmf: # authentication: # type: mtls ## Enable Kubernetes RBAC # When set to true, it will create a proper role/rolebinding or cluster/clusterrolebinding based on namespaced field. # If a user doesn't have permission to create role/rolebinding then they can disable rbac field and # create required resources out of band to be used by the Operator. In this case, follow the # templates/clusterrole.yaml and templates/clusterrolebiding.yaml to create proper required resources. rbac: true ## Creates a default service account for the CMF pod if service.account.create is set to true. # In order to use a custom service account, set the name field to the desired service account name and set create to false. # Also note that the new service account must have the necessary permissions to run the CMF pod, i.e cluster wide permissions. # The custom service account must have: # # rules: # - apiGroups: ["flink.apache.org"] # resources: ["flinkdeployments", "flinkdeployments/status"] # Needed to manage FlinkDeployments CRs # verbs: ["*"] # - apiGroups: [""] # resources: ["services"] # Read-only permissions needed for the flink UI # verbs: ["get", "list", "watch"] serviceAccount: create: true name: "" # The jvmArgs parameter allows you to specify custom Java Virtual Machine (JVM) arguments that will be passed to the application container. # This can be useful for tuning memory settings, garbage collection, and other JVM-specific options. # Example : # jvmArgs: "-Xms512m -Xmx1024m -XX:+UseG1GC" ``` Note the following about CMF default values: - CMF uses SQLite to store metadata about your deployments. The data is persisted on a persistent volume that is created during the installation via a `PersistentVolumeClaim` created by Helm. - The persistent volume is created with your Kubernetes cluster’s default storage class. Depending on your storage class, your metadata might not be retained if you uninstall CMF. For example, if your reclaim policy is `Delete`, data is not retained. **Make sure to backup the data in the persistent volume regularly**. 
- If you want to set your storage class, you can overwrite `persistence.storageClassName` during the installation. - By default, the chart uses the image hosted by [Confluent on DockerHub](https://hub.docker.com/r/confluentinc/cp-cmf). To specify your own registry, set the following configuration values: ```none image: repository: name: cp-cmf pullPolicy: IfNotPresent tag: ``` - By default, the chart creates a cluster role and [service account](https://kubernetes.io/docs/concepts/security/service-accounts/) that CMF can use to create and monitor Flink applications in all namespaces. If you want to keep your service account, you set the `serviceAccount.name` property during installation to the preferred service account. - To change the log level, for example to show debug logs, set `cmf.logging.level.root=debug`. ## Confluent Platform how-tos You have several options to get started with Confluent Platform and Kafka, depending on your use cases and goals. - [Quick Start for Confluent Platform](platform-quickstart.md#quickstart) - Provides a simple example that shows you how to run Confluent Platform using Docker on a single broker, single cluster development environment with topic replication factors set to `1`. - [Tutorial: Set Up a Multi-Broker Kafka Cluster](tutorial-multi-broker.md#basics-multi-broker-setup) - Provides an example of how to run a single cluster with multiple brokers. Describes how to configure and start a single controller, and as many brokers as you want to run in the cluster. * [Run multiple clusters](tutorial-multi-broker.md#basics-multi-cluster-setup)- Describes a multi-cluster deployment where you have a dedicated controller for each cluster, and a Kafka server properties file for each broker. * [Scripted Confluent Platform Demo](../tutorials/cp-demo/overview.md#scripted-demo) - Provides a scripted demo to build a full Confluent Platform deployment with [ksqlDB](../ksqldb/overview.md#ksql-home) and [Kafka Streams](../streams/overview.md#kafka-streams) for stream processing, and security end-to-end. # Configure Metadata Service (MDS) in Confluent Platform The Confluent Platform Metadata Service (MDS) manages a variety of metadata about your Confluent Platform installation. Specifically, the MDS: - Hosts the [cluster registry](../../security/cluster-registry.md#cluster-registry) that enables you to keep track of which clusters you have installed. - Serves as the system of record for cross-cluster authorization data (including [RBAC](../../security/authorization/rbac/overview.md#rbac-overview), and [centralized ACLs](../../security/authorization/rbac/authorization-acl-with-mds.md#authorization-acl-with-mds)), and can be used for token-based authentication. - Provides a convenient way to manage [audit log configurations](../../security/compliance/audit-logs/audit-logs-cli-config.md#audit-log-cli-config) across multiple clusters. - Can be used to authenticate data (note that client authentication is not supported). You can set up the MDS internally within a Kafka cluster that serves other functions, and manage permissions in the same way that a database stores permissions for users logging into the database itself. You can also use the MDS to store user data. For the Kafka cluster hosting MDS, you must configure MDS on each Kafka broker, and you should synchronize these configurations across nodes. You can also set up MDS on a dedicated Kafka cluster, servicing multiple worker Kafka clusters in such a way that security information is isolated away from client data. 
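As a rough illustration of what configuring MDS on each broker involves, the following is a minimal sketch of the broker-side listener settings only; the host name is a placeholder, the file path assumes a package install, and a real deployment also needs the authentication, token, and RBAC settings covered in the configuration tasks referenced later in this topic.

```bash
# Minimal sketch only: MDS listener settings appended to each broker's server.properties.
# broker-1.example.com is a placeholder; authentication and token settings are omitted.
cat >> /etc/kafka/server.properties <<'EOF'
# Where MDS accepts HTTP(S) requests on this broker (8090 is the default MDS port)
confluent.metadata.server.listeners=http://0.0.0.0:8090
# The address other components and clusters use to reach this MDS instance
confluent.metadata.server.advertised.listeners=http://broker-1.example.com:8090
EOF
```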
In the case of [role-based access control (RBAC)](../../security/authorization/rbac/overview.md#rbac-overview), the MDS offers a single, centralized configuration context that, after it is set up for a cluster, saves administrators from the complex and time-consuming task of defining and assigning roles for each resource on an individual basis. The MDS can enforce the rules for RBAC, centralized audit logs, centralized ACLs, and the cluster registry on its host Kafka cluster and across multiple secondary clusters (such as Kafka, Connect, and Schema Registry). So you can use a single Kafka cluster hosting MDS to manage and secure multiple secondary Kafka, Connect, Schema Registry, and ksqlDB clusters.

The MDS listens for commands using HTTP on the default port 8090. MDS maintains a local cache of authorization data that is persisted to an internal Kafka topic named `_confluent-metadata-auth`. Because MDS runs on a Kafka broker, you can optionally integrate it with LDAP to provide authentication and refreshable bearer tokens for impersonation. Note that impersonation is restricted to Confluent components. For details about configuring LDAP integration with RBAC, see [Configure LDAP Authentication](ldap-auth-mds.md#ldap-auth-mds).

This topic includes the following configuration tasks:

- [Configure a primary Kafka cluster to host the MDS and role binding](#config-primary-kafka-cluster-mds)
- [Configure a secondary Kafka cluster managed by the MDS of the primary Kafka cluster](#config-secondary-kafka-cluster-managed-by-primary-mds-cluster)

# Configure Confluent Platform Components to Communicate with MDS over TLS

This topic describes the Kafka client configuration for Confluent Platform components to communicate with MDS over TLS. These files are found under your Confluent Platform installation directory in the following locations:

| Component                | Properties file to update                                  |
|--------------------------|------------------------------------------------------------|
| Schema Registry          | `/etc/schema-registry/schema-registry.properties`          |
| ksqlDB                   | `/etc/ksqldb/ksql-server.properties`                       |
| Connect                  | `/etc/kafka/connect-distributed.properties`                |
| Confluent Control Center | `/etc/confluent-control-center/control-center.properties`  |
| REST Proxy               | `/etc/kafka-rest/kafka-rest.properties`                    |

Specify the following Kafka client configuration for your component. Any content in brackets (`<>`) must be customized for your environment.

```text
confluent.metadata.bootstrap.server.urls=https://:8090,https://:8090,...
confluent.metadata.http.auth.credentials.provider=BASIC
confluent.metadata.basic.auth.user.info=:
confluent.metadata.ssl.truststore.location=
confluent.metadata.ssl.truststore.password=
confluent.metadata.ssl.keystore.location=
confluent.metadata.ssl.keystore.password=
confluent.metadata.ssl.key.password=
confluent.metadata.ssl.endpoint.identification.algorithm=HTTPS
```

See also:

- [HTTPS Configuration Options](mds-configuration.md#https-configs-for-ssl)
- [Metadata Service Configuration Settings](mds-configuration.md#mds-configuration-options)
- [Use TLS Authentication in Confluent Platform](../../security/authentication/mutual-tls/overview.md#kafka-ssl-authentication)

## General

`id` : Unique ID for the Confluent REST Proxy server instance. This is used in generating unique IDs for consumers that do not specify their ID. The ID is empty by default, which makes a single server setup easier to get up and running, but is not safe for multi-server deployments where automatic consumer IDs are used.
* Type: string * Default: “” * Importance: high `bootstrap.servers` : A list of Kafka brokers to connect to. For example, `PLAINTEXT://hostname:9092,SSL://hostname2:9092`. This configuration is particularly important when Kafka security is enabled, because Kafka may expose multiple endpoints that will be stored as metadata, but REST Proxy may need to be configured with just one of those endpoints. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. Because these servers are used only for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down). `listeners` : Comma-separated list of listeners that listen for API requests over either HTTP or HTTPS. If a listener uses HTTPS, the appropriate TLS configuration parameters need to be set as well. * Type: list * Default: `http://0.0.0.0:8082` * Importance: high `schema.registry.url` : The base URL for Schema Registry that should be used by the serializer. * Type: string * Default: `http://localhost:8081` * Importance: high #### NOTE The configuration property `auto.register.schemas` is not supported for Kafka REST Proxy. `consumer.request.max.bytes` : Maximum number of bytes in unencoded message keys and values returned by a single request. This can be used by administrators to limit the memory used by a single consumer and to control the memory usage required to decode responses on clients that cannot perform a streaming decode. Note that the actual payload will be larger due to overhead from base64 encoding the response data and from JSON encoding the entire response. * Type: long * Default: 67108864 * Importance: medium `consumer.threads` : The maximum number of threads to run consumer requests on. Note that this must be greater than the maximum number of consumers in a single consumer group. The sentinel value of -1 allows the number of threads to grow as needed to fulfill active consumer requests. Inactive threads will ultimately be stopped and cleaned up. * Type: int * Default: 50 * Importance: medium `consumer.request.timeout.ms` : The maximum total time to wait for messages for a request if the maximum number of messages has not yet been reached. * Type: int * Default: 1000 * Importance: medium `host.name` : The host name used to generate absolute URLs in responses. If empty, the default canonical hostname is used. * Type: string * Default: “” * Importance: medium `access.control.allow.methods` : Set value to Jetty Access-Control-Allow-Origin header for specified methods. * Type: string * Default: “” * Importance: low `access.control.allow.origin` : Set value for Jetty Access-Control-Allow-Origin header. You may use `*` for any origin, or you can specify multiple origins separated by commas. * Type: string * Default: “” * Importance: low `response.http.headers.config` : Use to select which HTTP headers are returned in the HTTP response for Confluent Platform components. Specify multiple values in a comma-separated string using the format `[action][header name]:[header value]` where `[action]` is one of the following: `set`, `add`, `setDate`, or `addDate`. You must use quotation marks around the header value when the header value contains commas. 
For example: ```none response.http.headers.config="add Cache-Control: no-cache, no-store, must-revalidate", add X-XSS-Protection: 1; mode=block, add Strict-Transport-Security: max-age=31536000; includeSubDomains, add X-Content-Type-Options: nosniff ``` * Type: string * Default: “” * Importance: low `reject.options.request` : Boolean indicating whether or not to reject the OPTIONS method request to REST services. By default, sending a request with the OPTIONS method to all REST services from Confluent Platform REST Proxy, Confluent Control Center REST endpoint, and so on, returns the list of available methods on the specified endpoint. For example: `curl -X OPTIONS http://localhost:8083`. When `reject.options.request` is set to `true`, requests with `-X OPTIONS` are rejected and available methods are not returned. Setting `reject.options.request` to `true` protects API endpoints that are not specifically used by applications, which reduces the attack surface. * Type: boolean * Default: false * Importance: low `consumer.instance.timeout.ms` : Amount of idle time before a consumer instance is automatically destroyed. * Type: int * Default: 300000 * Importance: low `consumer.iterator.backoff.ms` : Amount of time to backoff when an iterator runs out of data. If a consumer has a dedicated worker thread, this is effectively the maximum error value for the entire request timeout. It should be small enough to closely target the timeout, but large enough to avoid busy waiting. * Type: int * Default: 50 * Importance: low `fetch.min.bytes` : Minimum number of bytes in message keys and values returned by a single request before the timeout of `consumer.request.timeout.ms` passes. The special sentinel value of -1 disables this functionality. * Type: int * Default: -1 * Importance: medium `consumer.iterator.timeout.ms` : Timeout for blocking consumer iterator operations. This should be set to a small enough value that it is possible to effectively peek() on the iterator. * Type: int * Default: 1 * Importance: low `debug` : Boolean indicating whether extra debugging information is generated in some error response entities. * Type: boolean * Default: false * Importance: low `idle.timeout.ms` : The number of milliseconds before an idle connection times out. * Type: long * Default: 30000 * Importance: low `metric.reporters` : A list of classes to use as metrics reporters. Implementing the `MetricReporter` interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics. * Type: list * Default: [] * Importance: low `metrics.jmx.prefix` : Prefix to apply to metric names for the default JMX reporter. * Type: string * Default: `kafka.rest` * Importance: low `metrics.num.samples` : The number of samples maintained to compute metrics. * Type: int * Default: 2 * Importance: low `metrics.sample.window.ms` : The metrics system maintains a configurable number of samples over a fixed window size. This configuration controls the size of the window. For example, you might maintain two samples each measured over a 30 second period. When a window expires, you erase and overwrite the oldest window. * Type: long * Default: 30000 * Importance: low `port` : DEPRECATED: port to listen on for new connections. Use `listeners` instead. * Type: int * Default: 8082 * Importance: low `request.logger.name` : Name of the SLF4J logger to write the NCSA Common Log Format request log. 
* Type: string
* Default: `io.confluent.rest-utils.request`
* Importance: low

`response.mediatype.default` : The default response media type that should be used if no specific types are requested in an Accept header.

* Type: string
* Default: `application/json`
* Importance: low

`response.mediatype.preferred` : An ordered list of the server’s preferred media types used for responses, from most preferred to least.

* Type: list
* Default: [application/json, application/vnd.kafka.v2+json]
* Importance: low

`shutdown.graceful.ms` : Amount of time to wait after a shutdown request for outstanding requests to complete.

* Type: int
* Default: 1000
* Importance: low

`kafka.rest.resource.extension.class` : A list of classes to use as RestResourceExtension. Implementing the `RestResourceExtension` interface allows you to inject user-defined resources, such as filters, into REST Proxy. Typically used to add custom capabilities like logging, security, and so on.

* Type: list
* Default: “”
* Importance: low

`advertised.listeners` : List of advertised listeners. This configuration is used to generate absolute URLs in V3 responses. The HTTP and HTTPS protocols are supported. Each listener must include the protocol, hostname, and port. For example: `http://myhost:8080` and `https://0.0.0.0:8081`.

* Type: list
* Default: “”
* Importance: low

`confluent.resource.name.authority` : The authority to which governance of the name space defined by the remainder of the CRN is delegated. This is used when generating Confluent resource names. Examples: `confluent.cloud` and `mds-01.example.com`.

* Type: string
* Default: “”
* Importance: low

## Configure ksqlDB for Secured Confluent Schema Registry

You can configure ksqlDB to connect to Schema Registry over HTTPS by setting the `ksql.schema.registry.url` to the HTTPS endpoint of Schema Registry. Depending on your security setup, you might also need to supply additional SSL configuration. For example, a trustStore is required if the Schema Registry SSL certificates aren’t trusted by the JVM by default. A keyStore is required if Schema Registry requires mutual authentication.

You can configure SSL for communication with Schema Registry by using non-prefixed names, like `ssl.truststore.location`, or prefixed names like `ksql.schema.registry.ssl.truststore.location`. Non-prefixed names are used for settings that are shared with other communication channels, where the same settings are required to configure SSL communication with both Kafka and Schema Registry. Prefixed names affect communication with Schema Registry only and override any non-prefixed settings of the same name.
Use the following to configure ksqlDB for communication with Schema Registry over HTTPS, where mutual authentication isn’t required and Schema Registry SSL certificates are trusted by the JVM:

```properties
ksql.schema.registry.url=https://:
```

Use the following settings to configure ksqlDB for communication with Schema Registry over HTTPS, with mutual authentication, with an explicit trustStore, and where the SSL configuration is shared between Kafka and Schema Registry:

```properties
ksql.schema.registry.url=https://:
ksql.schema.registry.ssl.truststore.location=/etc/kafka/secrets/ksql.truststore.jks
ksql.schema.registry.ssl.truststore.password=
ksql.schema.registry.ssl.keystore.location=/etc/kafka/secrets/ksql.keystore.jks
ksql.schema.registry.ssl.keystore.password=
ksql.schema.registry.ssl.key.password=
```

Use the following settings to configure ksqlDB for communication with Schema Registry over HTTPS, without mutual authentication and with an explicit trustStore. These settings explicitly configure only ksqlDB to Schema Registry SSL communication.

```properties
ksql.schema.registry.url=https://:
ksql.schema.registry.ssl.truststore.location=/etc/kafka/secrets/sr.truststore.jks
ksql.schema.registry.ssl.truststore.password=
```

The exact settings will vary depending on the encryption and authentication mechanisms Schema Registry is using, and how your SSL certificates are signed.

You can pass authentication settings to the Schema Registry client used by ksqlDB by adding the following to your ksqlDB Server config.

```properties
ksql.schema.registry.basic.auth.credentials.source=USER_INFO
ksql.schema.registry.basic.auth.user.info=username:password
```

For more information, see [Schema Registry Security Overview](../../../schema-registry/security/index.md#schemaregistry-security).

### Create a Confluent Platform to Confluent Cloud link

Set up the cluster link that mirrors data from Confluent Platform to Confluent Cloud. This is a **source initiated link**, meaning that its connection will come from Confluent Platform and go to Confluent Cloud. As such, you won’t have to open your on-premises firewall. To create this source initiated link, you must create both halves of the cluster link: the first half on Confluent Cloud, the second half on Confluent Platform.

1. Create a cluster link on the Confluent Cloud cluster.

   1. Create a link configuration file `$CONFLUENT_CONFIG/clusterlink-hybrid-dst.config` with the following entries:

      ```bash
      link.mode=DESTINATION
      connection.mode=INBOUND
      ```

      The combination of the configurations `link.mode=DESTINATION` and `connection.mode=INBOUND` tells the cluster link that it is the Destination half of a source initiated cluster link. These two configurations must be used together.

      #### NOTE

      - This tutorial example is based on the assumption that there is only one listener. If you configure multiple listeners (for example, INTERNAL, REPLICATION, and EXTERNAL) and want to switch to a different listener than the default, you must add one more parameter to the configuration: `local.listener.name=EXTERNAL`.
To learn more, see the Confluent Platform documentation on [Configuration Options](/platform/current/multi-dc-deployments/cluster-linking/configs.html#configuration-options) and [Understanding Listeners in Cluster Linking](/platform/current/multi-dc-deployments/cluster-linking/configs.html#understanding-listeners-in-cluster-linking) - If you want to add any configurations to your cluster link (such as consumer offset sync or auto-create mirror topics) `clusterlink-hybrid-dst.config` is the file where you would add them. Cluster link configurations are always set on the Destination cluster link (not the Source cluster link). 2. Create the destination cluster link on Confluent Cloud. ```bash confluent kafka link create from-on-prem-link --cluster $CC_CLUSTER_ID \ --source-cluster $CP_CLUSTER_ID \ --config-file $CONFLUENT_CONFIG/clusterlink-hybrid-dst.config ``` The output from this command should indicate that the link was created. ```bash Created cluster link "from-on-prem-link". ``` 2. Create security credentials for the cluster link on Confluent Platform. This security credential will be used to read topic data and metadata from the source cluster. ```bash kafka-configs --bootstrap-server localhost:9092 --alter --add-config \ 'SCRAM-SHA-512=[iterations=8192,password=1LINK2RUL3TH3MALL]' \ --entity-type users --entity-name cp-to-cloud-link \ --command-config $CONFLUENT_CONFIG/CP-command.config ``` Your output should resemble: ```bash Completed updating config for user cp-to-cloud-link. ``` 3. Create a link configuration file `$CONFLUENT_CONFIG/clusterlink-CP-src.config` for the source cluster link on Confluent Platform with the following entries: ```bash link.mode=SOURCE connection.mode=OUTBOUND bootstrap.servers= ssl.endpoint.identification.algorithm=https security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='' password=''; local.listener.name=SASL_PLAINTEXT local.security.protocol=SASL_PLAINTEXT local.sasl.mechanism=SCRAM-SHA-512 local.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="cp-to-cloud-link" password="1LINK2RUL3TH3MALL"; ``` - The combination of configurations `link.mode=SOURCE` and `connection.mode=OUTBOUND` tell the cluster link that it is the source-half of a source initiated cluster link. These configurations must be used together. - The middle section tells the cluster link the `bootstrap.servers` of the Confluent Cloud destination cluster for it to reach out to, and the authentication credentials to use. Cluster Linking to Confluent Cloud uses TLS and SASL_PLAIN. This is needed so that the Confluent Cloud cluster knows to accept the incoming request. The Confluent Cloud bootstrap server is shown as the Endpoint in the output for `confluent kafka cluster describe $CC_CLUSTER_ID` , or in cluster settings on the Confluent Cloud console. If you use the Endpoint from the CLI output, remove the protocol prefix. For example, if the endpoint shows as `SASL_SSL://pkc-r2ymk.us-east-1.aws.confluent.cloud:9092`, your entry in `$CONFLUENT_CONFIG/clusterlink-CP-src.config` should be `bootstrap.servers=pkc-r2ymk.us-east-1.aws.confluent.cloud:9092`. - The last section, where lines are prefixed with `local`, contains the security credentials to use with the source cluster (Confluent Platform) to read data. - Note that the authentication mechanisms and security protocols for Confluent Platform map to what is defined in the [broker](#cluster-link-hybrid-config). 
Those for Confluent Cloud map to what will be defined in a file called `clusterlink-cloud-to-CP.config` in a subsequent step. To learn more about the authentication and security protocols used, see [Configure SASL/SCRAM authentication for Confluent Platform](/platform/current/kafka/authentication_sasl/authentication_sasl_scram.html), and the [JAAS](/platform/current/security/authentication/sasl/scram/overview.html#jaas) section in particular.

4. Create the source cluster link on Confluent Platform, using the following command, specifying the configuration file from the previous step.

   ```bash
   kafka-cluster-links --bootstrap-server localhost:9092 \
     --create --link from-on-prem-link \
     --config-file $CONFLUENT_CONFIG/clusterlink-CP-src.config \
     --cluster-id $CC_CLUSTER_ID --command-config $CONFLUENT_CONFIG/CP-command.config
   ```

   Your output should resemble:

   ```bash
   Cluster link 'from-on-prem-link' creation successfully completed.
   ```

#### NOTE

- **When using Schema Linking:** To use a mirror topic that has a schema with Confluent Cloud ksqlDB, broker-side schema ID validation, or the topic viewer, make sure that [Schema Linking](https://docs.confluent.io/cloud/current/sr/schema-linking.html) puts the schema in the default context of the Confluent Cloud Schema Registry. To learn more, see [How Schemas work with Mirror Topics](https://docs.confluent.io/cloud/current/multi-cloud/cluster-linking/mirror-topics-cc.html#how-schemas-work-with-mirror-topics).
- Before running the first command in the steps below, make sure that you are still logged in to Confluent Cloud and have the appropriate environment and cluster selected. To list and select these resources, use the commands `confluent environment list`, `confluent environment use`, `confluent kafka cluster list`, and `confluent kafka cluster use`. A selected environment or cluster is indicated by an asterisk next to it in the output of list commands. The commands won’t work properly if no resources are selected (or if the wrong ones are selected).

Perform the following tasks logged in to Confluent Cloud.

1. Create a mirror topic. The following command establishes a mirror of the original `from-on-prem` topic, using the cluster link `from-on-prem-link`.

   ```bash
   confluent kafka mirror create from-on-prem --link from-on-prem-link
   ```

   The command output will be:

   ```bash
   Created mirror topic "from-on-prem".
   ```

   - The mirror topic name must match the original topic name. To learn more, see all [Known Limitations](https://docs.confluent.io/platform/current/multi-dc-deployments/cluster-linking/index.html#cluster-linking-limitations).
   - A mirror topic must specify the link to its source topic at creation time. This ensures that the mirror topic is a clean slate, with no conflicting data or metadata.

2. List the mirror topics on the link.

   ```bash
   confluent kafka mirror list --cluster $CC_CLUSTER_ID
   ```

   Your output will resemble:

   ```bash
       Link Name      | Mirror Topic Name | Num Partition | Max Per Partition Mirror Lag | Source Topic Name | Mirror Status | Status Time Ms
   +-------------------+-------------------+---------------+------------------------------+-------------------+---------------+----------------+
     from-on-prem-link | from-on-prem      |             1 |                            0 | from-on-prem      | ACTIVE        |  1633640214250
   ```

3. Consume from the mirror topic on the destination cluster to verify it. Still on Confluent Cloud, run a consumer against the mirror topic to consume the messages you originally produced to the Confluent Platform topic in previous steps.
```bash confluent kafka topic consume from-on-prem --from-beginning ``` Your output should be: ```bash 1 2 3 4 5 ``` #### NOTE If when you attempt to run the consumer you get an error indicating “no API key selected for resource”, run this command to specify the `` for the Confluent Cloud destination cluster, then re-run the consumer command: `confluent api-key use --resource $CC_CLUSTER_ID`, or follow the instructions on the CLI provided with the error messages. ## Demos and Examples After completing the [Replicator quick start](replicator-quickstart.md#replicator-quickstart), explore these hands-on working examples of Replicator in multi-datacenter deployments, for which you can download the demo from GitHub and run yourself. Refer to the diagram below to determine the Replicator examples that correspond to your deployment scenario. ![image](multi-dc-deployments/replicator/images/replicator-demos.png) 1. Kafka on-premises to Kafka on-premises - [Example: Replicate Data in an Active-Active Multi-DataCenter Deployment on Confluent Platform](replicator-docker-tutorial.md#replicator): fully-automated example of an active-active multi-datacenter design with two instances of Replicator copying data bidirectionally between the datacenters - [Schema translation](replicator-schema-translation.md#quickstart-demos-replicator-schema-translation): showcases the transfer of schemas stored in Schema Registry from one cluster to another using Replicator - [Confluent Platform demo](../../tutorials/cp-demo/index.md#cp-demo): deploy a Kafka streaming ETL, along with Replicator to replicate data 2. Kafka on-premises to Confluent Cloud - [Hybrid On-premises and Confluent Cloud](../../tutorials/cp-demo/index.md#cp-demo): on-premises Kafka cluster and Confluent Cloud cluster, and data copied between them with Replicator - [Connect Cluster Backed to Destination](/cloud/current/get-started/examples/ccloud/docs/replicator-to-cloud-configuration-types.html): Replicator configuration with Kafka Connect backed to destination cluster - [On-premises to Cloud with Connect Backed to Origin](/cloud/current/get-started/examples/ccloud/docs/replicator-to-cloud-configuration-types.html#onprem-cloud-origin): Replicator configuration with Kafka Connect backed to origin cluster 3. 
Confluent Cloud to Confluent Cloud - [Cloud to Cloud with Connect Backed to Destination](/cloud/current/get-started/examples/ccloud/docs/replicator-to-cloud-configuration-types.html#cloud-cloud-destination): Replicator configuration with Kafka Connect backed to destination cluster - [Cloud to Cloud with Connect Backed to Origin](/cloud/current/get-started/examples/ccloud/docs/replicator-to-cloud-configuration-types.html#cloud-cloud-origin): Replicator configuration with Kafka Connect backed to origin cluster - [Migrate Topics on Confluent Cloud Clusters](/cloud/current/clusters/migrate-topics-on-cloud-clusters.html): migrate topics from the origin Confluent Cloud cluster to the destination Confluent Cloud cluster ### Confluent Cloud To run the producer on Confluent Cloud: ```bash ./bin/kafka-avro-console-producer \ --topic test \ --bootstrap-server ${BOOTSTRAP_SERVER} \ --producer.config config.properties \ --property schema.registry.url=${SR_URL} \ --property basic.auth.credentials.source=USER_INFO \ --property basic.auth.user.info=${SR_API_KEY}:${SR_API_SECRET} \ --property value.schema='{"type":"record","name":"myrecord","fields": [{"name":"f1","type":"string"}]}' \ --property value.rule.set='{ "domainRules": [{ "name": "checkLen", "kind": "CONDITION", "type": "CEL", "mode": "WRITE", "expr": "size(message.f1) < 10", "onFailure": "ERROR"}]}' {"f1": "success"} {"f1": "this will fail"} ``` where `config.properties` contains: ```bash bootstrap.servers={{ BOOTSTRAP_SERVER }} security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='{{ CLUSTER_API_KEY }}' password='{{ CLUSTER_API_SECRET }}'; sasl.mechanism=PLAIN client.dns.lookup=use_all_dns_ips session.timeout.ms=45000 acks=all ``` In the above example for `config.properties`, the following best practices and requirements are implemented: - `bootstrap.servers`, security protocols, and credentials are required for the Apache Kafka® producer, consumer, and admin. - `client.dns.lookup` value is required for Kafka clients prior to 2.6. - `session.timeout.ms` being included is a best practice for higher availability in Kafka clients prior to 3.0. - `acks=all` specifies that the producer requires all in-sync replicas to acknowledge receipt of messages (records), and is a best practice configuration on the producer to prevent data loss. ### Confluent Cloud To run the consumer on Confluent Cloud: ```bash ./bin/kafka-avro-console-consumer \ --topic test \ --bootstrap-server ${BOOTSTRAP_SERVER} \ --consumer.config config.properties \ --property schema.registry.url=${SR_URL} \ --property basic.auth.credentials.source=USER_INFO \ --property basic.auth.user.info=${SR_API_KEY}:${SR_API_SECRET} ``` where `config.properties` contains: ```bash bootstrap.servers={{ BOOTSTRAP_SERVER }} security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='{{ CLUSTER_API_KEY }}' password='{{ CLUSTER_API_SECRET }}'; sasl.mechanism=PLAIN client.dns.lookup=use_all_dns_ips session.timeout.ms=45000 acks=all ``` In the above example for `config.properties`, the following best practices and requirements are implemented: - `bootstrap.servers`, security protocols, and credentials are required for the Kafka producer, consumer, and admin. - `client.dns.lookup` value is required for Kafka clients prior to 2.6. - `session.timeout.ms` being included is a best practice for higher availability in Kafka clients prior to 3.0. 
- `acks=all` specifies that the producer requires all in-sync replicas to acknowledge receipt of messages (records), and is a best practice configuration on the producer to prevent data loss. ## Quick Start This Quick Start describes how to configure Schema Registry for Role-Based Access Control to manage user and application authorization to topics and subjects (schemas), including how to: - Configure Schema Registry to start and connect to the RBAC-enabled Apache Kafka® cluster (edit `schema-registry.properties` [use the Confluent CLI to create roles](../../security/authorization/rbac/rbac-cli-quickstart.md#rbac-cli-quickstart)) - Use the Confluent CLI to grant a SecurityAdmin role to the Schema Registry service principal. - Use the Confluent CLI to grant a ResourceOwner role to the Schema Registry service principal on the internal topic and group (used to coordinate across the Schema Registry cluster). - Use the Confluent CLI to grant users access to topics (and associated subjects in Schema Registry). The examples assume a local install of Schema Registry and shared RBAC and MDS configuration. Your production environment may differ (for example, Confluent Cloud or remote Schema Registry). If you were to use a local Kafka, ZooKeeper, and bootstrap server as might be the case for testing, these would also need authorization through RBAC, requiring additional prerequisite setup and credentials. ### .NET Example for UAMI Configure your .NET client with the following UAMI-specific properties. ```c# private const string azureIMDSQueryParams = "api-version=&resource=&client_id="; private const string kafkaLogicalCluster = "your-logical-cluster"; private const string identityPoolId = "your-identity-pool-id"; public static async Task Main(string[] args) { if (args.Length != 3) { Console.WriteLine("Usage: .. brokerList schemaRegistryUrl"); return; } var bootstrapServers = args[1]; var schemaRegistryUrl = args[2]; var topicName = Guid.NewGuid().ToString(); var groupId = Guid.NewGuid().ToString(); var commonConfig = new ClientConfig { BootstrapServers = bootstrapServers, SecurityProtocol = SecurityProtocol.SaslPlaintext, SaslMechanism = SaslMechanism.OAuthBearer, SaslOauthbearerMethod = SaslOauthbearerMethod.Oidc, SaslOauthbearerMetadataAuthenticationType = SaslOauthbearerMetadataAuthenticationType.AzureIMDS, SaslOauthbearerConfig = $"query={azureIMDSQueryParams}", SaslOauthbearerExtensions = $"logicalCluster={kafkaLogicalCluster},identityPoolId={identityPoolId}" }; var consumerConfig = new ConsumerConfig { BootstrapServers = bootstrapServers, SecurityProtocol = SecurityProtocol.SaslPlaintext, SaslMechanism = SaslMechanism.OAuthBearer, SaslOauthbearerMethod = SaslOauthbearerMethod.Oidc, GroupId = groupId, AutoOffsetReset = AutoOffsetReset.Earliest, EnableAutoOffsetStore = false }; // pass the config values to the Producer's builder using (var producer = new ProducerBuilder(commonConfig) ``` ## Steps to migrate LDAP to OAuth in a Confluent Platform cluster To ensure a smooth transition, review and complete the following process. 1. Understand the current LDAP RBAC configuration. Review your existing LDAP RBAC configuration to understand how roles and permissions are configured. 2. Configure the Metadata Service (MDS) to support OAuth. To modify the Metadata Service (MDS) to enable OAuth support, you need to: * Update the `confluent.metadata.server.user.store` property to `LDAP_WITH_OAUTH` for a hybrid approach during the migration phase and with `OAUTH` once all clients are migrated to OAuth. 
   * Configure the necessary OAuth endpoints and ensure that the Metadata Service (MDS) can validate OAuth tokens.

   **Example configuration of Metadata Service (MDS) for OAuth support**

   Similar to your Confluent Server broker configurations, the following settings are required to enable identity provider (IdP)-issued OAuth token validation in MDS. For details on these configurations, see configurations for supporting identity provider tokens in MDS.

   ```none
   confluent.metadata.server.user.store=LDAP_WITH_OAUTH
   confluent.metadata.server.oauthbearer.jwks.endpoint.url=
   confluent.metadata.server.oauthbearer.expected.issuer=
   confluent.metadata.server.oauthbearer.expected.audience=
   confluent.metadata.server.oauthbearer.sub.claim.name=sub # optional
   confluent.metadata.server.oauthbearer.groups.claim.name=groups # optional
   ```

   For Kafka Java clients supporting SASL OAUTHBEARER, allow specific IdP endpoints by setting the following configuration property:

   ```properties
   org.apache.kafka.sasl.oauthbearer.allowed.urls=,,...
   ```

   This property specifies a comma-separated list of allowed IdP JWKS (JSON Web Key Set) and token endpoint URLs. Use \* (asterisk) as the value to allow any endpoint.

   ```properties
   org.apache.kafka.sasl.oauthbearer.allowed.urls=*
   ```

   You should consult the specific Kafka client and IdP documentation for the exact interpretation and security implications of such a broad setting.

   Java applications should set this property as a JVM system property when launching the application:

   ```bash
   -Dorg.apache.kafka.sasl.oauthbearer.allowed.urls=,,...
   ```

   Other clients built on librdkafka (for example, Python, Go, and .NET) use different property names and configuration mechanisms, so refer to the specific client library documentation for the equivalent OAUTHBEARER configuration properties.

3. Configure your OIDC identity provider to issue OAuth tokens.

   - Set up an OIDC-compliant identity provider (IdP), such as Okta, Keycloak, or another provider.
   - Ensure that your identity provider is configured to issue tokens that Metadata Service (MDS) can validate.

4. Update your client configurations. Update the configurations of your clients (for example, producers and consumers) to use OAuth for authentication; if you use OAuth for platform service-to-service authentication, also update the client configurations for Confluent Server brokers, Schema Registry, REST Proxy, and Connect. This involves setting the appropriate OAuth properties in the client configuration files. For details, see [Configure Clients for SASL/OAUTHBEARER authentication in Confluent Platform](../../authentication/sasl/oauthbearer/configure-clients.md#configure-sasl-oauthbearer-clients).

   **Example of client configuration for OAuth authentication**

   To use OAuth authentication with a Confluent Platform cluster, you must configure Kafka clients with the following properties, replacing the placeholders with your actual values:

   ```none
   sasl.mechanism=OAUTHBEARER
   security.protocol=SASL_SSL
   sasl.login.callback.handler.class=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginCallbackHandler
   sasl.login.connect.timeout.ms=15000 # optional
   sasl.oauthbearer.token.endpoint.url=
   sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
     clientId="" \
     clientSecret="" \
     scope=""; # optional
   ```

5. Test the configuration. Thoroughly test the new OAuth configuration in a staging environment to ensure that authentication and authorization work as expected.

6. Monitor and validate.
- Monitor your Confluent Platform cluster after migration to ensure that there are no issues with authentication or authorization. - Validate that all users, clients, and services have the correct permissions. # Security in Confluent Platform * [Overview](overview.md) * [Deployment Profiles](deployment-profiles.md) * [Compliance](compliance/index.md) * [Overview](compliance/overview.md) * [Audit Logs](compliance/audit-logs/index.md) * [Manage Secrets](compliance/secrets/index.md) * [Authenticate](authentication/index.md) * [Overview](authentication/overview.md) * [Mutual TLS](authentication/mutual-tls/index.md) * [OAuth/OIDC](authentication/oauth-oidc/index.md) * [Multi-Protocol Authentication](authentication/multi-protocol/index.md) * [REST Proxy](authentication/rest-proxy/index.md) * [SSO for Confluent Control Center](authentication/sso-for-c3/index.md) * [HTTP Basic Authentication](authentication/http-basic-auth/index.md) * [SASL](authentication/sasl/index.md) * [LDAP](authentication/ldap/index.md) * [Delegation Tokens](authentication/delegation-tokens/index.md) * [Authorize](authorization/index.md) * [Overview](authorization/overview.md) * [Access Control Lists](authorization/acls/index.md) * [Role-Based Access Control](authorization/rbac/index.md) * [LDAP Group-Based Authorization](authorization/ldap/index.md) * [Protect Data](protect-data/index.md) * [Overview](protect-data/overview.md) * [Protect Data in Motion with TLS Encryption](protect-data/encrypt-tls.md) * [Protect Sensitive Data Using Client-side Field Level Encryption](protect-data/csfle/index.md) * [Redact Confluent Logs](protect-data/log-redaction.md) * [Configure Security Properties using Prefixes](../kafka/security_prefixes.md) * [Secure Components](component/index.md) * [Overview](component/overview.md) * [Schema Registry](component/nav-sr-security.md) * [Kafka Connect](component/connect-redirect.md) * [KRaft Security](component/kraft-security.md) * [ksqlDB RBAC](authorization/rbac/ksql-rbac.md) * [REST Proxy](component/nav-rest-proxy-security.md) * [Enable Security for a Cluster](security_tutorial.md) * [Add Security to Running Clusters](incremental-security-upgrade.md) * [Configure Confluent Server Authorizer](csa-introduction.md) * [Security Management Tools](sec-manage-tools.md) * [Ansible Playbooks for Confluent Platform](https://docs.confluent.io/ansible/current/overview.html) * [Deploy Secure Confluent Platform Docker Images](../installation/docker/security.md) * [Cluster Registry](cluster-registry.md) * [Encrypt using Client-Side Payload Encryption](encrypt/cspe.md) ## Configure TLS encryption for Kafka clients The new Producer and Consumer clients support security for Kafka versions 0.9.0 and higher. If you are using the Kafka Streams API, you can read on how to configure equivalent [SSL](/platform/current/clients/javadocs/javadoc/org/apache/kafka/common/config/SslConfigs.html) and [SASL](/platform/current/clients/javadocs/javadoc/org/apache/kafka/common/config/SaslConfigs.html) parameters. If client authentication is not required by the Confluent Server broker, the following is a minimal configuration example that you can store in a client properties file `client-ssl.properties`. Because this configuration stores passwords directly in the Kafka client configuration file, it is important to restrict access to these files via file system permissions. 
```bash
bootstrap.servers=kafka1:9093
security.protocol=SSL
ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks
ssl.truststore.password=test1234
```

If the broker requires TLS client authentication, the client must provide a keystore as well. You can read about the additional configurations required in [mTLS authentication](../authentication/mutual-tls/overview.md#kafka-ssl-authentication).

Here are examples using the Kafka tools `kafka-console-producer` and `kafka-console-consumer` to pass in the `client-ssl.properties` file with the properties specified above:

```bash
kafka-console-producer --bootstrap-server kafka1:9093 \
  --topic test \
  --producer.config client-ssl.properties

kafka-console-consumer --bootstrap-server kafka1:9093 \
  --topic test \
  --consumer.config client-ssl.properties \
  --from-beginning
```

## Configure TLS encryption for Connect workers

This section describes how to enable security for Kafka Connect. Securing Kafka Connect requires that you configure security for:

1. Kafka Connect workers: part of the Kafka Connect API, a worker is essentially an advanced client under the covers
2. Kafka Connect connectors: connectors may have embedded producers or consumers, so you must override the default configurations for Connect producers used with source connectors and Connect consumers used with sink connectors
3. Kafka Connect REST: Kafka Connect exposes a REST API that can be configured to use TLS/SSL using [additional properties](#encryption-ssl-rest)

Configure security for Kafka Connect as described in the section below. Additionally, if you are using Confluent Control Center streams monitoring for Kafka Connect, configure security for the monitoring interceptors as well.

Configure the top-level settings in the Connect workers to use TLS by adding these properties in `connect-distributed.properties`. These top-level settings are used by the Connect worker for group coordination and to read and write to the internal topics that are used to track the cluster’s state (for example, configurations and offsets).

```bash
bootstrap.servers=kafka1:9093
security.protocol=SSL
ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks
ssl.truststore.password=test1234
```

Connect workers manage the producers used by source connectors and the consumers used by sink connectors. So, for the connectors to leverage security, you also have to override the default producer or consumer configuration that the worker uses. Depending on whether the connector is a source or sink connector:

* For source connectors, configure the same properties, but add the `producer` prefix.

  ```bash
  producer.bootstrap.servers=kafka1:9093
  producer.security.protocol=SSL
  producer.ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks
  producer.ssl.truststore.password=test1234
  ```

* For sink connectors, configure the same properties, but add the `consumer` prefix.

  ```bash
  consumer.bootstrap.servers=kafka1:9093
  consumer.security.protocol=SSL
  consumer.ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks
  consumer.ssl.truststore.password=test1234
  ```

## Mirror Data to Confluent Cloud with Cluster Linking

In this section, you will create a source-initiated cluster link to mirror the topic `wikipedia.parsed` from Confluent Platform to Confluent Cloud. For security reasons, most on-premises datacenters don’t allow inbound connections, so Confluent recommends source-initiated cluster linking to easily and securely mirror Kafka topics from your on-premises cluster to Confluent Cloud.

1.
Verify that you’re still using the `ccloud` CLI context. ```none confluent context list ``` 2. Give the cp-demo service account the `CloudClusterAdmin` role in Confluent Cloud to authorize it to create cluster links and mirror topics in Confluent Cloud. ```shell confluent iam rbac role-binding create \ --principal User:$SERVICE_ACCOUNT_ID \ --role CloudClusterAdmin \ --cloud-cluster $CCLOUD_CLUSTER_ID --environment $CC_ENV ``` Verify that the role-binding was created. The output should show the role has been created. ```shell confluent iam rbac role-binding list \ --principal User:$SERVICE_ACCOUNT_ID \ -o json | jq ``` 3. Inspect the file `scripts/ccloud/cluster-link-ccloud.properties` ```none # This is the Confluent Cloud half of the cluster link # Confluent Cloud dedicated cluster is the destination link.mode=DESTINATION # Link connection comes in from Confluent Platform so you don't have to open your on-prem firewall connection.mode=INBOUND ``` 4. Create the Confluent Cloud half of the cluster link with the name **cp-cc-cluster-link**. ```shell confluent kafka link create cp-cc-cluster-link \ --cluster $CCLOUD_CLUSTER_ID \ --source-cluster $CP_CLUSTER_ID \ --config-file ./scripts/ccloud/cluster-link-ccloud.properties ``` 5. Inspect the file `scripts/ccloud/cluster-link-cp-example.properties` and read the comments to understand what each property does. ```none # Configuration for the Confluent Platform half of the cluster link # Copy the contents of this file to cluster-link-cp.properties and # add your Confluent Cloud credentials # *****DO NOT***** add cluster-link-cp.properties to version control # with your Confluent Cloud credentials # Confluent Platform is the source cluster link.mode=SOURCE # The link is initiated at the source so you don't have to open your firewall connection.mode=OUTBOUND # Authenticate to Confluent Cloud bootstrap.servers= ssl.endpoint.identification.algorithm=https security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username='' \ password=''; # We are using the CP's SASL OAUTHBEARER token listener local.listener.name=TOKEN local.sasl.mechanism=OAUTHBEARER local.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler local.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \ username="connectorSA" \ password="connectorSA" \ metadataServerUrls="https://kafka1:8091,https://kafka2:8092"; ``` 6. Run the following command to copy the file to `scripts/ccloud/cluster-link-cp.properties` with credentials and bootstrap endpoint for your own Confluent Cloud cluster. ```shell sed -e "s||${CCLOUD_CLUSTER_API_KEY}|g" \ -e "s||${CCLOUD_CLUSTER_API_SECRET}|g" \ -e "s||${CC_BOOTSTRAP_ENDPOINT}|g" \ scripts/ccloud/cluster-link-cp-example.properties > scripts/ccloud/cluster-link-cp.properties ``` 7. Next, use the `cp` CLI context to log into Confluent Platform. To create a cluster link, the CLI user must have `ClusterAdmin` privileges. For simplicity, we are continuing to use a super user instead of a `ClusterAdmin`. ```shell confluent context use cp ``` 8. The cluster link itself needs the `DeveloperRead` and `DeveloperManage` roles for any topics it plans to mirror, as well as the `ClusterAdmin` role for the Kafka cluster. Our cluster link uses the `connectorSA` principal, which already has `ResourceOwner` permissions on the `wikipedia.parsed` topic, so we just need to add the `ClusterAdmin` role. 
```shell confluent iam rbac role-binding create \ --principal User:connectorSA \ --role ClusterAdmin \ --kafka-cluster $CP_CLUSTER_ID ``` 9. Create the Confluent Platform half of the cluster link, still called **cp-cc-cluster-link**. ```shell confluent kafka link create cp-cc-cluster-link \ --destination-bootstrap-server $CC_BOOTSTRAP_ENDPOINT \ --destination-cluster $CCLOUD_CLUSTER_ID \ --config ./scripts/ccloud/cluster-link-cp.properties \ --url https://localhost:8091/kafka \ --certificate-authority-path scripts/security/snakeoil-ca-1.crt ``` 10. Switch contexts back to “ccloud” and create the mirror topic for `wikipedia.parsed` in Confluent Cloud. ```shell confluent context use ccloud \ && confluent kafka mirror create wikipedia.parsed --link cp-cc-cluster-link ``` 11. Consume records from the mirror topic using the schema context “cp-demo”. Press `Ctrl+C` to stop the consumer when you are ready. ```shell confluent kafka topic consume \ --api-key $CCLOUD_CLUSTER_API_KEY \ --api-secret $CCLOUD_CLUSTER_API_SECRET \ --schema-registry-endpoint $CC_SR_ENDPOINT/contexts/:.cp-demo: \ --schema-registry-api-key $SR_API_KEY \ --schema-registry-api-secret $SR_API_SECRET \ --value-format avro \ wikipedia.parsed | jq ``` You successfully created a source-initiated cluster link to seamlessly move data from on-premises to cloud in real time. Cluster linking opens up real-time hybrid cloud, multi-cloud, and disaster recovery use cases. See the [Cluster Linking documentation](https://docs.confluent.io/cloud/current/multi-cloud/overview.html) for more information. ### Breaking Changes - Remove `confluent schema-registry cluster [delete | enable | upgrade]` and `confluent schema-registry region list` commands - Remove `confluent context create` command - Remove the configuration and partition-replica lists from `confluent kafka topic describe` for on-premises; these lists are now available through new on-premises `confluent kafka topic configuration list` and `confluent kafka replica list` commands - Remove the configuration and partition-replica lists from `confluent local kafka topic describe`; topic configurations are available through a new `confluent local kafka topic configuration list` command - Rename `confluent schema-registry exporter get-config` to `confluent schema-registry exporter configuration describe` - Rename `confluent schema-registry exporter get-status` to `confluent schema-registry exporter status describe` - Rename `confluent schema-registry compatibility validate` to `confluent schema-registry schema compatibility validate` - Rename `confluent schema-registry config` to `confluent schema-registry configuration` - Rename `confluent kafka topic describe` to `confluent kafka topic configuration list` for Confluent Cloud - Rename `confluent kafka replica list` to `confluent kafka replica status list` - Rename `confluent kafka broker describe` to `confluent kafka broker configuration list` - Rename `confluent kafka broker update` to `confluent kafka broker configuration update` - Rename `confluent local kafka broker describe` to `confluent local kafka broker configuration list` - Rename `confluent local kafka broker update` to `confluent local kafka broker configuration update` - Rename `confluent price list` to `confluent billing price list` - Rename `confluent admin [payment | promo]` subcommands to `confluent billing [payment | promo]` subcommands - Rename `confluent kafka broker get-tasks` to `confluent kafka broker task list` and remove the `--all` flag; this functionality is now 
implicit when no broker ID is provided - Remove the `--all` flag from `confluent kafka broker describe`; this functionality has been moved to a new on-premises `confluent kafka cluster configuration list` command - Remove the `--all` flag from `confluent kafka broker update`; this functionality has been moved to a new on-premises `confluent kafka cluster configuration update` command - Remove deprecated `--api-key` and `--api-secret` flags from all `confluent schema-registry` commands - Remove the `--context` flag from `confluent environment use`, `confluent flink region use`, `confluent service-account use`, and `confluent kafka cluster use` - Remove the `--environment` from `confluent flink region use` and `confluent kafka cluster use` - Replace the `--schema` flag for `confluent schema-registry schema compatibility validate` with a required argument - Replace the `--name` flag for `confluent kafka quota create` with a required argument - Replace the `--name` flag for `confluent schema-registry kek create` with a required argument - Rename `--organization-id` to `--organization` for `confluent login` - Rename `--group-id` to `--group` for `confluent asyncapi export` - Rename `--kms-key-id` to `--kms-key` for `confluent schema-registry kek create` - Rename `--deleted` to `--all` for `confluent schema-registry subject describe` and `confluent schema-registry subject list` - Rename `--aws-account-id` to `--aws-account` for `confluent stream-share consumer redeem` - Rename `--azure-subscription-id` to `--azure-subscription` for `confluent stream-share consumer redeem` - Rename `--gcp-project-id` to `--gcp-project` for `confluent stream-share consumer redeem` - Rename `--config-name` to `--config` for `confluent kafka broker describe` and `confluent local kafka broker describe` - Rename `--provider` to `--cloud` for `confluent byok` commands - Rename `--ca-location` and `--ca-cert-path` to `--certificate-authority-path` for all commands which use these flags - The `--subject` flag is now required for `confluent schema-registry schema compatibility validate` - The `--type` flag is now required for `confluent schema-registry schema compatibility validate` for Confluent Cloud - The `--config` flag is now required for `confluent kafka topic update` - The `--passphrase` and `--passphrase-new` flags are now required for `confluent secret file rotate` and no longer accept pipes or files - The `--passphrase` flag is now required for `confluent secret master-key generate` and no longer accepts pipes or files - The `--config` flag for `confluent secret file add`, `confluent secret file remove`, and `confluent secret file update` no longer accepts pipes or files - The broker ID is now a required argument for `confluent kafka broker list` and `confluent kafka broker update` - The API key and secret are now required arguments for `confluent api-key store` - Remove “Cloud Name” (human) and “cloud_name” (serialized) from the output of `confluent kafka region list` - Remove “Read-Only” (human) and “read_only” (serialized) from the output of `confluent configuration` commands - Rename “Name” to “ID” (human) and “name” to “id” (serialized) in the output of `confluent plugin search`; a new “Name” (human) and “name” (serialized) field has been added in its place - Rename “Kafka” to “Kafka Cluster” (human) and “kafka” to “kafka_cluster” (serialized) in the output of `confluent ksql cluster` commands - Rename “Schema Registry Secret” to “Schema Registry API Secret” (human) and “schema_registry_secret” to 
“schema_registry_api_secret” (serialized) in the output of `confluent stream-share consumer redeem` - Rename “Resource Display Name” to “Resource Name” (human) and “resource_display_name” to “resource_name” (serialized) in the output of `confluent billing cost list` - Rename “Provider” to “Cloud” (human) and “provider” to “cloud” (serialized) in the output of `confluent kafka cluster describe` - Rename “Service Provider” to “Cloud” (human) and “service_provider” to “cloud” (serialized) in the output of `confluent kafka cluster list` - Rename “Service Provider Region” to “Region” (human) and “service_provider_region” to “region” (serialized) in the output of `confluent kafka cluster list` - Rename “Schema ID” to “ID” (human) and “schema_id” to “id” (serialized) in the output of `schema-registry schema list` - Rename “Region Name” to “Name” (human) and “region_name” to “name” (serialized) in the output of “confluent kafka region list” - Rename “Region ID” to “Region” (human) and “region_id” to “region” (serialized) in the output of “confluent kafka region list” - Rename “Cloud ID” to “Cloud” (human) and “cloud_id” to “cloud” (serialized) in the output of “confluent kafka region list” - Rename “Resource ID” and “Environment ID” to “Resource” and “Environment” (human) and “resource_id” and “environment_id” to “resource” and “environment” (serialized) in the output of `confluent billing cost list` - Rename “Broker ID” to “Broker” (human) and “broker_id” to “broker” (serialized) in the output of `confluent broker task list` - Rename “Partition ID”, “Cluster ID” and “Leader ID” to “ID”, “Cluster” and “Leader” (human) and “partition_id”, “cluster_id” and “leader_id” to “id”, “cluster” and “leader” (serialized) in the output of `confluent kafka partition [describe | list]` - Rename “Private Link Attachment ID” to “Private Link Attachment” (human) and “private_link_attachment_id” to “private_link_attachment” (serialized) in the output of `confluent network private-link attachment connection` commands - Rename “Task ID” to “Task” (human) and “task_id” to “task” (serialized) in the output of `confluent connect cluster describe` - Rename “Plugin ID” and “Version ID” to “ID” and “Version” (human) and “plugin_id” and “version_id” to “plugin” and “version” (serialized) in the output of `confluent flink artifact` commands - Rename “Partition ID” to “Partition” (human) and “partition_id” to “partition” (serialized) in the output of `confluent kafka partition reassignment list` - Rename “ingress” and “egress” to “ingress_limit” and “egress_limit” in the serialized output of `confluent kafka cluster` commands - Rename “kafka_cluster_id” to “kafka_cluster” in the serialized output of `confluent iam acl` commands - Rename “cluster_id” to “cluster” in the serialized output of `confluent broker task list` - Rename “cluster_id” and “consumer_group_id” to “cluster” and “consumer_group” in the serialized output of `confluent kafka consumer group [describe | list]` - Rename “cluster_id”, “consumer_group_id”, “consumer_id”, “instance_id”, “client_id”, and “partition_id” to “cluster”, “consumer_group”, “consumer”, “instance”, “client”, and “partition” in the serialized output of `confluent kafka consumer group lag [describe | list]` - Rename “owner_id” and “resource_id” to “owner” and “resource” in the serialized output of `confluent api-key [describe | list]` - Rename “cluster_id”, “environment_id”, and “service_account_id” to “cluster”, “environment”, and “service_account” in the serialized output of `confluent 
audit-log describe` - Rename “cluster_id”, “environment_id”, and “service_account_id” to “cluster”, “environment”, and “service_account” in the serialized output of `confluent connect event describe` - Rename “source_cluster_id”, “destination_cluster_id”, and “remote_cluster_id” to “source_cluster”, “destination_cluster”, and “remote_cluster” in the serialized output of `confluent kafka link [describe | list]` - Rename “cluster_id”, “consumer_group_id”, “max_lag_consumer_id”, “max_lag_instance_id”, “max_lag_client_id”, and “max_lag_partition_id” to “cluster”, “consumer_group”, “max_lag_consumer”, “max_lag_instance”, “max_lag_client”, and “max_lag_partition” in the serialized output of `confluent kafka consumer group lag summarize` - Rename “cluster_id” to “cluster” in the serialized output of `confluent kafka partition [describe | list]` - Rename “cluster_id”, “partition_id”, and “broker_id” to “cluster”, “partition”, and “broker” in the serialized output of `confluent kafka replica list` - Rename “cluster_id” to “cluster” in the serialized output of `confluent schema-registry cluster describe` - Rename “cluster_id” to “cluster” in the serialized output of `confluent kafka partition reassignment list` - Rename “environment_id” to “environment” in the serialized output of `confluent network` commands - Rename “plugin_name” and “plugin_id” to “name” and “id” in the serialized output of `confluent plugin list` - Rename “consumer_group_id”, “consumer_id”, “instance_id”, and “client_id” to “consumer_group”, “consumer”, “instance”, and “client” in the serialized output of `confluent kafka consumer list` - The field “Network Zonal Subdomains” (human) and “network_zonal_subdomains” (serialized) in the output of `confluent stream-share consumer redeem` and `confluent stream-share consumer share describe` is now a map - The field “subtask_statuses” in the serialized output of `confluent kafka broker task list` is now a map - The field “config” in the serialized output of `confluent schema-registry exporter describe` is now a map - The field “kms_properties” in the serialized output of `confluent schema-registry kek` commands is now a map - The field `principals` in the serialized output of `confluent kafka quota` commands is now an array - The field “network_zones” in the serialized output of `confluent stream-share consumer redeem` and `confluent stream-share consumer share describe` is now an array - The field “Error Trace” (human) and “error_trace” (serialized) in the output of `confluent schema-registry exporter status describe` is now omitted when it is empty - The field “topic_count” in the serialized output of `confluent kafka cluster describe` is now omitted when it is empty - Remove unused “disable_updates”, “anonymous_id”, “no_browser”, and “ver” configuration fields - Rename the Windows-only configuration field “update_plugins_once” to “update_plugins_once_windows” - Legacy on-premises contexts are no longer supported; the Certificate Authority path must now be provided by flag or environment variable - The following deprecated environment variables are no longer supported: “CCLOUD_EMAIL”, “CCLOUD_PASSWORD”, “CONFLUENT_USERNAME”, “CONFLUENT_PASSWORD”, “CONFLUENT_MDS_URL”, and “CONFLUENT_CA_CERT_PATH” - Rename the `CONFLUENT_PLATFORM_CA_CERT_PATH` environment variable to `CONFLUENT_PLATFORM_CERTIFICATE_AUTHORITY_PATH` - `confluent logout` now revokes the refresh token when logging out of Confluent Cloud - Saved credentials will no longer be read from the `.netrc` file - CLI text 
highlighting is now enabled by default for new users - All confirmation prompts for resource `delete` and `undelete` commands are now yes/no prompts - The `confluent login` command will no longer automatically log in using saved credentials in the keychain or configuration file - On-premises login with `confluent login` will now print the confirmation code to the terminal and ask the user to confirm before opening a browser ## Quick Start In this quick start guide, the AMPS Source connector is used to consume messages from an SOW topic called `Orders` on AMPS that has Kerberos authentication enabled. It then sends these messages as records to a Kafka topic named `AMPS_Orders` with headers being forwarded from the AMPS messages. For an example of how to get Kafka Connect connected to [Confluent Cloud](/cloud/current/index.html), see [Connect Self-Managed Kafka Connect to Confluent Cloud](/cloud/current/cp-component/connect-cloud-config.html#distributed-cluster). **Prerequisites:** - [Confluent Platform](/platform/current/installation/installing_cp/index.html) is installed and services are running by using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) commands. #### NOTE This quick start assumes that you are using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) commands, but [standalone installations](/platform/current/installation/installing_cp/index.html) are also supported. By default ZooKeeper, Apache Kafka®, Schema Registry, Kafka Connect REST API, and Kafka Connect are started with the `confluent local start` command. Note that as of Confluent Platform 7.5, ZooKeeper is deprecated for new deployments. Confluent recommends KRaft mode for new deployments. - Kafka and Schema Registry are running locally on the default ports. #### SEE ALSO For a more detailed Docker-based example of the Confluent Elasticsearch Connector, refer to [Confluent Platform Demo (cp-demo)](/platform/current/tutorials/cp-demo/docs/index.html#cp-demo). You can deploy a Kafka streaming ETL, including Elasticsearch, using ksqlDB for stream processing. ### Requirements and considerations * Confluent Platform 8.0 or later with the CSFLE Add-On enabled. * CSFLE in Confluent Platform is supported only in non-shared mode. * Schema Registry supports CSFLE mainly for DEK permissions checks. * Client performs all key management service (KMS) interactions, including encryption and decryption. * CSFLE supports Java clients for producing/consuming encrypted messages. * CSFLE is not available in Confluent CLI, Connect, ksqlDB, Flink, or non-Java clients at this time. * CSFLE is not integrated with Control Center. * Only `string` and `byte` type Avro fields are supported for CSFLE tagging and encryption. * The *kafka-avro-console-producer* and *kafka-avro-console-consumer* tools work with CSFLE. * Supported KMS types include local KEK, AWS KMS (Amazon Web Services), and HashiCorp Vault.
For the full list, see [Supported KMS types](https://docs.confluent.io/platform/current/security/protect-data/csfle/overview.html#supported-kms-types). * The CSFLE API is protected using Confluent Platform Role-Based Access Control (RBAC). ## Features The following are summaries of the main, notable features of CFK. Cloud Native Declarative API : * Declarative Kubernetes-native API approach to configure, deploy, and manage Confluent Platform components (namely Apache Kafka®, Connect workers, ksqlDB, Schema Registry, Confluent Control Center, Confluent REST Proxy) and application resources (such as topics, rolebindings) through Infrastructure as Code (IaC). * Provides built-in automation for cloud-native security best practices: * Complete granular RBAC, authentication and TLS network encryption * Auto-generated certificates * Support for credential management systems, such as HashiCorp Vault, to inject sensitive configurations in memory to Confluent deployments * Provides server properties, JVM, Log4j, and Log4j 2 configuration overrides for customization of all Confluent Platform components. Upgrades : * Provides automated rolling updates for configuration changes. * Provides automated rolling upgrades with no impact to Kafka availability. Scaling : * Provides single command, automated scaling and reliability checks of Confluent Platform. Resiliency : * Restores a Kafka pod with the same Kafka broker ID, configuration, and persistent storage volumes if a failure occurs. * Provides automated rack awareness to spread replicas of a partition across different racks (or zones), improving availability of Kafka brokers and limiting the risk of data loss. Scheduling : * Supports Kubernetes labels and annotations to provide useful context to DevOps teams and ecosystem tooling. * Supports Kubernetes tolerations and pod/node affinity for efficient resource utilization and pod placement. Monitoring : * Supports metrics aggregation using JMX/Jolokia. * Supports aggregated metrics export to Prometheus. # Build Streaming Applications on Confluent Platform You can use Apache Kafka® clients to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner, even those related to network problems or machine failures. The Kafka client library provides functions, classes, and utilities that you can use to create Kafka [producer](../_glossary.md#term-producer) clients and [consumer](../_glossary.md#term-consumer) clients using your choice of programming languages. The primary way to build production-ready producers and consumers is by using a programming language and a Kafka client library. The official Confluent supported clients are: * Java: The official Java client library supports the producer, consumer, Streams, and Connect APIs. * [librdkafka](https://docs.confluent.io/platform/current/clients/librdkafka/html/md_INTRODUCTION.html): The librdkafka and the following derived clients libraries only support the admin, producer, and consumer APIs. * C/C++ * Python * Go * .NET * JavaScript When you use the official Confluent-supported clients, you get the same enterprise-level support that you get with the rest of Confluent Platform: * The release cycle for Confluent-provided clients follow the Confluent release cycle, as opposed to the Kafka release cycle. 
* Confluent Platform maintenance fixes are provided for the 2-3 years (2 years with the Standard Support and 3 years with the Platinum Support) after the initial release of a minor version. Additional open-source and community-developed Kafka client libraries are available for other programming languages. Some of these include Scala, Ruby, Rust, PHP, and Elixir. The core APIs in the Kafka client library are: * Producer API: This API provides classes and methods for creating and sending messages to Kafka topics. It allows developers to specify message payloads, keys, and metadata and to control message delivery and acknowledgment. * Consumer API: This API provides classes and methods for consuming messages from Kafka topics. It allows developers to subscribe to one or more topics, receive messages in batches or individually, and process messages using custom logic. * Streams API: This API provides a high-level abstraction for building real-time data processing applications that consume, transform, and produce data streams from Kafka topics. * Connector API: This API provides a framework for building connectors that can transfer data between Kafka topics and external data systems, such as databases, message queues, and cloud storage services. * Admin API: This API provides functions for managing Kafka topics, partitions, and configurations. It allows developers to create, delete, and update topics and retrieve metadata about Kafka clusters and brokers. In addition to these core APIs, the Kafka client library includes various tools and utilities for configuring and monitoring Kafka clients and clusters, handling errors and exceptions and optimizing client performance and scalability. ## Related content After getting started with your deployment, you may want check out the following Kafka Connect documentation: * Course: [Kafka Connect 101](https://developer.confluent.io/learn-kafka/kafka-connect/) * Course: [Building data pipelines with Apache Kafka](https://developer.confluent.io/learn-kafka/data-pipelines/intro/) * Tutorial: [Moving Data In and Out of Kafka](/platform/current/connect/quickstart.html) * [Kafka Connect Logging](/platform/current/connect/logging.html) * [Upgrade Kafka Connect](/platform/current/installation/upgrade.html) * [Kafka Connect Security](/platform/current/connect/security.html) * [Kafka Connect REST Interface](/platform/current/connect/references/restapi.html) * [Using Kafka Connect with Schema Registry](/platform/current/schema-registry/connect.html) * [Upgrading a Connector Plugin](upgrade.md#connect-upgrading-plugin) * [Override the Worker Configuration](/platform/current/connect/references/allconfigs.html#override-the-worker-configuration) * [Adding Connectors or Software (Docker)](extending.md#connect-adding-connectors-to-images) Also, check out Confluent’s [end-to-end demos](https://github.com/confluentinc/examples/) for Kafka Connect on-premises, Confluent Cloud, and Confluent for Kubernetes. ### Kafka capabilities Confluent Platform provides all of Kafka’s open-source features plus additional proprietary components. Following is a summary of Kafka features. For an overview of Kafka use cases, features and terminology, see [Kafka Introduction](/kafka/introduction.html). - At the core of Kafka is the [Kafka broker](../_glossary.md#term-Kafka-broker). A broker stores data in a durable way from clients in one or more topics that can be consumed by one or more clients. 
Kafka also provides several [command-line tools](../tools/cli-reference.md#cp-all-cli) that enable you to start and stop Kafka, create topics and more. - Kafka provides security features such as [data encryption](../security/protect-data/encrypt-tls.md#kafka-ssl-encryption) between producers and consumers and brokers using SSL / TLS. [Authentication](../security/authentication/overview.md#authentication-overview) using SSL or SASL and authorization using ACLs. These security features are disabled by default. - Additionally, Kafka provides the following [Java APIs](/kafka/kafka-apis.html). - The Producer API that enables an application to send messages to Kafka. To learn more, see [Producer](../clients/producer.md#kafka-producer). - The Consumer API that enables an application to subscribe to one or more topics and process the stream of records produced to them. To learn more, see [Consumer](../clients/consumer.md#kafka-consumer). - [Kafka Connect](../connect/index.md#kafka-connect), a component that you can use to stream data between Kafka and other data systems in a scalable and reliable way. It makes it simple to configure connectors to move data into and out of Kafka. Kafka Connect can ingest entire databases or collect metrics from all your application servers into Kafka topics, making the data available for stream processing. Connectors can also deliver data from Kafka topics into secondary indexes like Elasticsearch or into batch systems such as Hadoop for offline analysis. - The [Streams API](../streams/introduction.md#streams-intro) that enables applications to act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming the input streams to output streams. It has a very low barrier to entry, easy operationalization, and a high-level DSL for writing stream processing applications. As such it is the most convenient yet scalable option to process and analyze data that is backed by Kafka. - The [Admin API](/kafka/kafka-apis.html#admin-client-api) that provides the capability to create, inspect, delete, and manage topics, brokers, ACLs, and other Kafka objects. To learn more, see [Confluent REST Proxy for Apache Kafka](../kafka-rest/index.md#kafkarest-intro), which leverages the Admin API. ### Ansible Playbooks for Confluent Platform Ansible Playbooks for Confluent Platform (Confluent Ansible) provides you a simple way to configure and deploy Confluent Platform on a traditional VM or bare metal infrastructure. For more information, see [Ansible documentation](https://docs.confluent.io/ansible/current/overview.html) . For version compatibility among Confluent Ansible, Confluent Platform, Ansible, and Python, see [Ansible Requirements](https://docs.confluent.io/ansible/current/ansible-requirements.html). The following table summarizes the Confluent Platform features supported with Ansible Playbooks for Confluent Platform. 
| Confluent Platform 8.1 Feature | Availability in Ansible Playbooks for Confluent Platform 8.1 |
|--------------------------------------------|----------------------------------------------------------------|
| Kafka Broker | Available |
| Schema Registry | Available |
| REST Proxy | Available |
| ksqlDB | Available |
| Connect | Available |
| Control Center | Available [1] |
| Replicator | Available [2] |
| Security: Authentication | Available |
| Security: Role-based Access Control (RBAC) | Available |
| Security: Network Encryption | Available |
| Structured Audit Logs | Available |
| MDS-based Access Control Lists (ACLs) | Available [3] |
| Secrets Protection | Available |
| Schema Validation | Available |
| FIPS | Available |
| Multi-region Clusters | Available |
| Tiered Storage | Available |
| Self-Balancing Clusters | Available |
| Auto Data Balancer | Not available |
| Confluent REST API | Available |
| Health+ | Available |
| Cluster Registry | Available |
| Cluster Linking | Available |

- [1] Confluent Control Center requires a separate installation. See [Installation](/control-center/current/installation/overview.html).
- [2] Cannot have RBAC enabled on the source or target cluster.
- [3] Only available for new installations. Does not support centrally managing ACLs across multiple Kafka clusters.

You can manually configure the features marked as *Not available* outside of the scope of Ansible Playbooks for Confluent Platform. If you take the hybrid installation approach, refer to the appropriate installation document in [Confluent documentation](index.md#installation-overview) to ensure your install path of mixing Ansible installation and manual installation is supported.

## Task status metrics

`kafka.server:type=cluster-link-metrics,name=link-task-count,link-name={linkName},task-name={taskName},state={state},reason={reason},mode={mode},connection-mode={connection_mode}`
: Monitor the state of link level tasks. For example, monitor if consumer offset syncing is working. If the task is in error, a reason code is provided. You can set up alerts to trigger if errors occur.

**Available tags:**

- `task-name`: The specific task being monitored. Possible values:
  - `consumer-offset-sync`: Consumer offset synchronization task
  - `acl-sync`: ACL synchronization task
  - `auto-create-mirror`: Automatic mirror topic creation task
  - `topic-configs-sync`: Topic configuration synchronization task
  - `clear-mirror-start-offsets`: Clear mirror start offsets task
  - `pause-mirror-topics`: Pause mirror topics task
  - `check-availability`: Availability check task
  - `state-aggregator`: State aggregation task
  - `retry-task`: Retry task for failed operations
  - `periodic-partition-scheduler`: Periodic partition scheduler task
  - `degraded-partition-monitor`: Degraded partition monitor task
- `state`: The current state of the task. Possible values:
  - `active`: Task is currently running
  - `in-error`: Task has encountered an error
- `reason`: Error code when the task is in error state.
  Common values include:

  - `no-error`: No error (when state is active)
  - `authentication`: Authentication errors with link credentials [Link cannot authenticate to remote cluster]
  - `broker-authentication`: Authentication errors with broker credentials [Authentication issues between link and destination broker]
  - `authorization`: Authorization errors with link credentials [Link lacks permissions on remote cluster]
  - `broker-authorization`: Authorization errors with broker credentials [Authorization issues between link and destination broker]
  - `misconfiguration`: Configuration errors [Invalid or missing link configuration]
  - `internal`: Internal/unexpected errors [Unexpected system errors]
  - `suppressed-errors`: Errors that are being suppressed [Errors being handled gracefully]
  - `consumer-group-in-use`: Consumer group is active on destination [Cannot sync offsets while consumers are active]
  - `remote-link-not-found`: Remote link not found (bidirectional links) [Remote side of bidirectional link missing]
  - `security-disabled`: Remote cluster has no authorizer configured [Remote cluster lacks security configuration]
  - `acl-limit-exceeded`: ACL limit reached on destination cluster [ACL quota exceeded on destination]
  - `invalid-request`: Invalid request error [Malformed or invalid API request]
  - `topic-exists`: Topic already exists on destination [Cannot create mirror topic - already exists]
  - `policy-violation`: Policy violation error [Mirror topic creation violates policies]
  - `invalid-topic`: Invalid topic error [Topic name or configuration invalid]
  - `unknown-topic-or-partition`: Topic or partition not found [Source topic/partition doesn’t exist]

`kafka.server:type=cluster-link-metrics,name=mirror-transition-in-error,link-name={linkName},state={state},reason={reason},mode={mode},connection-mode={connection_mode}`
: Monitor mirror topic state transition errors. For example, if a mirror topic encounters errors during the promotion process; that is, while its state is `pending_stopped` and it is being transitioned to stopped.

**Available tags:**

- `state`: The mirror topic transition state. Common values include:
  - `pending_stopped`: Mirror topic is being stopped
  - `pending_mirror`: Topic is being converted to a mirror
  - `pending_synchronization`: Mirror topic is being reversed/swapped
  - `pending_restore`: Mirror topic is being prepared for restore
  - `pending_setup_for_restore`: Mirror topic is being prepared for restore setup
  - `failed`: Mirror topic repair operations
- `reason`: Error code for the transition failure. Uses the same error codes as the task metrics (see above).

You can spot-check either metric over JMX, as shown in the example that follows.
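For example, the following is a minimal sketch that dumps the `link-task-count` MBean with the JmxTool utility that ships with Kafka. The JMX port (9999), the localhost address, and the `org.apache.kafka.tools.JmxTool` class name are assumptions; older Kafka versions expose the tool as `kafka.tools.JmxTool`, and your brokers may expose JMX on a different host and port.

```bash
# Minimal sketch: dump the link-task-count gauge for all cluster links on one broker.
# Assumes JMX is enabled on localhost:9999 (for example, via JMX_PORT=9999).
# The tool polls periodically by default; press Ctrl+C to stop it.
kafka-run-class org.apache.kafka.tools.JmxTool \
  --jmx-url service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi \
  --object-name 'kafka.server:type=cluster-link-metrics,name=link-task-count,*'
```

In production, the same MBeans are usually scraped by your monitoring stack (for example, through a JMX exporter) rather than polled ad hoc.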
## Related Content * Blog post: [Why Avro For Kafka Data](https://www.confluent.io/blog/avro-kafka-data/) * Blog post: [Yes, Virginia, You Really Do Need a Schema Registry](https://www.confluent.io/blog/schema-registry-kafka-stream-processing-yes-virginia-you-really-need-one/) * Apache Avro® official site: [How to get started with Apache Avro using Java Clients](https://avro.apache.org/docs/current/gettingstartedjava.html) * [How to produce and consume (Avro) messages via console tools with Confluent Cloud](https://support.confluent.io/hc/en-us/articles/360044952772) (Confluent Support) * Confluent supported schema formats, and how to configure clients using Avro, Protobuf, or JSON Schema: [Formats, Serializers, and Deserializers](/platform/current/schema-registry/fundamentals/serdes-develop/index.html) * Try it out: [Schema Registry API Usage Examples](/platform/current/schema-registry/develop/using.html), showing more curl commands over HTTP and HTTPS * User guide for managing schemas on Confluent Control Center: [Manage Schemas in Confluent Platform and Control Center](schema.md#topicschema) * Production deployments of Schema Registry: [Deploy Schema Registry in Production on Confluent Platform](installation/deployment.md#schema-registry-prod) * Big picture: [Scripted Confluent Platform Demo](../tutorials/cp-demo/index.md#cp-demo) shows Schema Registry in the context of a full Confluent Platform deployment, with various types of security enabled #### IMPORTANT Confluent Platform components that have a REST endpoint (such as Schema Registry and Confluent Control Center ), don’t support using a principal derived from mTLS authentication when using RBAC. So if you relied on TLS/SSL certificate authentication across Confluent Platform before configuring RBAC, when using RBAC you must also provide HTTP Basic Auth credentials (such as LDAP user) to authenticate against other components or REST API endpoints. HTTP Basic Auth presents login credentials to other Confluent Platform components and the component uses those credentials to get an OAuth token for the user with MDS (which validates the credentials against LDAP) and then the component uses the OAuth token to make authorization requests to the MDS. You must specify the bearer token for [Use HTTP Basic Authentication in Confluent Platform](../../authentication/http-basic-auth/overview.md#http-basic-auth) and more specifically, must specify `basic.auth.user.info` and `basic.auth.credentials.source`. When configuring Confluent Platform components (for example, Confluent Control Center , ksqlDB, and REST Proxy) for RBAC, use OAuth for authentication with MDS and Kafka clusters. For authentication with other Confluent Platform components such as Confluent Platform for Apache Flink, see [Use HTTP Basic Authentication in Confluent Platform](../../authentication/http-basic-auth/overview.md#http-basic-auth). For Confluent Platform components with REST endpoints (such as Schema Registry and Confluent Control Center ), you must use HTTP Basic Authentication to authenticate with MDS. For details, refer to [Configure RBAC using the REST API in Confluent Platform](rbac-config-using-rest-api.md#rbac-config-using-rest-api). You cannot use [principal propagation](../../../kafka-rest/production-deployment/rest-proxy/security.md#kafka-rest-security-propagation) with Confluent Platform components (for example, REST Proxy) that have a REST endpoint that requires RBAC. 
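To illustrate the HTTP Basic Auth settings mentioned above, the following is a minimal sketch of the properties a REST client (for example, a console producer talking to Schema Registry) supplies when the endpoint is protected by RBAC. The URL and the LDAP username and password are placeholders.

```properties
# Minimal sketch: HTTP Basic Auth credentials for an RBAC-protected Schema Registry endpoint.
# Replace the URL and the user:password pair with values valid in your environment.
schema.registry.url=https://schemaregistry:8081
basic.auth.credentials.source=USER_INFO
basic.auth.user.info=<ldap-user>:<ldap-password>
```

The component exchanges these credentials with MDS for a token, as described above.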
When using RBAC with Schema Registry and Connect you can use any of the [authentication methods](../../authentication/overview.md#authentication-overview) supported by Confluent Platform to communicate with Kafka clusters and MDS. For authentication with other Confluent Platform components, see [Use HTTP Basic Authentication in Confluent Platform](../../authentication/http-basic-auth/overview.md#http-basic-auth). When using RBAC with Kafka clients, you can use any of the [authentication methods](../../authentication/overview.md#authentication-overview) supported by Confluent Platform *except OAUTHBEARER*. For details, refer to [Configure Clients for SASL/OAUTHBEARER authentication in Confluent Platform](../../authentication/sasl/oauthbearer/configure-clients.md#security-sasl-rbac-oauthbearer-clientconfig). ![Diagram that shows authentication methods available when using RBAC](images/rbac-authentication-overview.png) ### Kafka Connect This example runs two connectors: - SSE source connector - Elasticsearch sink connector They are running on a Connect worker that is configured with Confluent Platform security features. The Connect worker’s embedded producer is configured to be idempotent, exactly-once in order semantics per partition (in the event of an error that causes a producer retry, the same message—which is still sent by the producer multiple times—will only be written to the Kafka log on the broker once). The Kafka Connect Docker container is running a custom image which has a specific set of connectors and transformations needed by `cp-demo`. See [this Dockerfile](https://github.com/confluentinc/cp-demo/tree/latest/Dockerfile) for more details. Confluent Control Center uses the Kafka Connect API to manage multiple [connect clusters](../../connect/index.md#kafka-connect). 1. In the navigation bar, click **Connect**. 2. Select **connect1**, the name of the cluster of Connect workers. ![image](tutorials/cp-demo/images/connect_default.png) 3. Verify the connectors running in this example: - source connector `wikipedia-sse`: view the example’s SSE source connector [configuration file](https://github.com/confluentinc/cp-demo/tree/latest/scripts/connectors/submit_wikipedia_sse_config.sh). - sink connector `elasticsearch-ksqldb` consuming from the Kafka topic `WIKIPEDIABOT`: view the example’s Elasticsearch sink connector [configuration file](https://github.com/confluentinc/cp-demo/tree/latest/scripts/connectors/submit_elastic_sink_config.sh). ![image](tutorials/cp-demo/images/connector_list.png) 4. Click any connector name to view or modify any details of the connector configuration and custom transforms. ## Set custom component properties When a configuration setting is not directly supported by Ansible Playbooks for Confluent Platform, you can use the custom property feature to configure Confluent Platform components. Before you set a custom property variable, first check the Ansible variable file at the following location for an existing variable: ```bash https://github.com/confluentinc/cp-ansible/blob/8.1.0-post/docs/VARIABLES.md ``` If you find an existing variable that directly supports the setting, use the variable in the inventory file instead of using a config override. 
Configure the custom properties in the Ansible inventory file, `hosts.yml`, using the following dictionaries: * `kafka_controller_custom_properties` * `kafka_broker_custom_properties` * `schema_registry_custom_properties` * `kafka_rest_custom_properties` * `kafka_connect_custom_properties` * `ksql_custom_properties` * `control_center_next_gen_custom_properties` * `kafka_connect_replicator_custom_properties` * `kafka_connect_replicator_consumer_custom_properties` * `kafka_connect_replicator_producer_custom_properties` * `kafka_connect_replicator_monitoring_interceptor_custom_properties` In the example below: * The `num.io.threads` property gets set in the Kafka [properties file](/platform/current/installation/configuration/broker-configs.html#cp-config-brokers). * The `confluent.controlcenter.ksql.default.advertised.url` property gets set in the Control Center [properties file](/platform/current/control-center/installation/configuration.html). Note that the default in the `confluent.controlcenter.ksql.default.advertised.url` property value is the name Control Center should use to identify the ksqlDB cluster. ```none all: vars: kafka_broker_custom_properties: num.io.threads: 15 control_center_next_gen_custom_properties: confluent.controlcenter.ksql.url: http://ksql-external-dns:1234,http://ksql-external-dns:2345 ``` ### Property-based example 1. Create a `quickstart-azureblobstoragesource.properties` file with the following contents. This file should be placed under Confluent Platform installation directory. This configuration is used typically along with [standalone workers](/platform/current/connect/concepts.html#standalone-workers). ```properties name=azure-blob-storage-source tasks.max=1 connector.class=io.confluent.connect.azure.blob.storage.AzureBlobStorageSourceConnector # enter your Azure blob account, key and container name here azblob.account.name= azblob.account.key= azblob.container.name= format.class=io.confluent.connect.azure.blob.storage.format.avro.AvroFormat confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 ``` 2. Edit the `quickstart-azureblobstoragesource.properties` to add the following properties: ```properties transforms=AddPrefix transforms.AddPrefix.type=org.apache.kafka.connect.transforms.RegexRouter transforms.AddPrefix.regex=.* transforms.AddPrefix.replacement=copy_of_$0 ``` #### IMPORTANT Adding this renames the output of topic of the messages to `copy_of_blob_topic`. This prevents a continuous feedback loop of messages. 3. Load the Backup and Restore Azure Blob Storage Source connector. ```bash confluent local load azblobstorage-source --config quickstart-azureblobstoragesource.properties ``` #### IMPORTANT Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) in production environments. 4. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status azureblobstorage-source ``` 5. Confirm that the messages are being sent to Kafka. ```bash kafka-avro-console-consumer \ --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 \ --topic copy_of_blob_topic \ --from-beginning | jq '.' ``` 6. The response should be 9 records as follows. 
   ```bash
   {"f1": "value1"}
   {"f1": "value2"}
   {"f1": "value3"}
   {"f1": "value4"}
   {"f1": "value5"}
   {"f1": "value6"}
   {"f1": "value7"}
   {"f1": "value8"}
   {"f1": "value9"}
   ```

## (Optional) Running the other components

You can configure and run additional components as a part of the Self-Balancing tests, if desired, but these are not integral to this tutorial. If you want to run Connect, ksqlDB, or Schema Registry with Confluent Platform, do the following:

1. Edit the properties files for Connect, ksqlDB, or Schema Registry, and search and replace any `replication.factor` values with either 2 or 3 (to work with your five-broker cluster). If `replication.factor` values are set to less than 2 or greater than 4, this will result in system topics with replication factors that prevent graceful broker removal with Self-Balancing.

   For example, if you want to run Connect, you could set replication factors in `$CONFLUENT_HOME/etc/kafka/connect-distributed.properties` to a value of “2”:

   - `offset.storage.replication.factor=2`
   - `config.storage.replication.factor=2`
   - `status.storage.replication.factor=2`

   You could run this command to update replication configurations for Connect:

   ```bash
   sed -i '' -e "s/replication.factor=1/replication.factor=2/g" $CONFLUENT_HOME/etc/kafka/connect-distributed.properties
   ```

2. In `$CONTROL_CENTER_HOME/etc/confluent-control-center/control-center-dev.properties`, verify that the configurations for Kafka Connect, ksqlDB, and Schema Registry match the following settings to provide Control Center with the default advertised URLs for the component clusters:

   ```bash
   # A comma separated list of Connect host names
   confluent.controlcenter.connect.cluster=http://localhost:8083
   # KSQL cluster URL
   confluent.controlcenter.ksql.ksqlDB.url=http://localhost:8088
   # Schema Registry cluster URL
   confluent.controlcenter.schema.registry.url=http://localhost:8081
   ```

3. Start Prometheus, Control Center, and Confluent Platform as described in previous sections.

4. Start the optional components in separate windows.

   - (Optional) [Kafka Connect](../../connect/index.md#kafka-connect)

     ```bash
     connect-distributed $CONFLUENT_HOME/etc/kafka/connect-distributed.properties
     ```

   - (Optional) [ksqlDB](../../ksqldb/overview.md#ksql-home)

     ```bash
     ksql-server-start $CONFLUENT_HOME/etc/ksqldb/ksql-server.properties
     ```

   - (Optional) [Schema Registry overview](/platform/current/schema-registry/index.html)

     ```bash
     schema-registry-start $CONFLUENT_HOME/etc/schema-registry/schema-registry.properties
     ```

# Configure RBAC for a Connect Worker

In an RBAC-enabled environment, several RBAC configuration lines must be added to each Connect worker file. The following steps describe what to add to each Connect worker file.

1. Add the following parameter to enable per-connector principals.

   ```none
   connector.client.config.override.policy=All
   ```

2. Add the following parameters to enable the Connect framework to authenticate with Kafka using a [service principal](connect-rbac-connect-cluster.md#connect-rbac-service-account). The service principal is used by Connect to read from and write to internal configuration topics. Note that `<username>` and `<password>` are the service principal username and password granted permissions when setting up the [service principal](connect-rbac-connect-cluster.md#connect-rbac-service-account).
   ```none
   # Or SASL_SSL if using TLS/SSL
   security.protocol=SASL_PLAINTEXT
   sasl.mechanism=OAUTHBEARER
   sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler
   sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
      username="<username>" \
      password="<password>" \
      metadataServerUrls="http(s)://<mds-host>:<mds-port>";
   ```

3. Add the following parameters to establish **worker-wide default properties** for each type of Kafka client used by connectors in the cluster.

   ```none
   producer.security.protocol=SASL_PLAINTEXT
   producer.sasl.mechanism=OAUTHBEARER
   producer.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler
   ```

   #### NOTE

   Any principal used by idempotent producers must be granted IdempotentWrite on the cluster, or Write permission on any topic, to initialize the producer client. Binding either the DeveloperWrite or ResourceOwner RBAC role on the Kafka cluster grants Write permission. Note that DeveloperWrite is the less permissive of the two roles, and is the first recommendation. Consumers do not require additional Kafka permissions to be idempotent. The following role binding grants Write access on the cluster:

   ```none
   confluent iam rbac role-binding create \
      --principal $PRINCIPAL \
      --role DeveloperWrite \
      --resource Cluster:kafka-cluster \
      --kafka-cluster $KAFKA_CLUSTER_ID
   ```

   ```none
   consumer.security.protocol=SASL_PLAINTEXT
   consumer.sasl.mechanism=OAUTHBEARER
   consumer.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler
   ```

   ```none
   admin.security.protocol=SASL_PLAINTEXT
   admin.sasl.mechanism=OAUTHBEARER
   admin.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler
   ```

4. Add the following Metadata Service (MDS) parameters to require user RBAC authentication for Connect. RBAC authentication is required to allow users to create connectors, read connector configurations, and delete connectors.

   ```none
   # Adds the RBAC REST extension to the Connect worker
   rest.extension.classes=io.confluent.connect.security.ConnectSecurityExtension
   # The location of a running metadata service
   confluent.metadata.bootstrap.server.urls=http(s)://<mds-host>:<mds-port>
   # Credentials to use when communicating with the MDS
   confluent.metadata.basic.auth.user.info=<username>:<password>
   confluent.metadata.http.auth.credentials.provider=BASIC
   ```

   #### NOTE

   For additional configurations available to any client communicating with MDS, see [REST client configurations](../../kafka/configure-mds/mds-configuration.md#rest-client-mds-config) in the Confluent Platform Security documentation.

5. Add the following parameters to have Connect use basic authentication for user requests and token authentication for impersonated requests (for example, from REST proxy).

   ```none
   rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler
   # The path to a directory containing public keys that should be used to verify json web tokens
   # during authentication
   public.key.path=<path-to-public-keys-directory>
   ```

See [Secret Registry](connect-rbac-secret-registry.md#connect-rbac-secret-registry) if you are using a Secret Registry for connector credentials.

### Connect to a secure Kafka cluster, like Confluent Cloud

Run a ksqlDB Server that uses a secure connection to a Kafka cluster. Learn about [Configure Security for ksqlDB](../../ksqldb/operate-and-deploy/installation/security.md#ksqldb-installation-security).

`KSQL_BOOTSTRAP_SERVERS`
: A host:port pair for establishing the initial connection to the Kafka cluster. Multiple bootstrap servers can be used in the form `host1:port1,host2:port2,host3:port3...`.

`KSQL_KSQL_SERVICE_ID`
: The service ID of the ksqlDB server, which is used as the prefix for the internal topics created by ksqlDB.

`KSQL_LISTENERS`
: A list of URIs, including the protocol, that the ksqlDB Server listens on.

`KSQL_KSQL_SINK_REPLICAS`
: The default number of replicas for the topics created by ksqlDB. The default is one.

`KSQL_KSQL_STREAMS_REPLICATION_FACTOR`
: The replication factor for internal topics, the command topic, and output topics.

`KSQL_SECURITY_PROTOCOL`
: The protocol that your Kafka cluster uses for security.

`KSQL_SASL_MECHANISM`
: The SASL mechanism that your Kafka cluster uses for security.

`KSQL_SASL_JAAS_CONFIG`
: The Java Authentication and Authorization Service (JAAS) configuration.

```bash
docker run -d \
  -p 127.0.0.1:8088:8088 \
  -e KSQL_BOOTSTRAP_SERVERS=REMOTE_SERVER1:9092,REMOTE_SERVER2:9093,REMOTE_SERVER3:9094 \
  -e KSQL_LISTENERS=http://0.0.0.0:8088/ \
  -e KSQL_KSQL_SERVICE_ID=default_ \
  -e KSQL_KSQL_SINK_REPLICAS=3 \
  -e KSQL_KSQL_STREAMS_REPLICATION_FACTOR=3 \
  -e KSQL_SECURITY_PROTOCOL=SASL_SSL \
  -e KSQL_SASL_MECHANISM=PLAIN \
  -e KSQL_SASL_JAAS_CONFIG="org.apache.kafka.common.security.plain.PlainLoginModule required username=\"<username>\" password=\"<password>\";" \
  confluentinc/cp-ksqldb-server:8.1.0
```

### Connect ksqlDB Server to a secure Kafka Cluster, like Confluent Cloud

ksqlDB Server runs outside of your Kafka clusters, so you need to specify in the container environment how ksqlDB Server connects with a Kafka cluster.

Run a ksqlDB Server that uses a secure connection to a Kafka cluster:

```bash
docker run -d \
  -p 127.0.0.1:8088:8088 \
  -e KSQL_BOOTSTRAP_SERVERS=REMOTE_SERVER1:9092,REMOTE_SERVER2:9093,REMOTE_SERVER3:9094 \
  -e KSQL_LISTENERS=http://0.0.0.0:8088/ \
  -e KSQL_KSQL_SERVICE_ID=default_ \
  -e KSQL_KSQL_SINK_REPLICAS=3 \
  -e KSQL_KSQL_STREAMS_REPLICATION_FACTOR=3 \
  -e KSQL_KSQL_INTERNAL_TOPIC_REPLICAS=3 \
  -e KSQL_SECURITY_PROTOCOL=SASL_SSL \
  -e KSQL_SASL_MECHANISM=PLAIN \
  -e KSQL_SASL_JAAS_CONFIG="org.apache.kafka.common.security.plain.PlainLoginModule required username=\"<username>\" password=\"<password>\";" \
  confluentinc/cp-ksqldb-server:8.1.0
```

`KSQL_BOOTSTRAP_SERVERS`
: A list of hosts for establishing the initial connection to the Kafka cluster.

`KSQL_KSQL_SERVICE_ID`
: The service ID of the ksqlDB Server, which is used as the prefix for the internal topics created by ksqlDB.

`KSQL_LISTENERS`
: A list of URIs, including the protocol, that the ksqlDB Server listens on. If you are using IPv6, set it to `http://[::]:8088`.

`KSQL_KSQL_SINK_REPLICAS`
: The default number of replicas for the topics created by ksqlDB. The default is one.

`KSQL_KSQL_STREAMS_REPLICATION_FACTOR`
: The replication factor for internal topics, the command topic, and output topics.

`KSQL_KSQL_INTERNAL_TOPIC_REPLICAS`
: The number of replicas for the internal topics created by ksqlDB Server. The default is 1.

`KSQL_SECURITY_PROTOCOL`
: The protocol that your Kafka cluster uses for security.

`KSQL_SASL_MECHANISM`
: The SASL mechanism that your Kafka cluster uses for security.

`KSQL_SASL_JAAS_CONFIG`
: The Java Authentication and Authorization Service (JAAS) configuration.

Learn how to [Configure Security for ksqlDB](security.md#ksqldb-installation-security).

### Create a Kafka client project

Notice that so far, all the heavy lifting happens inside of ksqlDB. ksqlDB takes care of the stateful stream processing.
Triggering side-effects will be delegated to a lightweight service that consumes from a Kafka topic. You want to send an email each time an anomaly is found. To do that, you’ll implement a simple, scalable microservice. In practice, you might use [Kafka Streams](../../streams/overview.md#kafka-streams) to handle this piece, but to keep things simple, just use a Kafka consumer client.

Start by creating a `pom.xml` file for your microservice. This simple microservice will run a loop, reading from the `possible_anomalies` Kafka topic and sending an email for each event it receives. Dependencies are declared on Kafka, Avro, SendGrid, and a few other things:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>io.ksqldb</groupId>
    <artifactId>email-sender</artifactId>
    <version>0.0.1</version>

    <properties>
        <java.version>8</java.version>
        <confluent.version>|release|</confluent.version>
        <kafka.version>2.5.0</kafka.version>
        <avro.version>1.9.1</avro.version>
        <slf4j.version>1.7.30</slf4j.version>
        <sendgrid.version>4.4.8</sendgrid.version>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    </properties>

    <repositories>
        <repository>
            <id>confluent</id>
            <name>Confluent</name>
            <url>https://packages.confluent.io/maven/</url>
        </repository>
    </repositories>

    <pluginRepositories>
        <pluginRepository>
            <id>confluent</id>
            <url>https://packages.confluent.io/maven/</url>
        </pluginRepository>
    </pluginRepositories>

    <dependencies>
        <dependency>
            <groupId>io.confluent</groupId>
            <artifactId>kafka-avro-serializer</artifactId>
            <version>${confluent.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
            <version>${kafka.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.avro</groupId>
            <artifactId>avro</artifactId>
            <version>${avro.version}</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>${slf4j.version}</version>
        </dependency>
        <dependency>
            <groupId>com.sendgrid</groupId>
            <artifactId>sendgrid-java</artifactId>
            <version>${sendgrid.version}</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.8.1</version>
                <configuration>
                    <source>${java.version}</source>
                    <target>${java.version}</target>
                    <compilerArgs>
                        <arg>-Xlint:all</arg>
                    </compilerArgs>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.avro</groupId>
                <artifactId>avro-maven-plugin</artifactId>
                <version>${avro.version}</version>
                <executions>
                    <execution>
                        <phase>generate-sources</phase>
                        <goals>
                            <goal>schema</goal>
                        </goals>
                        <configuration>
                            <sourceDirectory>${project.basedir}/src/main/avro</sourceDirectory>
                            <outputDirectory>${project.build.directory}/generated-sources</outputDirectory>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>io.confluent</groupId>
                <artifactId>kafka-schema-registry-maven-plugin</artifactId>
                <version>${confluent.version}</version>
                <configuration>
                    <schemaRegistryUrls>
                        <param>http://localhost:8081</param>
                    </schemaRegistryUrls>
                    <outputDirectory>src/main/avro</outputDirectory>
                    <subjectPatterns>
                        <param>possible_anomalies-value</param>
                    </subjectPatterns>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
```

Create the directory structure for the rest of the project:

```none
mkdir -p src/main/java/io/ksqldb/tutorial src/main/resources src/main/avro
```

To ensure that your microservice logs output to the console, create a Log4j configuration file at `src/main/resources/log4j.properties` that sends log output to the console.

### Start the stack

To set up and launch the services in the stack, a few files need to be created first. MySQL requires some custom configuration to play well with Debezium, so take care of this first. Debezium has dedicated [documentation](https://debezium.io/documentation/reference/1.1/connectors/mysql.html) if you’re interested, but this guide covers just the essentials.

Create a new file at `mysql/custom-config.cnf` with the following content:

```none
[mysqld]
server-id = 223344
log_bin = mysql-bin
binlog_format = ROW
binlog_row_image = FULL
expire_logs_days = 10
gtid_mode = ON
enforce_gtid_consistency = ON
```

This sets up MySQL’s transaction log so that Debezium can watch for changes as they occur.
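
Once the `mysql` container defined in the next step is running, you can optionally confirm that these settings took effect; a minimal check, assuming the container name `mysql` and the root password `mysql-pw` used in the Compose file below:

```bash
# After `docker-compose up`, verify that the Debezium-related MySQL settings are active
docker exec mysql mysql -uroot -pmysql-pw \
  -e "SHOW VARIABLES WHERE Variable_name IN ('log_bin', 'binlog_format', 'gtid_mode');"
```

The output should report `log_bin` as `ON`, `binlog_format` as `ROW`, and `gtid_mode` as `ON`.
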
With this file in place, create a `docker-compose.yml` file that defines the services to launch: ```yaml version: '2' services: mysql: image: mysql:8.0.19 hostname: mysql container_name: mysql ports: - "3306:3306" environment: MYSQL_ROOT_PASSWORD: mysql-pw MYSQL_DATABASE: call-center MYSQL_USER: example-user MYSQL_PASSWORD: example-pw volumes: - "./mysql/custom-config.cnf:/etc/mysql/conf.d/custom-config.cnf" broker: image: confluentinc/cp-kafka:8.1.0 hostname: broker container_name: broker ports: - "29092:29092" environment: KAFKA_BROKER_ID: 1 KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092 KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0 KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1 KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1 schema-registry: image: confluentinc/cp-schema-registry:8.1.0 hostname: schema-registry container_name: schema-registry depends_on: - broker ports: - "8081:8081" environment: SCHEMA_REGISTRY_HOST_NAME: schema-registry SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: "PLAINTEXT://broker:9092" ksqldb-server: image: confluentinc/cp-ksqldb-server:8.1.0 hostname: ksqldb-server container_name: ksqldb-server depends_on: - broker - schema-registry ports: - "8088:8088" volumes: - "./confluent-hub-components/:/usr/share/kafka/plugins/" environment: KSQL_LISTENERS: "http://0.0.0.0:8088" KSQL_BOOTSTRAP_SERVERS: "broker:9092" KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true" KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true" # Configuration to embed Kafka Connect support. KSQL_CONNECT_GROUP_ID: "ksql-connect-cluster" KSQL_CONNECT_BOOTSTRAP_SERVERS: "broker:9092" KSQL_CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.storage.StringConverter" KSQL_CONNECT_VALUE_CONVERTER: "io.confluent.connect.avro.AvroConverter" KSQL_CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_CONNECT_CONFIG_STORAGE_TOPIC: "_ksql-connect-configs" KSQL_CONNECT_OFFSET_STORAGE_TOPIC: "_ksql-connect-offsets" KSQL_CONNECT_STATUS_STORAGE_TOPIC: "_ksql-connect-statuses" KSQL_CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_PLUGIN_PATH: "/usr/share/kafka/plugins" ksqldb-cli: image: confluentinc/cp-ksqldb-cli:8.1.0 container_name: ksqldb-cli depends_on: - broker - ksqldb-server entrypoint: /bin/sh tty: true ``` There are a few things to notice here. The MySQL image mounts the custom configuration file that you wrote. MySQL merges these configuration settings into its system-wide configuration. The environment variables you gave it also set up a blank database called `call-center` along with a user named `example-user` that can access it. Also note that the ksqlDB server image mounts the `confluent-hub-components` directory, too. The jar files that you downloaded need to be on the classpath of ksqlDB when the server starts up. Bring up the entire stack by running: ```bash docker-compose up ``` ## Example 1: Same number of partitions in DC1 and DC2 In this example, you migrate from MirrorMaker to Replicator and keep the same number of partitions for `inventory` in DC1 and DC2. Prerequisites: : - Confluent Platform 5.0.0 or later is [installed](../../installation/overview.md#installation). 
- You must have the same number of partitions for `inventory` in DC1 and DC2 to use this method.
- The `src.consumer.group.id` in Replicator must match `group.id` in MirrorMaker.

1. Stop the running MirrorMaker instance in DC1, where `<pid>` is the MirrorMaker process ID:

   ```none
   kill <pid>
   ```

2. Configure and start Replicator. In this example, Replicator is run as an executable from the command line or from [a Docker image](../../installation/docker/config-reference.md#config-reference).

   1. Add these values to `CONFLUENT_HOME/etc/kafka-connect-replicator/replicator_consumer.properties`. Replace `localhost:9082` with the `bootstrap.servers` of DC1, the source cluster:

      ```bash
      bootstrap.servers=localhost:9082
      topic.preserve.partitions=true
      ```

   2. Add this value to `CONFLUENT_HOME/etc/kafka-connect-replicator/replicator_producer.properties`. Replace `localhost:9092` with the `bootstrap.servers` of DC2, the destination cluster:

      ```bash
      bootstrap.servers=localhost:9092
      ```

   3. Ensure the replication factors are set to `2` or `3` for production, if they are not already:

      ```bash
      echo "confluent.topic.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      echo "offset.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      echo "config.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      echo "status.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties
      ```

   4. Start Replicator:

      ```bash
      replicator --cluster.id <cluster-id> \
         --producer.config replicator_producer.properties \
         --consumer.config replicator_consumer.properties \
         --replication.config ./etc/kafka-connect-replicator/quickstart-replicator.properties
      ```

   Replicator will use the offsets committed by MirrorMaker in DC1 and start replicating messages from DC1 to DC2 based on these offsets.

## Example 2: Different number of partitions in DC1 and DC2

In this example, you migrate from MirrorMaker to Replicator and have a different number of partitions for `inventory` in DC1 and DC2.

Prerequisites:

- Confluent Platform 5.0.0 or later is [installed](../../installation/overview.md#installation).
- The `src.consumer.group.id` in Replicator must match `group.id` in MirrorMaker.

1. Stop the running MirrorMaker instance from DC1.

2. Configure and start Replicator. In this example, Replicator is run as an executable from the command line or from [a Docker image](../../installation/docker/config-reference.md#config-reference).

   1. Add these values to `CONFLUENT_HOME/etc/kafka-connect-replicator/replicator_consumer.properties`. Replace `localhost:9082` with the `bootstrap.servers` of DC1, the source cluster:

      ```bash
      bootstrap.servers=localhost:9082
      topic.preserve.partitions=false
      ```

   2. Add this value to `CONFLUENT_HOME/etc/kafka-connect-replicator/replicator_producer.properties`. Replace `localhost:9092` with the `bootstrap.servers` of DC2, the destination cluster:

      ```bash
      bootstrap.servers=localhost:9092
      ```

   3.
Ensure the replication factors are set to `2` or `3` for production, if they are not already: ```bash echo "confluent.topic.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties echo "offset.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties echo "config.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties echo "status.storage.replication.factor=3" >> ./etc/kafka-connect-replicator/quickstart-replicator.properties ``` 4. Start Replicator: ```bash replicator --cluster.id \ --producer.config replicator_producer.properties \ --consumer.config replicator_consumer.properties \ --replication.config ./etc/kafka-connect-replicator/quickstart-replicator.properties ``` Replicator will use the committed offsets by MirrorMaker from DC1 and start replicating messages from DC1 to DC2 based on these offsets. ## Demo: Enabling Schema ID Validation on a Topic at the Command Line This short demo shows the effect of enabling or disabling schema validation on a topic. If you are just getting started with Confluent Platform and Schema Registry, you might want to first work through the [Tutorial: Use Schema Registry on Confluent Platform to Implement Schemas for a Client Application](schema_registry_onprem_tutorial.md#schema-registry-onprem-tutorial), then return to this demo. The examples make use of the `kafka-console-producer` and `kafka-console-consumer`, which are located in `$CONFLUENT_HOME/bin`. 1. On a local install of Confluent Platform version 5.4.0 or later, modify `$CONFLUENT_HOME/etc/kafka/server.properties` to include the following configuration for the Schema Registry URL: ```bash ############################## My Schema Validation Demo Settings ################ # Schema Registry URL confluent.schema.registry.url=http://localhost:8081 ``` The example above includes two lines of comments, which are optional, to keep track of the configurations in the file. 2. Start Confluent Platform using the following command: ```bash confluent local start ``` 3. Create a test topic called `test-schemas` without specifying the Schema ID Validation setting so that it defaults to `false`. ```bash kafka-topics --bootstrap-server localhost:9092 --create --partitions 1 --replication-factor 1 --topic test-schemas ``` This creates a topic with no broker validation on records produced to the test topic, which is what you want for the first part of the demo. You can verify that the topic was created with `kafka-topics --bootstrap-server localhost:9092 --list`. 4. In a new command window for the producer, run this command to produce a serialized record (using the default string serializer) to the topic `test-schemas`. ```bash kafka-console-producer --bootstrap-server localhost:9092 --topic test-schemas --property parse.key=true --property key.separator=, ``` The command is successful because you currently have Schema ID Validation disabled for this topic. If broker Schema ID Validation had been enabled for this topic, the above command to produce to it would not be permitted. The output of this command is a producer command prompt (`>`), where you can type the messages you want to produce. Type your first message at the `>` prompt as follows: ```bash 1,my first record ``` Keep this session of the producer running. 5. 
Open a new command window for the consumer, and enter this command to read the messages: ```bash kafka-console-consumer --bootstrap-server localhost:9092 --from-beginning --topic test-schemas --property print.key=true ``` The output of this command is `my first record`. Keep this session of the consumer running. 6. Now, set Schema ID Validation for the topic `test-schemas` to `true`. ```bash kafka-configs --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name test-schemas --add-config confluent.value.schema.validation=true ``` You should get a confirmation: `Completed updating config for topic test-schemas.` 7. Return to the producer session, and type a second message at the `>` prompt. ```bash 2,my second record ``` You will get an error because Schema ID Validation is enabled and the messages we are sending do not contain schema IDs: `This record has failed the validation on broker` If you subsequently disable Schema ID Validation (use the same command to set it to `false`), restart the producer, then type and resend the same or another similarly formatted message, the message will go through. (For example, produce `3,my third record`.) The messages that were successfully produced also show on Control Center ([http://localhost:9021/](http://localhost:9021/) in your web browser) in **Topics > test-schemas > messages**. You may have to select a partition or jump to a timestamp to see messages sent earlier. ![image](images/sv-topics.png) 8. Run shutdown and cleanup tasks. - You can stop the consumer and producer with Ctl-C in their respective command windows. - To stop Confluent Platform, type `confluent local services stop`. - If you would like to clear out existing data (topics, schemas, and messages) before starting again with another test, type `confluent local destroy`. #### IMPORTANT - If you use the legacy method of defining TLS/SSL values in system environment variables, TLS/SSL settings will apply to every Java component running on this JVM. For example on Connect, every [connector](/kafka-connectors/self-managed/overview.html) will use the given truststore. Consider a scenario where you are using an Amazon Web Services (AWS) connector such as S3 or Kinesis, and do not have the AWS certificate chain in the given truststore. The connector will fail with the following error: ```bash com.amazonaws.SdkClientException: Unable to execute HTTP request: sun.security.validator.ValidatorException: PKIX path building failed ``` This does not apply if you use the dedicated Schema Registry client configurations. - For the `kafka-avro-console-producer` and `kafka-avro-console-consumer`, you must pass the Schema Registry properties on the command line. 
Here is an example for the producer:

```bash
./kafka-avro-console-producer --broker-list localhost:9093 --topic myTopic \
  --producer.config ~/etc/kafka/producer.properties \
  --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}' \
  --property schema.registry.url=https://localhost:8081 \
  --property schema.registry.ssl.truststore.location=/etc/kafka/security/schema.registry.client.truststore.jks \
  --property schema.registry.ssl.truststore.password=myTrustStorePassword
```

For more examples of using the producer and consumer command line utilities, see [Test drive Avro schema](../fundamentals/serdes-develop/serdes-avro.md#sr-test-drive-avro), [Test drive JSON Schema](../fundamentals/serdes-develop/serdes-json.md#sr-test-drive-json-schema), [Test drive Protobuf schema](../fundamentals/serdes-develop/serdes-protobuf.md#sr-test-drive-protobuf), and the demo in [Validate Broker-side Schema IDs in Confluent Platform](../schema-validation.md#schema-validation).

#### Connect

- Additional RBAC configurations required for [connect-avro-distributed.properties](https://github.com/confluentinc/examples/tree/latest/security/rbac/delta_configs/connect-avro-distributed.properties.delta) ```none bootstrap.servers=localhost:9092 security.protocol=SASL_PLAINTEXT sasl.mechanism=OAUTHBEARER sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required username="connect" password="connect1" metadataServerUrls="http://localhost:8090"; ## Connector client (producer, consumer, admin client) properties ## key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=io.confluent.connect.avro.AvroConverter group.id=connect-cluster offset.storage.topic=connect-offsets offset.storage.replication.factor=1 config.storage.topic=connect-configs config.storage.replication.factor=1 status.storage.topic=connect-statuses status.storage.replication.factor=1 # Allow producer/consumer/admin client overrides (this enables per-connector principals) connector.client.config.override.policy=All producer.security.protocol=SASL_PLAINTEXT producer.sasl.mechanism=OAUTHBEARER producer.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler # Intentionally omitting `producer.sasl.jaas.config` to force connectors to use their own consumer.security.protocol=SASL_PLAINTEXT consumer.sasl.mechanism=OAUTHBEARER consumer.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler # Intentionally omitting `consumer.sasl.jaas.config` to force connectors to use their own admin.security.protocol=SASL_PLAINTEXT admin.sasl.mechanism=OAUTHBEARER admin.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler # Intentionally omitting `admin.sasl.jaas.config` to force connectors to use their own ## REST extensions: RBAC and Secret Registry ## # Installs the RBAC and Secret Registry REST extensions rest.extension.classes=io.confluent.connect.security.ConnectSecurityExtension,io.confluent.connect.secretregistry.ConnectSecretRegistryExtension ## RBAC Authentication ## # Enables basic and bearer authentication for requests made to the worker rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler # The path to a directory containing public keys that
should be used to verify json web tokens during authentication public.key.path=/tmp/tokenPublicKey.pem ## RBAC Authorization ## # The location of a running metadata service; used to verify that requests are authorized by the users that make them confluent.metadata.bootstrap.server.urls=http://localhost:8090 # Credentials to use when communicating with the MDS; these should usually match the ones used for communicating with Kafka confluent.metadata.basic.auth.user.info=connect:connect1 confluent.metadata.http.auth.credentials.provider=BASIC ## Secret Registry Secret Provider ## config.providers=secret config.providers.secret.class=io.confluent.connect.secretregistry.rbac.config.provider.InternalSecretConfigProvider config.providers.secret.param.master.encryption.key=password1234 config.providers.secret.param.kafkastore.bootstrap.servers=localhost:9092 config.providers.secret.param.kafkastore.security.protocol=SASL_PLAINTEXT config.providers.secret.param.kafkastore.sasl.mechanism=OAUTHBEARER config.providers.secret.param.kafkastore.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler config.providers.secret.param.kafkastore.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required username="connect" password="connect1" metadataServerUrls="http://localhost:8090"; ``` - Additional RBAC configurations required for a [source connector](https://github.com/confluentinc/examples/tree/latest/security/rbac/delta_configs/connector-source.properties.delta) ```none producer.override.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required username="connector" password="connector1" metadataServerUrls="http://localhost:8090"; ``` - Additional RBAC configurations required for a [sink connector](https://github.com/confluentinc/examples/tree/latest/security/rbac/delta_configs/connector-sink.properties.delta) ```none consumer.override.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required username="connector" password="connector1" metadataServerUrls="http://localhost:8090"; ``` - Role bindings: ```bash # Connect Admin confluent iam rbac role-binding create --principal User:$USER_ADMIN_CONNECT --role ResourceOwner --resource Topic:connect-configs --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_CONNECT --role ResourceOwner --resource Topic:connect-offsets --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_CONNECT --role ResourceOwner --resource Topic:connect-statuses --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_CONNECT --role ResourceOwner --resource Group:connect-cluster --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_CONNECT --role ResourceOwner --resource Topic:_confluent-secrets --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_CONNECT --role ResourceOwner --resource Group:secret-registry --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_CONNECT --role SecurityAdmin --kafka-cluster $KAFKA_CLUSTER_ID --connect-cluster $CONNECT_CLUSTER_ID # Connector Submitter confluent iam rbac role-binding create --principal User:$USER_CONNECTOR_SUBMITTER --role ResourceOwner --resource Connector:$CONNECTOR_NAME --kafka-cluster 
$KAFKA_CLUSTER_ID --connect-cluster $CONNECT_CLUSTER_ID # Connector confluent iam rbac role-binding create --principal User:$USER_CONNECTOR --role ResourceOwner --resource Topic:$TOPIC2_AVRO --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_CONNECTOR --role ResourceOwner --resource Subject:${TOPIC2_AVRO}-value --kafka-cluster $KAFKA_CLUSTER_ID --schema-registry-cluster $SCHEMA_REGISTRY_CLUSTER_ID # Sink Connector confluent iam rbac role-binding create --principal User:$USER_CONNECTOR --role DeveloperRead --resource Group:$CONNECTOR_CONSUMER_GROUP_ID --prefix --kafka-cluster $KAFKA_CLUSTER_ID ``` #### REST Proxy - Additional RBAC configurations required for [kafka-rest.properties](https://github.com/confluentinc/examples/tree/latest/security/rbac/delta_configs/kafka-rest.properties.delta) ```none # Configure connections to other Confluent Platform services bootstrap.servers=localhost:9092 schema.registry.url=http://localhost:8081 client.security.protocol=SASL_PLAINTEXT client.sasl.mechanism=OAUTHBEARER client.security.protocol=SASL_PLAINTEXT client.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler client.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required username="clientrp" password="clientrp1" metadataServerUrls="http://localhost:8090"; kafka.rest.resource.extension.class=io.confluent.kafkarest.security.KafkaRestSecurityResourceExtension rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler public.key.path=/tmp/tokenPublicKey.pem # Credentials to use with the MDS confluent.metadata.bootstrap.server.urls=http://localhost:8090 confluent.metadata.basic.auth.user.info=rp:rp1 confluent.metadata.http.auth.credentials.provider=BASIC ``` - Role bindings: ```bash # REST Proxy Admin: role bindings for license management, no additional administrative rolebindings required because REST Proxy just does impersonation confluent iam rbac role-binding create --principal User:$USER_CLIENT_RP --role DeveloperRead --resource Topic:$LICENSE_TOPIC --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_CLIENT_RP --role DeveloperWrite --resource Topic:$LICENSE_TOPIC --kafka-cluster $KAFKA_CLUSTER_ID # Producer/Consumer confluent iam rbac role-binding create --principal User:$USER_CLIENT_RP --role ResourceOwner --resource Topic:$TOPIC3 --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_CLIENT_RP --role DeveloperRead --resource Group:$CONSUMER_GROUP --kafka-cluster $KAFKA_CLUSTER_ID ``` #### ksqlDB - Additional RBAC configurations required for [ksql-server.properties](https://github.com/confluentinc/examples/tree/latest/security/rbac/delta_configs/ksql-server.properties.delta) ```none bootstrap.servers=localhost:9092 security.protocol=SASL_PLAINTEXT sasl.mechanism=OAUTHBEARER sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required username="ksqlDBserver" password="ksqlDBserver1" metadataServerUrls="http://localhost:8090"; # Specify KSQL service id used to bind user/roles to this cluster ksql.service.id=rbac-ksql # Enable KSQL authorization and impersonation ksql.security.extension.class=io.confluent.ksql.security.KsqlConfluentSecurityExtension # Enable KSQL Basic+Bearer 
authentication ksql.authentication.plugin.class=io.confluent.ksql.security.VertxBearerOrBasicAuthenticationPlugin public.key.path=/tmp/tokenPublicKey.pem # Metadata URL and access credentials confluent.metadata.bootstrap.server.urls=http://localhost:8090 confluent.metadata.http.auth.credentials.provider=BASIC confluent.metadata.basic.auth.user.info=ksqlDBserver:ksqlDBserver1 # Credentials for Schema Registry access ksql.schema.registry.url=http://localhost:8081 ksql.schema.registry.basic.auth.user.info=ksqlDBserver:ksqlDBserver1 ``` - Role bindings: ```bash # ksqlDB Server Admin confluent iam rbac role-binding create --principal User:$USER_ADMIN_KSQLDB --role ResourceOwner --resource Topic:_confluent-ksql-${KSQL_SERVICE_ID}_command_topic --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_KSQLDB --role ResourceOwner --resource Topic:${KSQL_SERVICE_ID}ksql_processing_log --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_KSQLDB --role SecurityAdmin --kafka-cluster $KAFKA_CLUSTER_ID --ksql-cluster $KSQL_SERVICE_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_KSQLDB --role ResourceOwner --resource KsqlCluster:ksql-cluster --kafka-cluster $KAFKA_CLUSTER_ID --ksql-cluster $KSQL_SERVICE_ID # ksqlDB CLI queries confluent iam rbac role-binding create --principal User:${USER_KSQLDB} --role DeveloperWrite --resource KsqlCluster:ksql-cluster --kafka-cluster $KAFKA_CLUSTER_ID --ksql-cluster $KSQL_SERVICE_ID confluent iam rbac role-binding create --principal User:${USER_KSQLDB} --role DeveloperRead --resource Topic:$TOPIC1 --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_KSQLDB} --role DeveloperRead --resource Group:_confluent-ksql-${KSQL_SERVICE_ID} --prefix --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_KSQLDB} --role DeveloperRead --resource Topic:${KSQL_SERVICE_ID}ksql_processing_log --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_ADMIN_KSQLDB} --role DeveloperRead --resource Group:_confluent-ksql-${KSQL_SERVICE_ID} --prefix --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_ADMIN_KSQLDB} --role DeveloperRead --resource Topic:$TOPIC1 --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_ADMIN_KSQLDB} --role ResourceOwner --resource TransactionalId:${KSQL_SERVICE_ID} --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_KSQLDB} --role ResourceOwner --resource Topic:_confluent-ksql-${KSQL_SERVICE_ID}transient --prefix --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_ADMIN_KSQLDB} --role ResourceOwner --resource Topic:_confluent-ksql-${KSQL_SERVICE_ID}transient --prefix --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_KSQLDB} --role ResourceOwner --resource Topic:${CSAS_STREAM1} --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_ADMIN_KSQLDB} --role ResourceOwner --resource Topic:${CSAS_STREAM1} --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_KSQLDB} --role ResourceOwner --resource Topic:${CTAS_TABLE1} --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_ADMIN_KSQLDB} --role 
ResourceOwner --resource Topic:${CTAS_TABLE1} --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:${USER_ADMIN_KSQLDB} --role ResourceOwner --resource Topic:_confluent-ksql-${KSQL_SERVICE_ID} --prefix --kafka-cluster $KAFKA_CLUSTER_ID ``` #### Control Center - Additional RBAC configurations required for [control-center-dev.properties](https://github.com/confluentinc/examples/tree/latest/security/rbac/delta_configs/control-center-dev.properties.delta) ```none confluent.controlcenter.rest.authentication.method=BEARER confluent.controlcenter.streams.security.protocol=SASL_PLAINTEXT public.key.path=/tmp/tokenPublicKey.pem confluent.metadata.basic.auth.user.info=c3:c31 confluent.metadata.bootstrap.server.urls=http://localhost:8090 ``` - Role bindings: ```bash # Control Center Admin confluent iam rbac role-binding create --principal User:$USER_ADMIN_C3 --role SystemAdmin --kafka-cluster $KAFKA_CLUSTER_ID # Control Center user confluent iam rbac role-binding create --principal User:$USER_CLIENT_C --role DeveloperRead --resource Topic:$TOPIC1 --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_CLIENT_C --role DeveloperRead --resource Topic:$TOPIC2_AVRO --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_CLIENT_C --role DeveloperRead --resource Subject:${TOPIC2_AVRO}-value --kafka-cluster $KAFKA_CLUSTER_ID --schema-registry-cluster $SCHEMA_REGISTRY_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_CLIENT_C --role DeveloperRead --resource Connector:$CONNECTOR_NAME --kafka-cluster $KAFKA_CLUSTER_ID --connect-cluster $CONNECT_CLUSTER_ID ``` ### Common configuration Any component that interacts with secured Confluent Server brokers is a *client* and must be configured for security as well. These clients include Kafka Connect workers and certain connectors such as Replicator, ksqlDB clients, non-Java clients, Confluent Control Center , Confluent Schema Registry, REST Proxy, etc. All Kafka clients share a general set of security configuration parameters required to interact with a secured Confluent Platform cluster: 1. To encrypt data using TLS/SSL and authenticate using SASL, configure the security protocol to use `SASL_SSL`. (If you want TLS/SSL for both encryption and authentication without SASL, the security protocol would be `SSL`). ```bash security.protocol=SASL_SSL ``` 2. To configure TLS encryption truststore settings, set the truststore configuration parameters. In this tutorial, the Kafka client does not need the keystore because authentication is done using SASL/PLAIN instead of mutual TLS (mTLS). ```bash ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks ssl.truststore.password=test1234 ``` 3. To configure SASL authentication, set the SASL mechanism, which in this tutorial is `PLAIN`. Then configure the JAAS configuration property to describe to connect to the Confluent Server brokers. The properties `username` and `password` are used to configure the user for connections. ```bash sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="client" \ password="client-secret"; ``` Combining the configuration steps above, the Kafka client’s general pattern for enabling TLS/SSL encryption and SASL/PLAIN authentication is to add the following to the Kafka client’s properties file. 
```bash security.protocol=SASL_SSL ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks ssl.truststore.password=test1234 sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="client" \ password="client-secret"; ``` What differs between Kafka clients is the specific [configuration prefix](../kafka/security_prefixes.md#security-prefixes) that precedes each configuration parameter, as described in the sections below. ### Step 1: Configure the inventory file For the full configuration examples, see the [Schema Registry Switchover using Confluent Ansible](https://github.com/confluentinc/cp-ansible/blob/8.1.x/docs/sample_inventories/sr-automation-workflow/sr_switchover_cp_to_cc.yml) sample inventory. ```yaml all: vars: password_encoder_secret: --- [1] schema_registry: --- [2] vars: unified_stream_manager: --- [3] schema_registry_endpoint: --- [4] authentication_type: --- [5] basic_username: --- [6] basic_password: --- [7] schema_exporters: --- [8] - name: --- [9] subjects: --- [10] context_type: --- [11] context: --- [12] config: --- [13] schema_registry_endpoint: authentication_type: basic_username: basic_password: sr_switch_over_exporter_name: --- [14] schema_importers: --- [15] - name: --- [16] subjects: --- [17] config: --- [18] schema_registry_endpoint: authentication_type: basic_username: basic_password: ``` * [1] Required. The secret for enabling schema exporter and importer. For more information, see [password.encoder.secret](https://docs.confluent.io/platform/current/schema-registry/installation/config.html#password-encoder-secret). * [2] The below variables can be specified under the `schema_registry` role or under `all` role. * [3] Required to enable forward sync from Confluent Platform to Confluent Cloud. Contains Confluent Cloud Schema Registry connection details. * [4] Required. The endpoint of the remote Confluent Cloud Schema Registry. * [5] Required. The authentication type of the remote Confluent Cloud Schema Registry. The supported type is `basic`. * [6] Required. The API key of the Confluent Cloud Schema Registry. * [7] Required. The API secret of the Confluent Cloud Schema Registry. * [8] Required. The schema exporter configurations. Only one exporter is allowed. * [9] Required. The name of the schema exporter. Must match `sr_switch_over_exporter_name` ([14]). * [10] Required. The subjects of the schema exporter. To export all subjects, use `:*:` for all subjects in all contexts, or specify patterns: `[":.context:*"]`. * [11] Required. The context type of the schema exporter. Specify how to handle contexts. Supported types are `AUTO`, `CUSTOM`, `NONE`, and `DEFAULT`. The default value is `AUTO`, whereby the exporter will use an auto-generated context in the destination cluster. The auto-generated context name will be reported in the status. If set to `NONE`, the exporter copies the source schemas as-is. * [12] Required if `context_type` is `CUSTOM`. The context of the schema exporter. * [13] If omitted, Confluent Ansible will use the default values specified in [4], [5], [6], and [7]. * [14] The name of the exporter to use for the switchover workflow. If not specified, the workflow will be an import-only workflow. If specified, the value must match one of `schema_exporters[0].name`. * [15] Required. The schema importers configuration. Only one importer is allowed. * [16] Required. The name of the schema importer. * [17] Required. The subjects of the schema importer. 
To import all subjects, use `:*:` for the default context, or specify patterns: `[":.context:*"]`. * [18] If omitted, Confluent Ansible will use the default values specified in [4], [5], [6], and [7]. ## Setup 1. Clone the [confluentinc/examples](https://github.com/confluentinc/examples) GitHub repository and check out the `latest` branch. ```bash git clone https://github.com/confluentinc/examples cd examples git checkout latest ``` 2. Change directory to the example for Clojure. ```bash cd clients/cloud/clojure/ ``` 3. Create a local file (for example, at `$HOME/.confluent/java.config`) with configuration parameters to connect to your Kafka cluster. Starting with one of the templates below, customize the file with connection information to your cluster. Substitute your values for `{{ BROKER_ENDPOINT }}`, `{{CLUSTER_API_KEY }}`, and `{{ CLUSTER_API_SECRET }}` (see [Configure Confluent Cloud Clients](https://docs.confluent.io/cloud/current/client-apps/config-client.html) for instructions on how to manually find these values, or use the [ccloud-stack utility for Confluent Cloud](/cloud/current/examples/ccloud/docs/ccloud-stack.html) to automatically create them). - Template configuration file for Confluent Cloud ```none # Required connection configs for Kafka producer, consumer, and admin bootstrap.servers={{ BROKER_ENDPOINT }} security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='{{ CLUSTER_API_KEY }}' password='{{ CLUSTER_API_SECRET }}'; sasl.mechanism=PLAIN # Required for correctness in Apache Kafka clients prior to 2.6 client.dns.lookup=use_all_dns_ips # Best practice for higher availability in Apache Kafka clients prior to 3.0 session.timeout.ms=45000 # Best practice for Kafka producer to prevent data loss acks=all ``` - Template configuration file for local host ```none # Kafka bootstrap.servers=localhost:9092 ``` #### Step 3: Create the connector configuration file Create a JSON file that contains the connector configuration properties. The following example shows the required connector properties. ```none { "connector.class": "ActiveMQSource", "name": "ActiveMQSource_0", "kafka.auth.mode": "KAFKA_API_KEY", "kafka.api.key": "", "kafka.api.secret": "", "kafka.topic" : "topic_0", "output.data.format" : "AVRO", "activemq.url" : "tcp://:61616", "activemq.username" : "", "activemq.password" : "", "jms.destination.name" : "", "tasks.max" : "1" } ``` Note the following property definitions: * `"name"`: Sets a name for your new connector. * `"connector.class"`: Identifies the connector plugin name. * `"kafka.auth.mode"`: Identifies the connector authentication mode you want to use. There are two options: `SERVICE_ACCOUNT` or `KAFKA_API_KEY` (the default). To use an API key and secret, specify the configuration properties `kafka.api.key` and `kafka.api.secret`, as shown in the example configuration (above). To use a [service account](service-account.md#s3-cloud-service-account), specify the **Resource ID** in the property `kafka.service.account.id=`. To list the available service account resource IDs, use the following command: ```bash confluent iam service-account list ``` For example: ```bash confluent iam service-account list Id | Resource ID | Name | Description +---------+-------------+-------------------+------------------- 123456 | sa-l1r23m | sa-1 | Service account 1 789101 | sa-l4d56p | sa-2 | Service account 2 ``` * `"kafka.topic"`: The Kafka topic name where you want data sent. 
* `"output.data.format"`: Options are AVRO, JSON, JSON_SR, and PROTOBUF. [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information. * `"activemq.url"`: The URL of the ActiveMQ broker. An ActiveMQ broker URL is similar to `tcp://:61616`. * `"jms.destination.name"`: The name of the JMS destination `queue` or `topic` name to read from. * `"tasks.max"`: Enter the number of [tasks](/platform/current/connect/concepts.html#tasks) in use by the connector. The connector supports multiple tasks. More tasks may improve performance. **Single Message Transforms**: See the [Single Message Transforms (SMT)](single-message-transforms.md#cc-single-message-transforms) documentation for details about adding SMTs using the CLI. See [Configuration Properties](#cc-activemq-source-config-properties) for all property values and definitions. #### Step 3: Create the connector configuration file Create a JSON file that contains the connector configuration properties. The following entry shows a typical connector configuration. When launched, the connector consumes data from streams `stream-1` and `stream-2` of log group `cloudwatch-group`. It produces the data to Kafka topic `logs.cloudwatch-group.stream-1` and topic `logs.cloudwatch-group.stream-2`. ```json { "name": "CloudWatchLogsSourceConnector_0", "config": { "connector.class": "CloudWatchLogsSource", "name": "CloudWatchLogsSourceConnector_0", "kafka.auth.mode": "KAFKA_API_KEY", "kafka.api.key": "", "kafka.api.secret": "", "kafka.topic.format": "logs.${log-group}.${log-stream}", "output.data.format": "STRING", "aws.access.key.id": "", "aws.secret.access.key": "", "aws.cloudwatch.logs.url": "https://logs.us-east-1.amazonaws.com", "aws.cloudwatch.log.group": "cloudwatch-group", "aws.cloudwatch.log.streams": "stream-1, stream-2", "aws.poll.interval.ms": "1500", "log.message.format": "STRING", "behavior.on.error": "FAIL", "tasks.max": "1" } } ``` Note the following property definitions: * `"connector.class"`: Identifies the connector plugin name. * `"name"`: Sets a name for your new connector. * `"kafka.auth.mode"`: Identifies the connector authentication mode you want to use. There are two options: `SERVICE_ACCOUNT` or `KAFKA_API_KEY` (the default). To use an API key and secret, specify the configuration properties `kafka.api.key` and `kafka.api.secret`, as shown in the example configuration (above). To use a [service account](service-account.md#s3-cloud-service-account), specify the **Resource ID** in the property `kafka.service.account.id=`. To list the available service account resource IDs, use the following command: ```bash confluent iam service-account list ``` For example: ```bash confluent iam service-account list Id | Resource ID | Name | Description +---------+-------------+-------------------+------------------- 123456 | sa-l1r23m | sa-1 | Service account 1 789101 | sa-l4d56p | sa-2 | Service account 2 ``` * `"kafka.topic.format"`: Topic format to use for generating the names of the Kafka topics. This format string can contain `${log-group}` and `${log-stream}` as a placeholder for the original log group and log stream names. For example, `confluent.${log-group}.${log-stream}` for the log group `log-group-1` and log stream `log-stream-1` maps to the topic name `confluent.log-group-1.log-stream-1`. 
* `"output.data.format"`: Enter an output data format (data going to the Kafka topic): AVRO, STRING, or JSON (schemaless). [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information. * `"aws.access.key.id"` and `"aws.secret.access.key"`: Enter the AWS Access Key ID and Secret. For information about how to set these up, see [Access Keys](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys). * `"aws.cloudwatch.logs.url"`: For example, `https://logs.us-east-1.amazonaws.com`. For additional information, see [Amazon CloudWatch Logs endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/cwl_region.html). * `"aws.cloudwatch.log.group"`: Name of the log group on Amazon CloudWatch where the log streams are contained. * `"aws.cloudwatch.log.streams"`: List of the log streams on Amazon CloudWatch where you want to track log records. If the property is not used, all log streams under the log group are tracked. * `"aws.poll.interval.ms"`: Time in milliseconds (ms) the connector waits between polling the endpoint for updates. The default value is `1000` ms (1 second). * `"log.message.format"`: Specifies the format for log messages received from CloudWatch Log Streams. Valid values for this configuration are `JSON` and `STRING`. The default value is `STRING` * `"behavior.on.error"`: Determines how errors are managed by the connector. It must be set to one of the following: `IGNORE` or `FAIL`. When set to `FAIL`, the connector halts upon encountering an error while processing records. When set to `IGNORE`, the connector continues processing subsequent sets of records despite encountering errors. If a record is malformed, it is directed to the error topic associated with the connector. The default value is `FAIL`. Note: This configuration does not affect the connector’s behavior when log.message.format is set to `STRING`. * `"tasks.max"`: Enter the number of [tasks](/platform/current/connect/concepts.html#tasks) to use with the connector. The connector supports running one or more tasks. The connector can start at one task to support all import data and can scale up to one task per log stream. One task per log stream can raise the performance, up to the greatest number of log streams that Amazon supports (100,000 logs per second or 10 MB per second). **Single Message Transforms**: See the [Single Message Transforms (SMT)](single-message-transforms.md#cc-single-message-transforms) documentation for details about adding SMTs using the CLI. See [Configuration Properties](#cc-amazon-cloudwatch-logs-source-config-properties) for all property values and descriptions. #### Step 3: Create the connector configuration file Create a JSON file that contains the connector configuration properties. The following example shows required and optional connector properties. 
```none { "name": "DynamoDbSinkConnector_0", "config": { "topics": "pageviews", "input.data.format": "AVRO", "connector.class": "DynamoDbSink", "name": "DynamoDbSinkConnector_0", "kafka.auth.mode": "KAFKA_API_KEY", "kafka.api.key": "", "kafka.api.secret": "", "aws.access.key.id": "********************", "aws.secret.access.key": "****************************************", "aws.dynamodb.pk.hash": "value.userid", "aws.dynamodb.pk.sort": "value.pageid", "table.name.format": "kafka-${topic}", "tasks.max": "1" } } ``` Note the following property definitions: * `"name"`: Sets a name for your new connector. * `"connector.class"`: Identifies the connector plugin name. * `"topics"`: Identifies the topic name or a comma-separated list of topic names. * `"kafka.auth.mode"`: Identifies the connector authentication mode you want to use. There are two options: `SERVICE_ACCOUNT` or `KAFKA_API_KEY` (the default). To use an API key and secret, specify the configuration properties `kafka.api.key` and `kafka.api.secret`, as shown in the example configuration (above). To use a [service account](service-account.md#s3-cloud-service-account), specify the **Resource ID** in the property `kafka.service.account.id=`. To list the available service account resource IDs, use the following command: ```bash confluent iam service-account list ``` For example: ```bash confluent iam service-account list Id | Resource ID | Name | Description +---------+-------------+-------------------+------------------- 123456 | sa-l1r23m | sa-1 | Service account 1 789101 | sa-l4d56p | sa-2 | Service account 2 ``` * `"input.data.format"`: Sets the input Kafka record value format (data coming from the Kafka topic). Valid entries are **AVRO**, **JSON_SR**, **PROTOBUF**, or **JSON**. You must have Confluent Cloud Schema Registry configured if using a schema-based message format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). * `"aws.dynamodb.pk.hash"`: Defines how the DynamoDB table hash key is extracted from the records. By default, the Kafka partition number where the record is generated is used as the hash key. The hash key can be created from other record references. See [DynamoDB hash keys and sort keys](#cc-amazon-dynamodb-sink-hash-sort) for examples. Note that the maximum size of a partition using the default configuration is limited to 10 GB (defined by Amazon DynamoDB). * `"aws.dynamodb.pk.sort"`: Defines how the DynamoDB table sort key is extracted from the records. By default, the record offset is used as the sort key. If no sort key is required, use an empty string for this property `""`. The sort key can be created from other record references. See [DynamoDB hash keys and sort keys](#cc-amazon-dynamodb-sink-hash-sort) for examples. * `"table.name.format"`: The property is optional and defaults to the name of the Kafka topic. To create a table name format use the syntax `${topic}`. For example, `kafka_${topic}` for the topic `orders` maps to the table name `kafka_orders`. * `"tasks.max"`: Maximum number of tasks the connector can run. See Confluent Cloud [connector limitations](limits.md#cc-amazon-redshift-sink-limits) for additional task information. **Single Message Transforms**: See the [Single Message Transforms (SMT)](single-message-transforms.md#cc-single-message-transforms) documentation for details about adding SMTs using the CLI. See [Unsupported transformations](single-message-transforms.md#cc-single-message-transforms-unsupported-transforms) for a list of SMTs that are not supported with this connector. 
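
With the configuration saved to a file (for example, `dynamodb-sink-config.json`, an illustrative file name), you would typically create the connector from that file with the Confluent CLI; a minimal sketch, assuming a current CLI version that supports `confluent connect cluster create`:

```bash
# Create the connector from the configuration file (file name is illustrative)
confluent connect cluster create --config-file dynamodb-sink-config.json

# List connectors to confirm the new connector reaches the RUNNING state
confluent connect cluster list
```
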
See [Configuration Properties](#cc-amazon-dynamodb-sink-config-properties) for all property values and definitions. #### Step 3: Create the connector configuration file Create a JSON file that contains the connector configuration properties. The following entry shows the required configuration properties. ```json { "name": "SqsSource_0", "config": { "connector.class": "SqsSource", "name": "SqsSource_0", "kafka.auth.mode": "KAFKA_API_KEY", "kafka.api.key": "", "kafka.api.secret": "", "sqs.url": "https://sqs.us-east-2.amazonaws.com/123456789012/MyQueue", "kafka.topic": "stocks", "aws.access.key.id": "", "aws.secret.key.id": "", "output.data.format": "JSON", "tasks.max": "1" } } ``` Note the following property definitions: * `"connector.class"`: Identifies the connector plugin name. * `"name"`: Sets a name for your new connector. * `"kafka.auth.mode"`: Identifies the connector authentication mode you want to use. There are two options: `SERVICE_ACCOUNT` or `KAFKA_API_KEY` (the default). To use an API key and secret, specify the configuration properties `kafka.api.key` and `kafka.api.secret`, as shown in the example configuration (above). To use a [service account](service-account.md#s3-cloud-service-account), specify the **Resource ID** in the property `kafka.service.account.id=`. To list the available service account resource IDs, use the following command: ```bash confluent iam service-account list ``` For example: ```bash confluent iam service-account list Id | Resource ID | Name | Description +---------+-------------+-------------------+------------------- 123456 | sa-l1r23m | sa-1 | Service account 1 789101 | sa-l4d56p | sa-2 | Service account 2 ``` * `"sqs.url"`: For example, `https://sqs.us-east-2.amazonaws.com/123456789012/MyQueue`. For details, see [Amazon SQS queue and message identifiers](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-queue-message-identifiers.html). * `"sqs.region"`: The AWS region that the SQS queue belongs to. If this property is not used, the connector attempts to infer the region from the SQS URL. * `"aws.access.key.id"` and `"aws.secret.key.id"`: Enter the AWS Access Key ID and Secret Key ID. For information about how to set these up, see [Access Keys](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys). * `"output.data.format"`: Enter an output data format (data going to the Kafka topic): AVRO, JSON_SR (JSON Schema), PROTOBUF, or JSON (schemaless). [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information. * `"tasks.max"`: Enter the number of [tasks](/platform/current/connect/concepts.html#tasks) to use with the connector. More tasks may improve performance. **Single Message Transforms**: See the [Single Message Transforms (SMT)](single-message-transforms.md#cc-single-message-transforms) documentation for details about adding SMTs using the CLI. See [Configuration Properties](#cc-amazon-sqs-source-config-properties) for all property values and descriptions. ## Quick Start Use this quick start to get up and running with the Confluent Cloud AWS Lambda Sink connector. The quick start provides the basics of selecting the connector and configuring it to send records to AWS Lambda. 
Prerequisites : * Authorized access to a [Confluent Cloud](https://www.confluent.io/confluent-cloud/) cluster on AWS. Confluent Cloud is available through the [AWS Marketplace](https://aws.amazon.com/marketplace/pp/prodview-g5ujul6iovvcy?trk=14575e70-1766-4f20-8083-0c2757a1ec75&sc_channel=el) or [directly from Confluent](https://www.confluent.io/get-started/). * The Confluent CLI installed and configured for the cluster. See [Install the Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html). * [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information. #### NOTE If no schema is defined, values are encoded as plain strings. For example, `"name": "Kimberley Human"` is encoded as `name=Kimberley Human`. * For networking considerations, see [Networking and DNS](overview.md#connect-internet-access-resources). To use a set of public egress IP addresses, see [Public Egress IP Addresses for Confluent Cloud Connectors](static-egress-ip.md#cc-static-egress-ips). * Your AWS Lambda project should be in the same region as the Confluent Cloud cluster where you are running the connector. * An AWS account configured with [Access Keys](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys). * You need to configure a Lambda IAM policy for the account to allow the following: * `lambda:InvokeFunction` and `lambda:GetFunction`. * Add a resource to allow invoking all aliases and versions of the function, including `$LATEST`. When you specify a function name without a version or alias suffix, all underlying versions, aliases, and `$LATEST` are implicitly included and accessible. The following shows a JSON example for setting this policy: ```json { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "lambda:InvokeFunction", "lambda:GetFunction" ], "Resource": [ "arn:aws:lambda:*:*:function:" ] } ] } ``` #### NOTE If you want to restrict the connector to a particular alias or version, update the permission policy with the alias or version appended at the end, as shown below: ```none arn:aws:lambda:*:*:function:functionName:alias OR arn:aws:lambda:*:*:function:functionName:1 ``` - Kafka cluster credentials. The following lists the different ways you can provide credentials. - Enter an existing [service account](service-account.md#s3-cloud-service-account) resource ID. - Create a Confluent Cloud [service account](service-account.md#s3-cloud-service-account) for the connector. Make sure to review the ACL entries required in the [service account documentation](service-account.md#s3-cloud-service-account). Some connectors have specific ACL requirements. - Create a Confluent Cloud API key and secret. To create a key and secret, you can use [confluent api-key create](https://docs.confluent.io/confluent-cli/current/command-reference/api-key/confluent_api-key_create.html) *or* you can autogenerate the API key and secret directly in the Cloud Console when setting up the connector. #### NOTE The following steps show basic ACL entries for sink connector service accounts. 
Be sure to review the [Sink connector SUCCESS and ERROR topics](#cloud-service-account-sink-additional-acls) and [Sink connector offset management](#cloud-service-account-sink-offset-management-acls) sections for additional ACL entries that may be required for certain connectors or tasks. 1. Create a service account named `myserviceaccount`: ```none confluent iam service-account create myserviceaccount --description "test service account" ``` 2. Find the service account ID for `myserviceaccount`: ```none confluent iam service-account list ``` 3. Set a DESCRIBE ACL to the cluster. ```none confluent kafka acl create --allow --service-account "" --operations describe --cluster-scope ``` 4. Set a READ ACL to `pageviews`: ```none confluent kafka acl create --allow --service-account "" --operations read --topic pageviews ``` 5. Set a CREATE ACL to the following topic prefix: ```none confluent kafka acl create --allow --service-account "" --operations create --prefix --topic "dlq-lcc-" ``` 6. Set a WRITE ACL to the following topic prefix: ```none confluent kafka acl create --allow --service-account "" --operations write --prefix --topic "dlq-lcc-" ``` 7. Set a READ ACL to a consumer group with the following prefix: ```none confluent kafka acl create --allow --service-account "" --operations read --prefix --consumer-group "connect-lcc-" ``` 8. Create a Kafka API key and secret for ``: ```none confluent api-key create --resource "lkc-abcd123" --service-account "" ``` 9. Save the API key and secret. The connector configuration must include either an API key and secret or a service account ID. For additional service account information, see [Service Accounts on Confluent Cloud](../security/authenticate/workload-identities/service-accounts/overview.md#service-accounts). ## Create a Confluent Cloud configuration file 1. Create a customized Confluent Cloud configuration file with key=value pairs of connection details for the Confluent Cloud cluster using the format shown in this example, and save as `/tmp/myconfig.properties`. Note: you cannot use the `~/.ccloud/config.json` generated by Confluent Cloud CLI for other Confluent Platform components or clients, which is why you need to manually create your own key=value properties file. ```bash bootstrap.servers= ssl.endpoint.identification.algorithm=https security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='' password=''; ``` 2. Substitute ``, ``, and `` in the file above, to point to your Confluent Cloud cluster using the desired service account’s cluster API key and secret. 3. If you are using Confluent Cloud Schema Registry, add the following configuration parameters to the same file above. Substitute ``, ``, and `` to point to your Confluent Cloud Schema Registry using the desired service account’s Schema Registry API key and secret (which are different from the cluster API key and secret used earlier). ```bash basic.auth.credentials.source=USER_INFO schema.registry.basic.auth.user.info=: schema.registry.url=https:// ``` 4. If you are using Confluent Cloud ksqlDB, add the following configuration parameters to same file above. Substitute ``, ``, and `` to point to your Confluent Cloud ksqlDB using the desired service account’s ksqlDB API key and secret (which are different from the cluster API key and secret used earlier). ```bash ksql.endpoint= ksql.basic.auth.user.info=: ``` 5. 
Review the `/tmp/myconfig.properties` file, which may resemble the following (with required substitutions): ```bash bootstrap.servers= ssl.endpoint.identification.algorithm=https security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='' password=''; basic.auth.credentials.source=USER_INFO schema.registry.basic.auth.user.info=: schema.registry.url=https:// ksql.endpoint= ksql.basic.auth.user.info=: ``` ### Deploy resources In this section, you add resources to your Terraform configuration file and provision them when the GitHub Action runs. 1. In your repository, create a new file named “variables.tf” with the following code. ```terraform variable "confluent_cloud_api_key" { description = "Confluent Cloud API Key" type = string } variable "confluent_cloud_api_secret" { description = "Confluent Cloud API Secret" type = string sensitive = true } ``` 2. In the “main.tf” file, add the following code. This code references the Cloud API key and secret you added in the previous steps and creates a new environment and Kafka cluster for your organization. Optionally, you can choose to use an existing environment. ```terraform locals { cloud = "AWS" region = "us-east-2" } provider "confluent" { cloud_api_key = var.confluent_cloud_api_key cloud_api_secret = var.confluent_cloud_api_secret } # Create a new environment. resource "confluent_environment" "my_env" { display_name = "my_env" stream_governance { package = "ESSENTIALS" } } # Create a new Kafka cluster. resource "confluent_kafka_cluster" "my_kafka_cluster" { display_name = "my_kafka_cluster" availability = "SINGLE_ZONE" cloud = local.cloud region = local.region basic {} environment { id = confluent_environment.my_env.id } depends_on = [ confluent_environment.my_env ] } # Access the Stream Governance Essentials package attached to the environment. data "confluent_schema_registry_cluster" "my_sr_cluster" { environment { id = confluent_environment.my_env.id } } ``` 3. Create a Service Account and provide a role binding by adding the following code to “main.tf”. The role binding gives the Service Account the necessary permissions to create topics, Flink statements, and other resources. In production, you may want to assign a less privileged role than OrganizationAdmin. ```terraform # Create a new Service Account. This will be used during Kafka API key creation and Flink SQL statement submission. resource "confluent_service_account" "my_service_account" { display_name = "my_service_account" } data "confluent_organization" "my_org" {} # Assign the OrganizationAdmin role binding to the above Service Account. # This will give the Service Account the necessary permissions to create topics, Flink statements, etc. # In production, you may want to assign a less privileged role. resource "confluent_role_binding" "my_org_admin_role_binding" { principal = "User:${confluent_service_account.my_service_account.id}" role_name = "OrganizationAdmin" crn_pattern = data.confluent_organization.my_org.resource_name depends_on = [ confluent_service_account.my_service_account ] } ``` 4. Push all changes to your repository and check the **Actions** page to ensure the workflow runs successfully. At this point, you should have a new environment, an Apache Kafka® cluster, and a Stream Governance package provisioned in your Confluent Cloud organization. 
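Before pushing, you can optionally validate the same Terraform configuration locally. This is a sketch under the assumption that the Terraform CLI is installed on your machine; the GitHub Action remains responsible for provisioning. Terraform reads the two variables from environment variables with the `TF_VAR_` prefix, matching the names declared in “variables.tf”. 
```bash
# Optional local check of the Terraform configuration before pushing.
# TF_VAR_<name> maps to variable "<name>" in variables.tf.
export TF_VAR_confluent_cloud_api_key="<cloud-api-key>"
export TF_VAR_confluent_cloud_api_secret="<cloud-api-secret>"

terraform init       # downloads the Confluent Terraform provider
terraform validate   # checks syntax and references in main.tf and variables.tf
terraform plan       # previews the environment, cluster, and role binding changes
```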
## Confluent Platform properties files The following list includes the default Confluent Platform services configuration properties files, where `$CONFLUENT_HOME` is the directory where you installed Confluent Platform. You reference or modify the appropriate file when you work with a Confluent Platform service. - Connect: `$CONFLUENT_HOME/etc/schema-registry/connect-avro-distributed.properties` - Control Center: `$C3_HOME/etc/confluent-control-center/control-center-dev.properties` [1](#f1) - KRaft Controller: `$CONFLUENT_HOME/etc/kafka/controller.properties` - Kafka (KRaft mode): `$CONFLUENT_HOME/etc/kafka/broker.properties` - Kafka (ZooKeeper mode, Legacy): `$CONFLUENT_HOME/etc/kafka/server.properties` - REST Proxy: `$CONFLUENT_HOME/etc/kafka-rest/kafka-rest.properties` - ksqlDB: `$CONFLUENT_HOME/etc/ksqldb/ksql-server.properties` - Schema Registry: `$CONFLUENT_HOME/etc/schema-registry/schema-registry.properties` - ZooKeeper: `$CONFLUENT_HOME/etc/kafka/zookeeper.properties` * **[1]** Starting with Confluent Platform 8.0, Control Center is provided in an independent release, as described in [Control Center single-node manual installation](/control-center/current/installation/overview.html#single-node-manual-installation), and the Control Center examples in this tutorial: [Getting Started with a multi-broker cluster](/platform/current/get-started/tutorial-multi-broker.html#optional-install-and-configure-c3). In previous versions of Confluent Platform, this path was `$CONFLUENT_HOME/etc/confluent-control-center/control-center-dev.properties`. ### confluent.controlcenter.kafka..cprest.url Defines the REST endpoints for any additional Kafka clusters being monitored by Control Center to enable HTTP servers on the broker(s). Replace `` with the name that identifies this cluster. This name should be consistent with the Kafka cluster name used for other Control Center configurations. A comma-separated list with multiple values can be provided for a multi-broker cluster. Note that if the REST API endpoints are secured with TLS, you must include additional properties in the Confluent Control Center properties file that provide the security information. For more information, see [Configure TLS for Control Center as a server](../security/ssl.md#controlcenter-ui-https) and [TLS settings for web access](#https-settings). The following example shows REST endpoint settings for three clusters or data centers (dc1, dc2, and dc3): ```bash confluent.controlcenter.streams.cprest.url=https://dc1:8090 confluent.controlcenter.kafka.dc2.cprest.url=https://dc2:8090 confluent.controlcenter.kafka.dc3.cprest.url=https://dc3:8090 ``` * Type: list * Default: “” * Importance: high For an example of configuring the Control Center `cprest.url` specifically for multiple clusters, see [Enabling Multi-Cluster Schema Registry](/platform/current/control-center/topics/schema.html#multi-cluster-sr). ## Multi-node manual installation Use these steps for multi-node manual installation of Control Center and Confluent Platform. 1. Provision a new node using any of the Confluent Platform supported operating systems. For more information, see [Supported operating systems](/platform/current/installation/versions-interoperability.html#operating-systems). Login to the VM on which you will install Confluent Platform. Install Control Center on a new node/VM. To ensure a smooth transition, allow Control Center (Legacy) users to continue using Control Center (Legacy) until the Control Center has gathered 7-15 days of historical metrics. 
For more information, see [Migration](#install-c3-migration). 2. Log in to the VM and install Control Center. For more information, see [Compatibility with Confluent Platform](system-requirements.md#install-c3-supported-cp). Use the instructions for installing Confluent Platform but make sure to use the base URL and properties from these instructions to install Control Center. For more information, see [Confluent Platform System Requirements](/platform/current/installation/system-requirements.html#system-requirements), [Install Confluent Platform using Systemd on Ubuntu and Debian](/platform/current/installation/installing_cp/deb-ubuntu.html#systemd-ubuntu-debian-install), and [Install Confluent Platform using Systemd on RHEL, CentOS, and Fedora-based Linux](/platform/current/installation/installing_cp/rhel-centos.html#systemd-rhel-centos-install). Ubuntu and Debian ```bash export BASE_URL=https://packages.confluent.io/confluent-control-center-next-gen/deb/ sudo apt-get update wget ${BASE_URL}archive.key sudo apt-key add archive.key sudo add-apt-repository -y "deb ${BASE_URL} stable main" sudo apt update ``` ```bash sudo apt install -y confluent-control-center-next-gen ``` RHEL, CentOS, and Fedora-based Linux ```bash export base_url=https://packages.confluent.io/confluent-control-center-next-gen/rpm/ cat <<EOF | sudo tee /etc/yum.repos.d/confluent.repo > /dev/null [Confluent] name=Confluent repository baseurl=${base_url} gpgcheck=1 gpgkey=${base_url}archive.key enabled=1 EOF ``` ```bash sudo yum install -y confluent-control-center-next-gen cyrus-sasl openssl-devel ``` 3. Install Java for your operating system (if not installed). ```bash sudo yum install java-17-openjdk -y # RHEL/CentOS/Fedora ``` ```bash sudo apt install openjdk-17-jdk -y # Ubuntu/Debian ``` 4. Copy `/etc/confluent-control-center/control-center-production.properties` from your current Control Center (Legacy) into the Control Center node on the VM and add these properties: ```bash confluent.controlcenter.id=10 confluent.controlcenter.prometheus.enable=true confluent.controlcenter.prometheus.url=http://localhost:9090 confluent.controlcenter.prometheus.rules.file=/etc/confluent-control-center/trigger_rules-generated.yml confluent.controlcenter.alertmanager.config.file=/etc/confluent-control-center/alertmanager-generated.yml ``` 5. If you are using SSL, copy the certs at `/var/ssl/private` from your current Control Center (Legacy) into the Control Center node on the VM. If you are not using SSL, skip this step. 6. Change ownership of the configuration files. Give the Control Center process write permissions to the alert manager, so that the process can properly manage alert triggers. Use the `chown` command to set the Control Center process as the owner of the `trigger_rules-generated.yml` and `alertmanager-generated.yml` files. ```bash chown -c cp-control-center /etc/confluent-control-center/trigger_rules-generated.yml chown -c cp-control-center /etc/confluent-control-center/alertmanager-generated.yml ``` 7. Start the following services on the Control Center node: ```bash systemctl enable prometheus systemctl start prometheus systemctl enable alertmanager systemctl start alertmanager systemctl enable confluent-control-center systemctl start confluent-control-center ``` 8. Log in to each broker you intend to monitor and verify brokers can reach the Control Center node on port 9090. ```bash curl http://:9090/-/healthy ``` All brokers must have access to the Control Center node on port 9090, but port 9090 does not require public access. Restrict access as you prefer. 9. 
Update the following properties for every Kafka broker and KRaft controller. Pay attention to the notes on the highlighted lines that follow the code example. KRaft controller properties are located here: `/etc/controller/server.properties` ```bash metric.reporters=io.confluent.telemetry.reporter.TelemetryReporter,io.confluent.metrics.reporter.ConfluentMetricsReporter --- [1] confluent.telemetry.exporter._c3.type=http confluent.telemetry.exporter._c3.enabled=true confluent.telemetry.exporter._c3.metrics.include=io.confluent.kafka.server.request.(?!.*delta).*|io.confluent.kafka.server.server.broker.state|io.confluent.kafka.server.replica.manager.leader.count|io.confluent.kafka.server.request.queue.size|io.confluent.kafka.server.broker.topic.failed.produce.requests.rate.1.min|io.confluent.kafka.server.tier.archiver.total.lag|io.confluent.kafka.server.request.total.time.ms.p99|io.confluent.kafka.server.broker.topic.failed.fetch.requests.rate.1.min|io.confluent.kafka.server.broker.topic.total.fetch.requests.rate.1.min|io.confluent.kafka.server.partition.caught.up.replicas.count|io.confluent.kafka.server.partition.observer.replicas.count|io.confluent.kafka.server.tier.tasks.num.partitions.in.error|io.confluent.kafka.server.broker.topic.bytes.out.rate.1.min|io.confluent.kafka.server.request.total.time.ms.p95|io.confluent.kafka.server.controller.active.controller.count|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.total|io.confluent.kafka.server.request.total.time.ms.p999|io.confluent.kafka.server.controller.active.broker.count|io.confluent.kafka.server.request.handler.pool.request.handler.avg.idle.percent.rate.1.min|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.rate.1.min|io.confluent.kafka.server.controller.unclean.leader.elections.rate.1.min|io.confluent.kafka.server.replica.manager.partition.count|io.confluent.kafka.server.controller.unclean.leader.elections.total|io.confluent.kafka.server.partition.replicas.count|io.confluent.kafka.server.broker.topic.total.produce.requests.rate.1.min|io.confluent.kafka.server.controller.offline.partitions.count|io.confluent.kafka.server.socket.server.network.processor.avg.idle.percent|io.confluent.kafka.server.partition.under.replicated|io.confluent.kafka.server.log.log.start.offset|io.confluent.kafka.server.log.tier.size|io.confluent.kafka.server.log.size|io.confluent.kafka.server.tier.fetcher.bytes.fetched.total|io.confluent.kafka.server.request.total.time.ms.p50|io.confluent.kafka.server.tenant.consumer.lag.offsets|io.confluent.kafka.server.session.expire.listener.zookeeper.expires.rate.1.min|io.confluent.kafka.server.log.log.end.offset|io.confluent.kafka.server.broker.topic.bytes.in.rate.1.min|io.confluent.kafka.server.partition.under.min.isr|io.confluent.kafka.server.partition.in.sync.replicas.count|io.confluent.telemetry.http.exporter.batches.dropped|io.confluent.telemetry.http.exporter.items.total|io.confluent.telemetry.http.exporter.items.succeeded|io.confluent.telemetry.http.exporter.send.time.total.millis|io.confluent.kafka.server.controller.leader.election.rate.(?!.*delta).*|io.confluent.telemetry.http.exporter.batches.failed confluent.telemetry.exporter._c3.client.base.url=http://c3-internal-dns-hostname:9090/api/v1/otlp --- [2] confluent.telemetry.exporter._c3.client.compression=gzip confluent.telemetry.exporter._c3.api.key=dummy confluent.telemetry.exporter._c3.api.secret=dummy confluent.telemetry.exporter._c3.buffer.pending.batches.max=80 --- [3] 
confluent.telemetry.exporter._c3.buffer.batch.items.max=4000 --- [4] confluent.telemetry.exporter._c3.buffer.inflight.submissions.max=10 --- [5] confluent.telemetry.metrics.collector.interval.ms=60000 --- [6] confluent.telemetry.remoteconfig._confluent.enabled=false confluent.consumer.lag.emitter.enabled=true ``` - [1] To enable metrics for both Control Center (Legacy) and Control Center, update your existing Control Center (Legacy) property `metric.reporters` to use the following values: ```bash metric.reporters=io.confluent.telemetry.reporter.TelemetryReporter,io.confluent.metrics.reporter.ConfluentMetricsReporter ``` If you decommission Control Center (Legacy), enable only the TelemetryReporter plugin with the following value: ```bash metric.reporters=io.confluent.telemetry.reporter.TelemetryReporter ``` - [2] Ensure the URL in `confluent.telemetry.exporter._c3.client.base.url` is the actual Control Center URL, reachable from the broker host. ```bash confluent.telemetry.exporter._c3.client.base.url=http://c3-internal-dns-hostname:9090/api/v1/otlp ``` - [3] [4] [5] [6] Use the following configurations for clusters with 100,000 or fewer replicas. To get an accurate count of replicas, use the sum of all replicas across all clusters monitored in Control Center (Legacy) (including the Control Center (Legacy) bootstrap cluster). ```bash confluent.telemetry.exporter._c3.buffer.pending.batches.max=80 confluent.telemetry.exporter._c3.buffer.batch.items.max=4000 confluent.telemetry.exporter._c3.buffer.inflight.submissions.max=10 confluent.telemetry.metrics.collector.interval.ms=60000 ```
Configurations for clusters with 100,000 to 400,000 replicas Clusters with a replica count of 100,000 - 200,000: ```bash confluent.telemetry.exporter._c3.buffer.pending.batches.max=80 confluent.telemetry.exporter._c3.buffer.batch.items.max=4000 confluent.telemetry.exporter._c3.buffer.inflight.submissions.max=20 confluent.telemetry.metrics.collector.interval.ms=60000 ``` Clusters with a replica count of 200,000 - 400,000: ```bash confluent.telemetry.exporter._c3.buffer.pending.batches.max=80 confluent.telemetry.exporter._c3.buffer.batch.items.max=4000 confluent.telemetry.exporter._c3.buffer.inflight.submissions.max=20 confluent.telemetry.metrics.collector.interval.ms=120000 ``` For clusters with a replica count of 200,000 - 400,000, also update the following Control Center (Legacy) configuration: ```bash confluent.controlcenter.prometheus.trigger.threshold.time=2m ```
10. Perform a rolling restart for the brokers (zero downtime). For more information, see [Rolling restart](/platform/current/kafka/post-deployment.html#rolling-restart). ```bash systemctl restart confluent-server ``` 11. (Optional) Set up log rotation for Prometheus and Alertmanager. ### Prometheus 1. Create a new configuration file at `/etc/logrotate.d/prometheus` with the following content: ```bash /var/log/confluent/control-center/prometheus.log { size 10MB rotate 5 compress delaycompress missingok notifempty copytruncate } ``` 2. Create a script at `/usr/local/bin/logrotate-prometheus.sh`: ```bash #!/bin/bash /usr/sbin/logrotate -s /var/lib/logrotate/status-prometheus /etc/logrotate.d/prometheus ``` 3. Make the script executable: ```bash chmod +x /usr/local/bin/logrotate-prometheus.sh ``` 4. To schedule with Cron, add the following line to your crontab (crontab -e): ```bash */10 * * * * /usr/local/bin/logrotate-prometheus.sh >> /tmp/prometheus-rotate.log 2>&1 ``` 5. Restart Prometheus: ```bash systemctl restart prometheus ``` 6. Perform similar steps for Alertmanager logs. ### Alertmanager 1. Create a new configuration file at `/etc/logrotate.d/alertmanager` with the following content: ```bash /var/log/confluent/control-center/alertmanager.log { size 10MB rotate 5 compress delaycompress missingok notifempty copytruncate } ``` 2. Create a script at `/usr/local/bin/logrotate-alertmanager.sh`: ```bash #!/bin/bash /usr/sbin/logrotate -s /var/lib/logrotate/status-alertmanager /etc/logrotate.d/alertmanager ``` 3. Make the script executable: ```bash chmod +x /usr/local/bin/logrotate-alertmanager.sh ``` 4. To schedule with Cron, add the following line to your crontab (crontab -e): ```bash */10 * * * * /usr/local/bin/logrotate-alertmanager.sh >> /tmp/alertmanager-rotate.log 2>&1 ``` 5. Restart Alertmanager: ```bash systemctl restart alertmanager ``` # Configure RBAC for Control Center on Confluent Platform Control Center supports [Use Role-Based Access Control (RBAC) for Authorization in Confluent Platform](/platform/current/security/authorization/rbac/overview.html#rbac-overview). In Confluent Platform version 5.3 and later, RBAC provides a fine-grained security model across the platform in a development environment. Prior versions of Control Center only provided coarse-grained access control of either read-only or full access. If RBAC is not enabled, or Control Center is running against Confluent Platform versions prior to 5.3, Control Center functions as it has before without restricted access (unless access control feature flags have been turned off in the `control-center-properties` files). When RBAC is not enabled, Access Control settings (referred to as feature flags) in Control Center configuration options can remove access for the features that have those flags, such as ksqlDB, License Manager, Schema Registry, topic inspections, broker configurations, and more. For more information on those available settings, see [Access control settings](../installation/configuration.md#controlcenter-access-control-settings). If RBAC is enabled, [Features](../installation/configuration.md#controlcenter-access-control-settings) are superseded by RBAC role permissions. RBAC works in conjunction with [ACLs](/platform/current/security/authorization/acls/overview.html#kafka-authorization) and [LDAP](c3-auth-ldap.md#controlcenter-security-ldap) security. 
In general, RBAC in Control Center enforces access for only a few resources that it manages; typically those for which it keeps internal state (license management, broker metrics, and alerts). See [Control Center resource access by role](#c3-feature-access-by-role) for more details. The remainder of RBAC-enforced operations on resources managed by Control Center are delegated downstream to Apache Kafka®, Schema Registry, Connect, and ksqlDB. ### Recommended Confluent Platform RBAC reading Review the following documentation to gain a thorough understanding of the RBAC feature in Confluent Platform: - [Use Role-Based Access Control (RBAC) for Authorization in Confluent Platform](/platform/current/security/authorization/rbac/overview.html#rbac-overview) - [Configure Metadata Service (MDS) in Confluent Platform](/platform/current/kafka/configure-mds/index.html#rbac-mds-config) - [Use Predefined RBAC Roles in Confluent Platform](/platform/current/security/authorization/rbac/rbac-predefined-roles.html#rbac-predefined-roles) - [RBAC role use cases](/platform/current/security/authorization/rbac/rbac-predefined-roles.html#rbac-roles-use-cases) - [Role-Based Access Control for Confluent Platform Quick Start](/platform/current/security/authorization/rbac/rbac-cli-quickstart.html#rbac-cli-quickstart) - [Configure Role-Based Access Control for Schema Registry in Confluent Platform](/platform/current/schema-registry/security/rbac-schema-registry.html#schemaregistry-rbac) - [Deploy Secure ksqlDB with RBAC in Confluent Platform](/platform/current/security/authorization/rbac/ksql-rbac.html#ksql-rbac) - [Configure RBAC for a Connect Cluster](/platform/current/connect/rbac/connect-rbac-connect-cluster.html#connect-rbac-connect-cluster) ### Initialization The Producer is configured using a dictionary in the examples below. If you are running Kafka locally, you can initialize the Producer as shown below. ```python from confluent_kafka import Producer import socket conf = {'bootstrap.servers': 'host1:9092,host2:9092', 'client.id': socket.gethostname()} producer = Producer(conf) ``` If you are connecting to a Kafka cluster in Confluent Cloud, you need to provide credentials for access. The example below shows using a cluster API key and secret. ```python from confluent_kafka import Producer import socket conf = {'bootstrap.servers': 'pkc-abcd85.us-west-2.aws.confluent.cloud:9092', 'security.protocol': 'SASL_SSL', 'sasl.mechanism': 'PLAIN', 'sasl.username': '', 'sasl.password': '', 'client.id': socket.gethostname()} producer = Producer(conf) ``` * For information on the available configuration properties, refer to the [API Documentation](/platform/current/clients/confluent-kafka-python/html/index.html). * For a step-by-step tutorial using the Python client, including code samples for the producer and consumer, see [this guide](https://developer.confluent.io/get-started/python/). ### REST-based example 1. Use this setting with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). 
Write the following JSON to config.json, configure all of the required values, and use the following command to post the configuration to one of the distributed connect workers. For more information, see the [Kafka Connect REST Interface](/platform/current/connect/references/restapi.html). ```json { "name" : "AzureBlobStorageSourceConnector", "config" : { "connector.class" : "io.confluent.connect.azure.blob.storage.AzureBlobStorageSourceConnector", "tasks.max" : "1", "azblob.account.name" : "your-account", "azblob.account.key" : "your-key", "azblob.container.name" : "confluent-kafka-connect-azBlobStorage-testing", "format.class" : "io.confluent.connect.azure.blob.storage.format.avro.AvroFormat", "confluent.topic.bootstrap.servers" : "localhost:9092", "confluent.topic.replication.factor" : "1", "transforms" : "AddPrefix", "transforms.AddPrefix.type" : "org.apache.kafka.connect.transforms.RegexRouter", "transforms.AddPrefix.regex" : ".*", "transforms.AddPrefix.replacement" : "copy_of_$0" } } ``` #### NOTE Change the `confluent.topic.bootstrap.servers` property to include your broker address(es), and change the `confluent.topic.replication.factor` to 3 for staging or production use. 2. Use curl to post a configuration to one of the Kafka Connect Workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect worker(s). ```bash curl -s -X POST -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors ``` 3. Use the following command to update the configuration of an existing connector. ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors/AzureBlobStorageSourceConnector/config ``` 4. To consume records written by the connector to the configured Kafka topic, run the following command: ```bash kafka-avro-console-consumer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic copy_of_blob_topic --from-beginning ``` ## Quick Start This quick start uses the Azure Cognitive Search Sink connector to consume records and write them as documents to an Azure Cognitive Search service. Prerequisites : - [Confluent Platform](/platform/current/installation/index.html) - [Confluent CLI](https://docs.confluent.io/confluent-cli/current/installing.html) (requires separate installation) 1. Before starting the connector, create and deploy an Azure Cognitive Search service. * Navigate to the Microsoft [Azure Portal](https://portal.azure.com/). * Create a Search service following this [Azure Cognitive Search quick start guide](https://docs.microsoft.com/en-us/azure/search/search-create-service-portal). * Create an index in the service following this [index quick start guide](https://docs.microsoft.com/en-us/azure/search/search-get-started-portal). * Copy the admin key and the Search service name from the portal and save them for later. Azure Cognitive Search should now be set up for the connector. #### NOTE Ensure the index has the default name `hotels-sample-index` and only has the fields `HotelId`, `HotelName`, and `Description`. All other fields should be deleted. 2. Install the connector through the [Confluent Hub Client](/kafka-connectors/self-managed/confluent-hub/client.html). ```bash # run from your CP installation directory confluent connect plugin install confluentinc/kafka-connect-azure-search:latest ``` 3. Start Confluent Platform using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) commands. 
```bash confluent local start ``` 4. Produce test data to the `hotels-sample` topic in Kafka. Start the Avro console producer to import a few records to Kafka: ```bash ${CONFLUENT_HOME}/bin/kafka-avro-console-producer --broker-list localhost:9092 --topic hotels-sample \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"HotelName","type":"string"},{"name":"Description","type":"string"}]}' \ --property key.schema='{"type":"string"}' \ --property "parse.key=true" \ --property "key.separator=," ``` Then in the console producer, enter: ```bash "marriotId",{"HotelName": "Marriot", "Description": "Marriot description"} "holidayinnId",{"HotelName": "HolidayInn", "Description": "HolidayInn description"} "motel8Id",{"HotelName": "Motel8", "Description": "motel8 description"} ``` The three records entered are published to the Kafka topic `hotels-sample` in Avro format. 5. Create an `azure-search.json` file with the following contents: ```json { "name": "azure-search", "config": { "topics": "hotels-sample", "tasks.max": "1", "connector.class": "io.confluent.connect.azure.search.AzureSearchSinkConnector", "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "azure.search.service.name": "", "azure.search.api.key": "", "index.name": "${topic}-index", "reporter.bootstrap.servers": "localhost:9092", "reporter.error.topic.name": "test-error", "reporter.error.topic.replication.factor": 1, "reporter.error.topic.key.format": "string", "reporter.error.topic.value.format": "string", "reporter.result.topic.name": "test-result", "reporter.result.topic.key.format": "string", "reporter.result.topic.value.format": "string", "reporter.result.topic.replication.factor": 1 } } ``` #### NOTE For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). 6. Load the Azure Cognitive Search Sink connector. ```bash confluent local load azure-search --config path/to/azure-search.json ``` #### IMPORTANT Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) commands in production environments. 7. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status azure-search ``` 8. Confirm that the messages were delivered to the result topic in Kafka: ```bash confluent local consume test-result --from-beginning ``` 9. Confirm that the messages were delivered to Azure Cognitive Search. 10. Log in to the service and check that the index `hotels-sample-index` contains the three written records from before. 11. Clean up resources: 1. Delete the connector: ```bash confluent local unload azure-search ``` 2. Stop Confluent Platform: ```bash confluent local stop ``` 3. Delete the created Azure Cognitive Search service and its resource group in the Azure portal. ### REST-based example Use this setting with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `pubsub-source-source.json`, configure all of the required values, and use the following command to post the configuration to one of the distributed connect workers. 
Check here for more information about the Kafka Connect [REST API](/platform/current/connect/references/restapi.html) ```json { "name" : "pubsub-source", "config" : { "connector.class" : "io.confluent.connect.gcp.pubsub.PubSubSourceConnector", "tasks.max" : "1", "kafka.topic" : "pubsub-topic", "gcp.pubsub.project.id" : "project-1", "gcp.pubsub.topic.id" : "topic-1", "gcp.pubsub.subscription.id" : "subscription-1", "gcp.pubsub.credentials.path" : "/home/some_directory/credentials.json", "confluent.topic.bootstrap.servers" : "localhost:9092", "confluent.topic.replication.factor" : "1" } } ``` Use `curl` to post the configuration to one of the Kafka Connect Workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect worker(s). ```bash curl -s -X POST -H 'Content-Type: application/json' --data @pubsub-source.json http://localhost:8083/connectors ``` Use the following command to update the configuration of existing connector. ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @pubsub-source.json http://localhost:8083/connectors/pubsub-source/config ``` To consume records written by connector to the configured Kafka topic, run the following command: ```bash kafka-avro-console-consumer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic pubsub-topic --from-beginning ``` ### REST-based example 1. Use this setting with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON, which can be used to read all the data list directly under a GCS bucket, to `config.json`, configure all of the required values, and use the following command to post the configuration to one of the distributed connect workers. Check here for more information about the Kafka Connect [REST API](/platform/current/connect/references/restapi.html). ```json { "name": "gcs-source-generalized", "config": { "connector.class": "io.confluent.connect.gcs.GcsSourceConnector", "tasks.max": "1", "value.converter": "org.apache.kafka.connect.json.JsonConverter", "mode": "GENERIC", "topics.dir": " ", "topic.regex.list": "mytopic:.", "format.class": "io.confluent.connect.gcs.format.json.JsonFormat", "gcs.bucket.name": "", "gcs.credentials.path": "", "value.converter.schemas.enable": "false", "confluent.topic.bootstrap.servers" : "localhost:9092", "confluent.topic.replication.factor" : "1", "confluent.license" : " Omit to enable trial mode ", "transforms" : "AddPrefix", "transforms.AddPrefix.type" : "org.apache.kafka.connect.transforms.RegexRouter", "transforms.AddPrefix.regex" : ".", "transforms.AddPrefix.replacement" : "copy_of_$0" } } ``` #### NOTE Change the `confluent.topic.bootstrap.servers` property to include your broker address(es), and change the `confluent.topic.replication.factor` to 3 for staging or production use. 2. Use curl to post a configuration to one of the Kafka Connect Workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect worker(s). ```bash curl -s -X POST -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors ``` 3. Use the following command to update the configuration for an existing connector. ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors/GCSSourceConnector/config ``` 4. 
To consume records written by the connector to the configured Kafka topic, run the following command: ```bash kafka-avro-console-consumer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic copy_of_gcs_topic --from-beginning ``` #### Basic authentication example 1. Run the demo app with the `basic-auth` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=basic-auth ``` If the demo app is already running, you will need to kill that instance (`CTRL + C`) before running a new instance to avoid port conflicts. 2. Create a `http-sink.properties` file with the following contents: ```text name=HttpSinkBasicAuth topics=http-messages tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # connect reporter required bootstrap server reporter.bootstrap.servers=localhost:9092 reporter.result.topic.name=success-responses reporter.result.topic.replication.factor=1 reporter.error.topic.name=error-responses reporter.error.topic.replication.factor=1 # http sink connector configs http.api.url=http://localhost:8080/api/messages auth.type=BASIC connection.user=admin connection.password=password ``` For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). 3. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). #### SSL with basic authentication example 1. Run the demo app with the `ssl-auth` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=ssl-auth ``` 2. Create a `http-sink.properties` file with the following contents: ```text name=SSLHttpSink topics=string-topic tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # connect reporter required bootstrap server reporter.bootstrap.servers=localhost:9092 reporter.result.topic.name=success-responses reporter.result.topic.replication.factor=1 reporter.error.topic.name=error-responses reporter.error.topic.replication.factor=1 # http sink connector configs http.api.url=https://localhost:8443/api/messages # http sink connector SSL config ssl.enabled=true https.ssl.truststore.location=/path/to/http-sink-demo/src/main/resources/localhost-keystore.jks https.ssl.truststore.type=JKS https.ssl.truststore.password=changeit https.ssl.keystore.location=/path/to/http-sink-demo/src/main/resources/localhost-keystore.jks https.ssl.keystore.type=JKS https.ssl.keystore.password=changeit https.ssl.key.password=changeit https.ssl.protocol=TLSv1.2 auth.type=BASIC connection.user=admin connection.password=password ``` 3. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). #### Proxy authentication example This proxy authentication example is dependent on MacOS X 10.6.8 or higher due to the proxy that is utilized. 1. Run the demo app with the `simple-auth` Spring profile. 
```bash mvn spring-boot:run -Dspring.profiles.active=simple-auth ``` 2. Install [Squidman Proxy](https://squidman.net/squidman). 3. In SquidMan, navigate to the **Preferences > General** tab, and set the HTTP port to `3128`. 4. In SquidMan, navigate to the **Preferences > Template** tab, and add the following criteria: ```text auth_param basic program /usr/local/squid/libexec/basic_ncsa_auth /etc/squid/passwords auth_param basic realm proxy acl authenticated proxy_auth REQUIRED http_access allow authenticated ``` 5. Create a credentials file for the proxy. ```bash sudo mkdir /etc/squid sudo htpasswd -c /etc/squid/passwords proxyuser # set password to proxypassword ``` 6. Open the SquidMan application and select `Start Squid`. 7. Create a `http-sink.properties` file with the following contents: ```text name=HttpSinkProxyAuth topics=http-messages tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # connect reporter required bootstrap server reporter.bootstrap.servers=localhost:9092 reporter.result.topic.name=success-responses reporter.result.topic.replication.factor=1 reporter.error.topic.name=error-responses reporter.error.topic.replication.factor=1 # http sink connector configs http.api.url=http://localhost:8080/api/messages http.proxy.host=localhost http.proxy.port=3128 http.proxy.user=proxyuser http.proxy.password=proxypassword ``` 8. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). #### JSON converter example 1. Run the demo app with the `basic-auth` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=basic-auth ``` 2. Create a `http-sink.properties` file with the following contents: ```text name=JsonHttpSink topics=json-topic tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.json.JsonConverter value.converter.schemas.enable=false # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # http sink connector configs http.api.url=http://localhost:8080/api/messages auth.type=BASIC connection.user=admin connection.password=password ``` Note that you should publish JSON messages to the `json-topic` instead of to the String messages shown in the [Quick start](#http-connector-quickstart). 3. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). #### Regex replacement example 1. Run the demo app with the `basic-auth` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=basic-auth ``` 2. 
Create a `http-sink.properties` file with the following contents: ```text name=RegexHttpSink topics=email-topic,non-email-topic tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # connect reporter required bootstrap server reporter.bootstrap.servers=localhost:9092 reporter.result.topic.name=success-responses reporter.result.topic.replication.factor=1 reporter.error.topic.name=error-responses reporter.error.topic.replication.factor=1 # http sink connector configs http.api.url=http://localhost:8080/api/messages auth.type=BASIC connection.user=admin connection.password=password # regex to mask emails regex.patterns=^.+@.+$ regex.replacements=******** ``` 3. Publish messages to the topics that are configured. Emails should be redacted with `********` before being sent to the demo app. ```bash confluent local produce email-topic > example@domain.com > another@email.com confluent local produce non-email-topic > not an email > another normal string ``` 4. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). Note that regex replacement is not supported when the `request.body.format` configuration is set to `JSON`. #### Retries example 1. Run the demo app with the `basic-auth` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=basic-auth ``` 2. Create a `http-sink.properties` file with the following contents: ```text name=RetriesExample topics=http-messages tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # connect reporter required bootstrap server reporter.bootstrap.servers=localhost:9092 reporter.result.topic.name=success-responses reporter.result.topic.replication.factor=1 reporter.error.topic.name=error-responses reporter.error.topic.replication.factor=1 # http sink connector configs http.api.url=http://localhost:8080/api/messages auth.type=BASIC connection.user=admin connection.password=password behavior.on.null.values=delete # retry configurations max.retries=20 retry.backoff.ms=5000 ``` 3. Publish messages that have keys and values to the topic. ```bash confluent local produce http-messages --property parse.key=true --property key.separator=, > 1,message-value > 2,another-message ``` 4. Stop the demo app. 5. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). 6. The connector retries up to 20 times, with an initial backoff of 5000 ms. If an HTTP operation succeeds, retrying stops. Because the demo app is stopped, the connector exhausts all 20 retries and the connector task fails. 7. The default value for `max.retries` is 10 and for `retry.backoff.ms` is 3000 ms. ### REST-based example In this section, you will complete the steps in a REST-based example. 1. Copy the following JSON object to `influxdb-sink-connector.json` and configure all of the required values. 
This configuration is typically used along with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). ```json { "name" : "InfluxDBSinkConnector", "config" : { "connector.class" : "io.confluent.influxdb.InfluxDBSinkConnector", "tasks.max" : "1", "topics" : "orders", "influxdb.url" : "http://localhost:8086", "influxdb.db" : "influxTestDB", "measurement.name.format" : "${topic}", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081" } } ``` 2. Use the following `curl` command to post the configuration to one of the Kafka Connect workers while changing `http://localhost:8083/` to the endpoint of one of your Kafka Connect workers (for more information, see the Kafka Connect [REST API](/platform/current/connect/references/restapi.html)): ```bash curl -X POST -d @influxdb-sink-connector.json http://localhost:8083/connectors -H "Content-Type: application/json" ``` 3. Create a record in the `orders` topic: ```bash bin/kafka-avro-console-producer \ --broker-list localhost:9092 --topic orders \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"id","type":"int"},{"name":"product", "type": "string"}, {"name":"quantity", "type": "int"}, {"name":"price", "type": "float"}]}' ``` The console producer waits for input. 4. Copy and paste the following record into the terminal: ```bash {"id": 999, "product": "foo", "quantity": 100, "price": 50} ``` 5. Log in to the Docker container using the following command: ```bash docker exec -it bash ``` To find the container ID, use the `docker ps` command. 6. Once you are in the Docker container, log in to InfluxDB shell: ```bash influx ``` Your output should resemble: ```bash Connected to http://localhost:8086 version 1.7.7 InfluxDB shell version: 1.7.7 ``` 7. Run the following query to verify the records: ```bash > USE influxTestDB; Using database influxTestDB > SELECT * FROM orders; name: orders time id price product quantity ---- -- ----- ------- -------- 1567164248415000000 999 50 foo 100 ``` ### Use the JMS Source connector with TIBCO EMS You can use the JMS Source connector with TIBCO EMS and its support for JMS. Note that this is a specialization of the connector that avoids JNDI and instead uses system-specific APIs to establish connections. This is often easier to configure and use in most cases. To get started, you must install the latest TIBCO EMS JMS client libraries into the same directory where this connector is installed. For more details, see the [TIBCO EMS product documentation](https://docs.tibco.com/products/tibco-enterprise-message-service?_ga=2.158352819.868240457.1698346221-814484783.1670445207) Next, you must create a connector configuration for your environment, using the appropriate configuration properties. The following example shows a typical configuration of the connector for use with [distributed mode](/platform/current/connect/concepts.html#distributed-workers). 
```json { "name": "connector1", "config": { "connector.class": "io.confluent.connect.jms.JmsSourceConnector", "kafka.topic": "MyKafkaTopicName", "jms.destination.name": "MyQueueName", "jms.destination.type": "queue", "java.naming.factory.initial": "com.tibco.tibjms.naming.TibjmsInitialContextFactory", "java.naming.provider.url": "tibjmsnaming://:", "confluent.license": "", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.ssl.truststore.location": "omitted", "confluent.topic.ssl.truststore.password": "", "confluent.topic.ssl.keystore.location": "omitted", "confluent.topic.ssl.keystore.password": "", "confluent.topic.ssl.key.password": "", "confluent.topic.security.protocol": "SSL" } } ``` Note that any extra properties defined on the connector will be passed into the JNDI InitialContext. This makes it easy to use any TIBCO EMS-specific settings. Finally, deploy your connector by posting it to a Kafka Connect distributed worker. #### Connector configuration 1. Create your `oracle-cdc-confluent-cloud-json.json` file based on the following example: ```json { "name": "OracleCDC_Confluent_Cloud", "config":{ "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector", "name": "OracleCDC_Confluent_Cloud", "tasks.max":3, "oracle.server": "", "oracle.sid":"", "oracle.pdb.name":"", "oracle.username": "", "oracle.password": "", "start.from":"snapshot", "redo.log.topic.name": "oracle-redo-log-topic", "redo.log.consumer.bootstrap.servers":"", "redo.log.consumer.sasl.jaas.config":"org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "redo.log.consumer.security.protocol":"SASL_SSL", "redo.log.consumer.sasl.mechanism":"PLAIN", "table.inclusion.regex":"", "_table.topic.name.template_":"Using template vars to set change event topic for each table", "table.topic.name.template": "${databaseName}.${schemaName}.${tableName}", "connection.pool.max.size": 20, "confluent.topic.replication.factor":3, "topic.creation.groups":"redo", "topic.creation.redo.include":"oracle-redo-log-topic", "topic.creation.redo.replication.factor":3, "topic.creation.redo.partitions":1, "topic.creation.redo.cleanup.policy":"delete", "topic.creation.redo.retention.ms":1209600000, "topic.creation.default.replication.factor":3, "topic.creation.default.partitions":5, "topic.creation.default.cleanup.policy":"compact", "confluent.topic.bootstrap.servers":"", "confluent.topic.sasl.jaas.config":"org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "confluent.topic.security.protocol":"SASL_SSL", "confluent.topic.sasl.mechanism":"PLAIN", "value.converter": "org.apache.kafka.connect.json.JsonConverter", "value.converter.schemas.enable": "false" } } ``` 2. Create `oracle-redo-log-topic`. Make sure the topic name matches the value you put for `"redo.log.topic.name"`. **Confluent Platform CLI** ```text bin/kafka-topics --create --topic oracle-redo-log-topic \ --bootstrap-server broker:9092 --replication-factor 1 \ --partitions 1 --config cleanup.policy=delete \ --config retention.ms=120960000 ``` **Confluent Cloud CLI** ```text confluent kafka topic create oracle-redo-log-topic \ --partitions 1 --config cleanup.policy=delete \ --config retention.ms=120960000 ``` 3. 
3. Start the Oracle CDC Source connector using the following command:

```text
curl -s -H "Content-Type: application/json" -X POST -d @oracle-cdc-confluent-cloud-json.json http://localhost:8083/connectors/ | jq
```

### Property-based example

Create a configuration file `pagerduty-sink.properties` with the following content. This file should be placed inside the Confluent Platform installation directory. This configuration is typically used along with [standalone workers](/platform/current/connect/concepts.html#standalone-workers).

```text
name=pagerduty-sink-connector
topics=incidents
connector.class=io.confluent.connect.pagerduty.PagerDutySinkConnector
tasks.max=1
pagerduty.api.key=****
behavior.on.error=fail
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
confluent.topic.bootstrap.servers=localhost:9092
confluent.topic.replication.factor=1
confluent.license=
reporter.bootstrap.servers=localhost:9092
reporter.result.topic.replication.factor=1
reporter.error.topic.replication.factor=1
```

### Property-based example

The following steps provide a property-based example.

1. Create a configuration file, `salesforce-bulk-api.properties`. This configuration is typically used along with [standalone workers](/platform/current/connect/concepts.html#standalone-workers).

```properties
name=SalesforceBulkApiSourceConnector
tasks.max=1
connector.class=io.confluent.connect.salesforce.SalesforceBulkApiSourceConnector
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
salesforce.username=< Required Configuration >
salesforce.password=< Required Configuration >
salesforce.password.token=< Required Configuration >
salesforce.object=< Required Configuration >
salesforce.since=< Required Configuration >
kafka.topic=< Required Configuration >
salesforce.instance=< Required Configuration >
confluent.topic.bootstrap.servers=localhost:9092
confluent.topic.replication.factor=1
confluent.license=Omit to enable trial mode
```

2. Ensure the configurations in `salesforce-bulk-api.properties` are properly set.

3. Start the Salesforce Bulk API source connector by loading its configuration with the following command:

```bash
confluent local load salesforce-bulk-api-source -- -d salesforce-bulk-api.properties
{
  "name" : "SalesforceBulkApiSourceConnector",
  "config" : {
    "connector.class" : "io.confluent.connect.salesforce.SalesforceBulkApiSourceConnector",
    "tasks.max" : "1",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081",
    "kafka.topic" : "< Required Configuration >",
    "salesforce.password" : "< Required Configuration >",
    "salesforce.password.token" : "< Required Configuration >",
    "salesforce.object" : "< Required Configuration >",
    "salesforce.username" : "< Required Configuration >",
    "salesforce.since" : "< Required Configuration >",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.topic.replication.factor": "1",
    "confluent.license": ""
  },
  "tasks": []
}
```

4. Verify the connector starts successfully and review the Connect worker's log by entering the following:

```bash
confluent local log connect
```

5. Confirm the connector is in a `RUNNING` state.
```bash
confluent local status SalesforceBulkApiSourceConnector
```

6. Confirm messages are being sent to Kafka.

```bash
kafka-avro-console-consumer \
  --bootstrap-server localhost:9092 \
  --property schema.registry.url=http://localhost:8081 \
  --topic <kafka.topic> \
  --from-beginning | jq '.'
```

## Upsert with SObject Sink Connector

The `upsert` operation can be used when you want to update existing records in Salesforce (located by `external_id`) or, if no matching record exists, insert new records. The following example shows how to `upsert` records for an `Orders` object in Salesforce.

1. Create an `external_id` field in Salesforce.
   1. Click your user name and then click **Setup**.
   2. Under **Build**, click **Customize**, and then select **Orders**.
   3. Click the **Add a custom field to orders** link.
   4. In the **Order Custom Fields and Relationships** section, click **New**.
   5. In the **Data Type** list, select a data type, `Text`, then click **Next**.
   6. Enter the details for the field. For example, Field Label (`extid`), Length, Field Name (`extid`), and Description.
   7. Check the External ID box, then click **Next**.
   8. The external ID (`extid`) is created and appears in the list under **Order Custom Fields and Relationships**.

2. Create a configuration file named `salesforce-sobject-orders-sink-config.json` with the following contents. Make sure to enter a real username, password, security token, consumer key, and consumer secret. Additionally, make sure you put the API name (`extid__c`) for the external ID (`extid`). See [Salesforce SObject Sink Connector Configuration Properties](salesforce_sobject_sink_connector_config.md#salesforce-sobject-sink-connector-config) for more information about these and the other configuration properties.

```none
{
  "name": "upsert-orders",
  "config": {
    "connector.class" : "io.confluent.salesforce.SalesforceSObjectSinkConnector",
    "tasks.max" : "1",
    "topics" : "orders",
    "salesforce.object" : "Order",
    "salesforce.username" : "",
    "salesforce.password" : "",
    "salesforce.password.token" : "",
    "salesforce.consumer.key" : "",
    "salesforce.consumer.secret" : "",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.topic.replication.factor": "1",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": true,
    "behavior.on.api.errors": "fail",
    "reporter.bootstrap.servers": "localhost:9092",
    "reporter.error.topic.name": "error-responses",
    "reporter.error.topic.replication.factor": 1,
    "reporter.result.topic.name": "success-responses",
    "reporter.result.topic.replication.factor": 1,
    "salesforce.sink.object.operation": "upsert",
    "override.event.type": "true",
    "request.max.retries.time.ms": 60000,
    "salesforce.custom.id.field.name": "extid__c",
    "salesforce.use.custom.id.field": true
  }
}
```

3. Enter the Confluent CLI [confluent local services connect connector load](https://docs.confluent.io/confluent-cli/current/command-reference/local/services/connect/connector/confluent_local_services_connect_connector_load.html) command to start the Salesforce SObject sink connector.
```bash
confluent local load upsert-orders --config salesforce-sobject-orders-sink-config.json
```

Your output should resemble:

```none
{
  "name": "upsert-orders",
  "config": {
    "connector.class" : "io.confluent.salesforce.SalesforceSObjectSinkConnector",
    "tasks.max" : "1",
    "topics" : "orders",
    "salesforce.object" : "Order",
    "salesforce.username" : "",
    "salesforce.password" : "",
    "salesforce.password.token" : "",
    "salesforce.consumer.key" : "",
    "salesforce.consumer.secret" : "",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.topic.replication.factor": "1",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": true,
    "behavior.on.api.errors": "fail",
    "reporter.bootstrap.servers": "localhost:9092",
    "reporter.error.topic.name": "error-responses",
    "reporter.error.topic.replication.factor": 1,
    "reporter.result.topic.name": "success-responses",
    "reporter.result.topic.replication.factor": 1,
    "salesforce.sink.object.operation": "upsert",
    "override.event.type": "true",
    "request.max.retries.time.ms": 60000,
    "salesforce.custom.id.field.name": "extid__c",
    "salesforce.use.custom.id.field": true
  },
  "tasks": [ ... ],
  "type": null
}
```

4. To insert an order into Salesforce with a Kafka record, the record must have a valid `AccountId`, `ContractId`, `EffectiveDate`, and `Status`. Create an Account record and a Contract record in Salesforce. The values used in this example are: `"AccountId": "0012L0000176cdVQAQ"`, `"ContractId": "8002L000000ANqwQAG"` and `"EffectiveDate": 1608922098000` (the Epoch timestamp for 12/25/2020 in milliseconds).

```none
kafka-console-producer \
  --broker-list localhost:9092 \
  --topic orders

{"schema":{"type":"struct","fields":[{"type":"string","optional":false,"field":"Id"},{"type":"string","optional":false,"field":"AccountId"},{"type":"string","optional":false,"field":"ContractId"},{"type":"string","optional":false,"field":"Description"},{"type":"string","optional":false,"field":"Status"},{"type": "int64","optional": false,"field": "EffectiveDate"},{"type":"string","optional":false,"field":"_ObjectType"}, {"type":"string","optional":false,"field":"_EventType"}],"optional":false,"name":"myOrder","version":1},"payload": {"Id": "200", "AccountId": "0012L0000176cdVQAQ", "ContractId": "8002L000000ANqwQAG", "Status": "Draft", "EffectiveDate": 1608922098000, "Description":"Order record has been upserted.", "_ObjectType":"Order", "_EventType":"updated"}}
```

5. Log in to Salesforce and verify that the `Order` object exists with the external ID.

![Salesforce screen 1](images/salesforce-sobject-sink-upsert-1.png)

6. Update the description of the order object the connector just created.
```none
kafka-console-producer \
  --broker-list localhost:9092 \
  --topic orders

{"schema":{"type":"struct","fields":[{"type":"string","optional":false,"field":"Id"},{"type":"string","optional":false,"field":"AccountId"},{"type":"string","optional":false,"field":"ContractId"},{"type":"string","optional":false,"field":"Description"},{"type":"string","optional":false,"field":"Status"},{"type": "int64","optional": false,"field": "EffectiveDate"},{"type":"string","optional":false,"field":"_ObjectType"}, {"type":"string","optional":false,"field":"_EventType"}],"optional":false,"name":"myOrder","version":1},"payload": {"Id": "200", "AccountId": "0012L0000176cdVQAQ", "ContractId": "8002L000000ANqwQAG", "Status": "Draft", "EffectiveDate": 1608922098000, "Description":"Order record has been updated.", "_ObjectType":"Order", "_EventType":"updated"}}
```

7. Log in to Salesforce and verify that the `Order` object has been updated with the external ID.

![Salesforce screen 2](images/salesforce-sobject-sink-upsert-2.png)

## Quick Start

In this quick start guide, the Zendesk connector is used to consume records from a Zendesk resource called `tickets` and send the records to a Kafka topic named `ZD_tickets`. To run this quick start, ensure you have a [Zendesk Developer Account](https://developer.zendesk.com/).

1. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html).

```bash
# run from your confluent platform installation directory
confluent connect plugin install confluentinc/kafka-connect-zendesk:latest
```

2. Start the Confluent Platform.

```bash
confluent local start
```

3. Check the status of all services.

```bash
confluent local services status
```

4. Configure your connector by first creating a JSON file named `zendesk.json` with the following properties.

```bash
// substitute <> with your config
{
  "name": "ZendeskConnector",
  "config": {
    "connector.class": "io.confluent.connect.zendesk.ZendeskSourceConnector",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "confluent.topic.bootstrap.servers": "127.0.0.1:9092",
    "confluent.topic.replication.factor": 1,
    "confluent.license": "", // leave it empty for evaluation license
    "tasks.max": 1,
    "poll.interval.ms": 1000,
    "topic.name.pattern": "ZD_${entityName}",
    "zendesk.auth.type": "basic",
    "zendesk.url": "https://<subdomain>.zendesk.com",
    "zendesk.user": "",
    "zendesk.password": "",
    "zendesk.tables": "tickets",
    "zendesk.since": "2019-08-01"
  }
}
```

5. Start the Zendesk Source connector by loading the connector's configuration with the following command:

```bash
confluent local load zendesk --config zendesk.json
```

6. Confirm that the connector is in a `RUNNING` state.

```bash
confluent local status ZendeskConnector
```

7. Create one ticket record using the Zendesk API as follows.

```bash
curl https://{subdomain}.zendesk.com/api/v2/tickets.json \
  -d '{"ticket": {"subject": "My printer is on fire!", "comment": { "body": "The smoke is very colorful." }}}' \
  -H "Content-Type: application/json" -v -u {email_address}:{password} -X POST
```

8. Confirm the messages were delivered to the `ZD_tickets` topic in Kafka. Note that it may take a minute before the record populates the topic.
```bash
confluent local consume ZD_tickets --from-beginning
```

## Annotate Confluent custom resources

Confluent for Kubernetes (CFK) provides a set of public annotations that you can use to modify certain workflows or the state of Confluent Platform components. The annotations are applied to Confluent Platform custom resources (CRs).

platform.confluent.io/force-reconcile
: Triggers a reconcile cycle of the cluster. Once the reconcile cycle is complete, the annotation value gets reset to `false`.
  * Supported values: `true`, `false`
  * Default value: `false`
  * CR types applied to: All CRs

platform.confluent.io/block-reconcile
: Blocks reconcile even when internal resources or the CR spec are changed. This is used primarily to allow users to perform manual workflows. When this is enabled, CFK discards any changes done out of band to the CR.
  * Supported values: `true`, `false`
  * Default value: `false`
  * CR types applied to: All CRs

platform.confluent.io/roll-precheck
: When set to `disable`, CFK does not perform the pre-check for under-replicated partitions.
  * Supported values: `disable`, `enable`
  * Default value: `enable`
  * CR types applied to: Kafka

platform.confluent.io/roll-pause
: When set to `true`, the current pod roll will be paused.
  * Supported values: `false`, `true`
  * Default value: `false`
  * CR types applied to: Kafka

platform.confluent.io/disable-garbage-collection
: Disables CFK from garbage collecting Kubernetes resources that CFK internally manages.
  * Supported values: `false`, `true`
  * Default value: `true`
  * CR types applied to: Control Center, Connect, Kafka, REST Proxy, ksqlDB, Schema Registry, ZooKeeper

platform.confluent.io/enable-shrink
: Enables the shrink workflow for the Kafka CR. This should only be enabled when the Kafka image is version 7.0 or higher.
  * Supported values: `true`, `false`
  * Default value: `true`
  * CR types applied to: Kafka

platform.confluent.io/disable-internal-rolebindings-creation
: Defines whether to disable internal rolebinding creation in RBAC security settings.
  * Supported values: `true`, `false`
  * Default value: `false`
  * CR types applied to: Control Center, Connect, REST Proxy, ksqlDB, Schema Registry

platform.confluent.io/soft-delete-versions
: A list of versions to trigger a soft delete workflow for the Schema CR.
  * Supported values: A JSON formatted array, for example, `[1,2,3]`
  * Default value: None
  * CR types applied to: Schema

platform.confluent.io/delete-versions
: A list of versions to trigger a hard delete workflow for the Schema CR.
  * Supported values: A JSON formatted array, for example, `[1,2,3]`
  * Default value: None
  * CR types applied to: Schema

platform.confluent.io/restart-connector
: Triggers a restart of the Connector.
  * Supported values: `true`, `false`
  * Default value: `false`
  * CR types applied to: Connector

platform.confluent.io/pause-connector
: Pauses the connector.
  * Supported values: `true`, `false`
  * Default value: `false`
  * CR types applied to: Connector

platform.confluent.io/resume-connector
: Resumes the connector.
  * Supported values: `true`, `false`
  * Default value: `false`
  * CR types applied to: Connector

platform.confluent.io/restart-task
: Triggers a restart of the specified Connector task.
  * Supported values: An `int32` type number
  * Default value: None
  * CR types applied to: Connector

platform.confluent.io/http-timeout-in-seconds
: Specifies the HTTP client timeout in seconds for the CR workflows.
  * Supported values: An `int32` type number
  * Default value: None
  * CR types applied to: Control Center, Connect, Kafka, KafkaTopic, ClusterLink, Schema

platform.confluent.io/confluent-hub-install-extra-args
: Additional arguments for the Connect CR. The extra arguments are used when Connect starts up and downloads plugins from Confluent Hub.
  * Supported values: A string of flags, for example, `--worker-configs /dev/null --component-dir /mnt/plugins`
  * Default value: None
  * CR types applied to: Connect

platform.confluent.io/pod-overlay-configmap-name
: Configures additional Kubernetes features that are not supported in the CFK API.
  * Supported values: A ConfigMap name. For details on the Pod Overlay feature and the associated ConfigMap, see [Customize Confluent Platform pods with Pod Overlay](#co-pod-overlay).
  * Default value: None
  * CR types applied to: Control Center, Connect, Kafka, REST Proxy, ksqlDB, Schema Registry, ZooKeeper, KRaft

platform.confluent.io/enable-dynamic-configs
: Enables dynamic TLS certificate rotation for Kafka listeners and the Kafka REST class service so that the Kafka cluster does not roll when certificates change.
  * Supported values: `true`, `false`
  * Default value: `false`
  * CR types applied to: Kafka

platform.confluent.io/pvc-access-mode
: Sets the Persistent Volume Claim (PVC) access mode, which specifies how CFK pods can interact with the underlying storage provided by a Persistent Volume.
  * Supported values: `ReadWriteOnce`, `ReadWriteMany`
  * Default value: `ReadWriteOnce`
  * CR types applied to: Kafka, ZooKeeper (Confluent Platform 7.9 or earlier only), KRaft, ksqlDB, Control Center, REST Proxy

  For details, see [Configure Storage for Confluent Platform Using Confluent for Kubernetes](co-storage.md#co-storage).

platform.confluent.io/disable-hard-delete-schema
: Disables hard delete of a schema. A hard delete removes all metadata, including schema IDs.
  * Supported values: `true`, `false`
  * Default value: `false`
  * CR types applied to: Schema

To add an annotation, run the following command:

```bash
kubectl annotate <CR-kind> <CR-name> -n <namespace> <annotation>="<value>"
```

To delete an annotation, run the following command:

```bash
kubectl annotate <CR-kind> <CR-name> -n <namespace> <annotation>-
```

### Configure CSFLE in CFK

To deploy CSFLE with CFK:

1. Deploy Confluent Platform components. Specifically, configure Schema Registry with the required configuration for CSFLE using the `configOverrides.server` property in the SchemaRegistry custom resource (CR) YAML. For example:

```yaml
kind: SchemaRegistry
spec:
  replicas: 1
  image:
    application: confluentinc/cp-schema-registry
    init: confluentinc/confluent-init-container
  configOverrides:
    server:
      - resource.extension.class=io.confluent.kafka.schemaregistry.rulehandler.RuleSetResourceExtension,io.confluent.dekregistry.DekRegistryResourceExtension
      - confluent.license.addon.csfle=<license-key>
```

The value for `confluent.license.addon.csfle` is the same as your main Confluent Platform Enterprise license key.

2. Grant the necessary RBAC permissions for users and key resources (topics, subjects, and KEKs). For example:
   * Give the `ResourceOwner` role to Schema Registry's internal Kafka client for the `_dek_registry_keys` topic.
   * Grant user roles for topics, subjects, and KEK resources. You can use the [ConfluentRolebinding CR](co-manage-rbac.md#co-create-rolebinding) or the [confluent iam rbac role-binding create](https://docs.confluent.io/confluent-cli/current/command-reference/iam/rbac/role-binding/confluent_iam_rbac_role-binding_create.html) command.
For example:

```bash
confluent iam rbac role-binding create \
  --principal User:sr \
  --role ResourceOwner \
  --resource Topic:_dek_registry_keys \
  --kafka-cluster <kafka-cluster-id>
```

   * For enabling RBAC for the DEK Registry, see [Access control (RBAC) for CSFLE](https://docs.confluent.io/platform/current/security/protect-data/csfle/client-side.html#access-control-rbac-for-csfle).

3. Register schemas and rulesets using the REST API, tagging fields for encryption.
   * Define the [schema for the topic](https://docs.confluent.io/platform/current/schema-registry/schema.html#create-a-topic-schema-in-c3-short) and add [tags to the schema fields](https://docs.confluent.io/platform/current/security/protect-data/csfle/client-side.html#step-3-add-tags-to-the-schema-fields) in the schema that you want to encrypt.
   * Define an [encryption policy](https://docs.confluent.io/platform/current/security/protect-data/csfle/client-side.html#step-3-define-an-encryption-policy) that specifies rules to use to encrypt the tags.

4. Register the KEK resource using the [Schema Registry REST API](https://docs.confluent.io/cloud/current/api.html#tag/Key-Encryption-Keys-(v1)/operation/createKek) or the [register-deks](https://docs.confluent.io/platform/current/security/protect-data/csfle/manage-keys.html#register-a-dek) command. An example JSON payload for the REST API:

```json
{
  "name": "my-kek",
  "kmsType": "local-kms",
  "kmsKeyId": "mykey",
  "shared": false
}
```

For details on managing CSFLE keys, see [Manage CSFLE keys](https://docs.confluent.io/platform/current/security/protect-data/csfle/manage-keys.html).

5. Configure the Java client with KMS credentials or a local secret, and produce messages with encrypted fields.
   * Clients must be configured to provide the secret or KMS credentials at runtime. For a local KEK, this is usually a base64 string; for AWS KMS, environment variables for credentials must be set. Example Java client properties:

```properties
props.put("rule.executors._default_.param.secret", "pgbju8SjcaWJOtTSgeBckA==");
props.put("schema.registry.basic.auth.user.info", "testadmin:testadmin");
```

   * Only fields of type `string` or `bytes` with the correct tag are supported for encryption.
   * When a message is produced with CSFLE, the tagged field is encrypted using the configured KEK.
   * Consumers must provide the correct key/credential to decrypt and read the tagged field.

### Requirements and considerations for RBAC with LDAP

The following are the requirements and considerations for enabling and using RBAC with LDAP:

* You must have an LDAP server that Confluent Platform can use for authentication. Currently, CFK only supports the `GROUPS` LDAP search mode. The search mode indicates whether the user-to-group mapping is retrieved by searching for group or user entries. If you need to use the `USERS` search mode, specify it using the `configOverrides` setting in the Kafka CR as shown below:

```yaml
spec:
  configOverrides:
    server:
      - ldap.search.mode=USERS
```

See [Sample Configuration for User-Based Search](https://docs.confluent.io/platform/current/security/authorization/ldap/configure.html#sample-configuration-for-user-based-search) for more information.

* You must create the user principals in LDAP that will be used by Confluent Platform components.
These are the default user principals: * Kafka: `kafka`/`kafka-secret` * Confluent REST API: `erp`/`erp-secret` * Control Center: `c3`/`c3-secret` * ksqlDB: `ksql`/`ksql-secret` * Schema Registry: `sr`/`sr-secret` * Replicator: `replicator`/`replicator-secret` * Connect: `connect`/`connect-secret` * Create the LDAP user/password for a user who has a minimum of LDAP read-only permissions to allow Metadata Service (MDS) to query LDAP about other users. For example, you’d create a user `mds` with password `Developer!` * Create a user for the Admin REST service in LDAP and provide the username and password. ## Validate connections The following are example steps to validate external accesses to Confluent Platform components, using the `example.com` domain and default component prefixes. Control Center (Legacy) UI : In your browser, navigate to [https://controlcenter.example.com:443](https://controlcenter.example.com:443). Kafka : 1. Get the external endpoints of Kafka. * To get the broker endpoints: ```bash oc get kafka kafka -ojsonpath='{.status.listeners.external.advertisedExternalEndpoints}' ``` * To get the Kafka bootstrap server endpoint: ```bash oc get kafka kafka -ojsonpath='{.status.listeners.external.externalEndpoint}' ``` 2. Create a topic. For this step, you need the Confluent CLI tool on your local system. [Install Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html) on your local system to get access to the tool. For example: ```bash confluent kafka topic create mytest \ --partitions 3 --replication-factor 2 \ --url kafka.example.com:443 ``` 3. In Control Center (Legacy), validate that the `mytest` topic was created. Connect : 1. Get the external endpoint of the component: ```bash oc get connect connect -ojsonpath='{.status.restConfig.externalEndpoint}' ``` 2. Verify that you can reach the component endpoint. For example: ```bash curl https://connect.example.com:443 -ik -s -H "Content-Type: application/json" ``` ksqlDB : 1. Get the external endpoint of the component: ```bash oc get ksqldb ksqldb -ojsonpath='{.status.restConfig.externalEndpoint}' ``` 2. Verify that you can reach the component endpoint. For example: ```bash curl https://ksqldb.example.com:443/ksql -ik -s -H "Content-Type: application/vnd.ksql.v1+json; charset=utf-8" -X POST --data '{"ksql": "LIST ALL TOPICS;", "streamsProperties": {}}' ``` Schema Registry : 1. Get the external endpoint of the component: ```bash oc get schemaregistry schemaregistry -ojsonpath='{.status.restConfig.externalEndpoint}' ``` 2. Verify that you can reach the component endpoint. For example: ```bash curl -ik https://schemaregistry.example.com:443/subjects ``` Control Center (Legacy) : 1. Get the external endpoint of the component: ```bash oc get controlcenter controlcenter -ojsonpath='{.status.restConfig.externalEndpoint}' ``` 2. Verify that you can reach the component endpoint. For example: ```bash curl https://controlcenter.example.com:443/2.0/health/status -ik -s -H "Content-Type: application/json" ``` REST Proxy : 1. Get the external endpoint of the component: ```bash oc get kafkarestproxy kafkarestproxy -ojsonpath='{.status.restConfig.externalEndpoint}' ``` 2. Verify that you can reach the component endpoint. For example: ```bash curl -ik https://kafkarestproxy.example.com:443/v3/clusters ``` ### Use statically provisioned persistent volumes By default, CFK automates disk management by leveraging Kubernetes dynamic storage provisioning. 
If your Kubernetes cluster does not support dynamic provisioning, you can follow the instructions in this section to use statically-provisioned disks for your Confluent Platform deployments. Connect and Schema Registry do not use persistent storage volumes, so you do not need to follow the steps in this section for those components.

To use statically-provisioned persistent volumes for a Confluent Platform component:

1. Create a StorageClass in Kubernetes for local provisioning. For example:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-storage-class
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
```

2. Create [PersistentVolumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistent-volumes) with the desired host path and the hostname label for each of the desired worker nodes. You need the following number of persistent volumes for Confluent Platform components:
   * 2 persistent volumes on each ZooKeeper host (Confluent Platform 7.9 or earlier only)
   * 1 persistent volume on each Kafka, ksqlDB, Control Center host

For example:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-1                                  --- [1]
spec:
  capacity:
    storage: 10Gi                             --- [2]
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain       --- [3]
  storageClassName: my-storage-class          --- [4]
  local:
    path: /mnt/data/broker-1-data             --- [5]
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - gke-myhost-cluster-default-pool-5cc13882-k0gb   --- [6]
```

* [1] Choose a name for the PersistentVolume.
* [2] Choose a storage size that is greater than or equal to the storage you're requesting for each Kafka broker instance. This corresponds to the `spec.dataVolumeCapacity` property of the component CR.
* [3] Choose `Retain` if you want the data to be retained after you delete the PersistentVolumeClaim that CFK will eventually create and which Kubernetes will eventually bind to this PersistentVolume. Choose `Delete` if you want this data to be garbage-collected when the PersistentVolumeClaim is deleted.

  #### WARNING
  With `persistentVolumeReclaimPolicy: Delete`, your data on the volume will be deleted when you delete the CFK component custom resource (CR), for example, when you delete the Kafka CR with `kubectl delete kafka <name>`.

* [4] The `storageClassName` must match the one created in Step 1.
* [5] This is the directory path you want to use on the worker node for the broker as its persistent data volume. The path must exist on the worker node.
* [6] This is the value of the `kubernetes.io/hostname` label of the worker node you want to host this broker instance. To find this hostname, run the following command:

```bash
kubectl get nodes \
  -o 'custom-columns=NAME:metadata.name,HOSTNAME:metadata.labels.kubernetes\.io/hostname'

NAME                                            HOSTNAME
gke-myhost-cluster-default-pool-5cc13882-k0gb   gke-myhost-cluster-default-pool-5cc13882-k0gb
gke-myhost-cluster-default-pool-5cc13882-n8vr   gke-myhost-cluster-default-pool-5cc13882-n8vr
gke-myhost-cluster-default-pool-5cc13882-tbbj   gke-myhost-cluster-default-pool-5cc13882-tbbj
```

3. Add the storageClass to the component CR, for example:

```yaml
spec:
  storageClass:
    name: my-storage-class
```
4. After deploying the new CR, validate that the PersistentVolumes are bound:

```bash
kubectl get pv

NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                    STORAGECLASS
pv-1   10Gi       RWO            Retain           Bound    operator/data0-kafka-0   my-storage-class
pv-2   10Gi       RWO            Retain           Bound    operator/data0-kafka-2   my-storage-class
pv-3   10Gi       RWO            Retain           Bound    operator/data0-kafka-1   my-storage-class
```

5. Validate that the Confluent Platform pods are healthy. For example:

```bash
kubectl get pods -l app=kafka

NAME      READY   STATUS    RESTARTS   AGE
kafka-0   1/1     Running   0          40m
kafka-1   1/1     Running   0          40m
kafka-2   1/1     Running   0          40m
```

#### Use a Dead Letter Queue with security

When you use Confluent Platform with security enabled, the Confluent Platform [Admin Client](../installation/configuration/admin-configs.md#cp-config-admin) creates the Dead Letter Queue (DLQ) topic. Invalid records are first passed to an internal producer constructed to send these records, and then the Admin Client creates the DLQ topic. For the DLQ to work in a secure Confluent Platform environment, you must add additional Admin Client configuration properties (prefixed with `admin.*`) to the Connect worker configuration. The following [SASL/PLAIN](../security/authentication/mutual-tls/overview.md#kafka-ssl-authentication) example shows additional Connect worker configuration properties:

```bash
admin.ssl.endpoint.identification.algorithm=https
admin.sasl.mechanism=PLAIN
admin.security.protocol=SASL_SSL
admin.request.timeout.ms=20000
admin.retry.backoff.ms=500
admin.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="<username>" \
  password="<password>";
```

For details about configuring your Connect worker, sink connector, and dead letter queue topic in a Role-Based Access Control (RBAC) environment, see [Kafka Connect and RBAC](rbac-index.md#connect-rbac-index).

## ResourceOwner and UserAdmin

The `ResourceOwner` (the user creating a new connector) is responsible for submitting the request for connector credentials to the `UserAdmin` before creating the connector. Once the request is received, the `UserAdmin` creates the secrets for the connector with a path consisting of the connector name and the keys `username` and `password` for the service account that has permissions to access the topics that the connector will consume from or produce to. The secrets are created using a [POST API request](#connect-rbac-secret-registry-api). For example:

```text
POST /secret/paths/<connector-name>/keys/<key>/versions

{
  "secret": "<secret-value>"
}
```

The following properties are then included in the connector configuration:

**Sink connector properties:**

```properties
consumer.override.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  username="${secret:<connector-name>:username}" \
  password="${secret:<connector-name>:password}" \
  metadataServerUrls="http://<mds-host>:8090";
```

**Source connector properties:**

```properties
producer.override.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  username="${secret:<connector-name>:username}" \
  password="${secret:<connector-name>:password}" \
  metadataServerUrls="http://<mds-host>:8090";
```

When the user submits the connector configuration, Connect validates that all external variable references have a path that matches the connector ID. The connector configuration is rejected if it has variable references with a path that does not match the connector ID.
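As an illustration only (the connector name `my-sink` and the MDS host `mds.example.com` below are hypothetical, not values from this guide), a sketch of a sink connector whose secret references satisfy this validation might look like the following; a reference whose path names a different connector would be rejected:

```properties
# Hypothetical sink connector named "my-sink"; the secret path must match the connector name.
name=my-sink
consumer.override.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  username="${secret:my-sink:username}" \
  password="${secret:my-sink:password}" \
  metadataServerUrls="http://mds.example.com:8090";
# A reference such as ${secret:other-connector:username} would fail validation here,
# because its path ("other-connector") does not match this connector's name ("my-sink").
```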
# Configure Kerberos Authentication for Brokers Running MDS

This configuration describes how to combine LDAP authentication for MDS with Kerberos authentication for the brokers.

Prerequisites
: - The prerequisites for configuring Kerberos authentication for MDS are the same as the prerequisites for configuring MDS. See [Configure Metadata Service (MDS) in Confluent Platform](index.md#rbac-mds-config).
  - Create a user for the Kafka broker.
  - Generate the keytab. See [Configure GSSAPI in Confluent Platform clusters](../../security/authentication/sasl/gssapi/overview.md#kafka-sasl-auth-gssapi).
  - [Create a PEM key pair](index.md#create-pem-key-pair).

1. Add the following required configuration options to the `etc/kafka/server.properties` file. Any content in brackets (`<>`) must be customized for your environment.

#### NOTE
The LDAP configuration attributes in this example reflect a system using Active Directory (AD). If you use a different directory system, contact your LDAP administrator for details.

```properties
############################# Confluent Authorizer Settings #############################
authorizer.class.name=io.confluent.kafka.security.authorizer.ConfluentServerAuthorizer
confluent.authorizer.access.rule.providers=ZK_ACL,CONFLUENT
confluent.metadata.server.listeners=http://0.0.0.0:8090
confluent.metadata.server.advertised.listeners=http://localhost:8090

#### Semi-colon separated list of super users in the format <principalType>:<principalName> ####
#### For example: super.users=User:admin;User:mds ####
super.users=User:<kafka-broker-user>;User:<mds-user>

############################# Identity Provider Settings (LDAP) #############################
#### JNDI Connection Settings ####
ldap.java.naming.factory.initial=com.sun.jndi.ldap.LdapCtxFactory
ldap.java.naming.provider.url=ldap://<ldap-host>:389

#### MDS Authentication Settings ####
ldap.java.naming.security.principal=<ldap-principal>
ldap.java.naming.security.credentials=<ldap-credentials>
ldap.java.naming.security.authentication=simple

#### Client Authentication Settings ####
ldap.user.search.base=<user-search-base>
ldap.user.name.attribute=sAMAccountName
ldap.group.search.base=CN=Users,DC=rbac,DC=confluent,DC=io
ldap.group.object.class=group
ldap.group.member.attribute.pattern=UID=(.*),OU=Users,DC=EXAMPLE,DC=COM
ldap.user.object.class=account

############################# MDS Server Settings #############################
confluent.metadata.server.authentication.method=BEARER

############################# MDS Token Service Settings #############################
confluent.metadata.server.token.key.path=<path-to-token-key-pair.pem>

############################# Listener Settings #############################
listeners=INTERNAL_SASL_PLAINTEXT://:9093,EXTERNAL_RBAC_SASL_PLAINTEXT://:9092
advertised.listeners=INTERNAL_SASL_PLAINTEXT://localhost:9093,EXTERNAL_RBAC_SASL_PLAINTEXT://localhost:9092
inter.broker.listener.name=INTERNAL_SASL_PLAINTEXT

############################# Listener SASL Configuration Settings #############################
listener.security.protocol.map=INTERNAL_SASL_PLAINTEXT:SASL_PLAINTEXT,EXTERNAL_RBAC_SASL_PLAINTEXT:SASL_PLAINTEXT

############################# Broker Internal Listener SASL Configuration Settings #############################
sasl.mechanism.inter.broker.protocol=GSSAPI
listener.name.internal_sasl_plaintext.sasl.enabled.mechanisms=GSSAPI
listener.name.internal_sasl_plaintext.sasl.kerberos.service.name=kafka
listener.name.internal_sasl_plaintext.gssapi.sasl.jaas.config = \
  com.sun.security.auth.module.Krb5LoginModule required \
  debug=true \
  useKeyTab=true \
  storeKey=true \
  keyTab="<path-to-keytab>" \
  principal="<kerberos-principal>";
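# The INTERNAL listener uses GSSAPI (Kerberos) for inter-broker authentication, while the
# EXTERNAL listener configured below uses OAUTHBEARER, so clients authenticate with bearer
# tokens issued by MDS and validated against the PEM public key.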
# (for the principal above, for example: kafka/kafka1.hostname.com@EXAMPLE.COM)

############################# Broker External (Client) Listener SASL Configuration Settings #############################
listener.name.external_rbac_sasl_plaintext.sasl.enabled.mechanisms=OAUTHBEARER
listener.name.external_rbac_sasl_plaintext.oauthbearer.sasl.jaas.config= \
  org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  publicKeyPath="<path-to-public-key.pem>";
```

* *429* – Too Many Requests. The response body contains the HTML error page returned by the server. For example:

```http
HTTP/1.1 429 Too Many Requests

{
  "message": "Error 429 Too Many Requests HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default"
}
```

* *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body.

**generic_internal_server_error:**

```http
HTTP/1.1 5XX -
Content-Type: application/json

{
  "error_code": 500,
  "message": "Internal Server Error"
}
```

**produce_v3_missing_schema:**

```http
HTTP/1.1 5XX -
Content-Type: application/json

{
  "error_code": 50002,
  "message": "Error when fetching latest schema version. subject = my-topic"
}
```

## Starting the ksqlDB Server

The ksqlDB servers are run separately from the ksqlDB CLI client and Kafka brokers. You can deploy servers on remote machines, VMs, or containers, and the CLI then connects to these remote servers. You can add or remove servers from the same resource pool during live operations, to elastically scale query processing. You can use different resource pools to support workload isolation. For example, you could deploy separate pools for production and for testing.

You can only connect to one ksqlDB server at a time. The ksqlDB CLI does not support automatic failover to another ksqlDB server.

![image](ksqldb/images/client-server.png)

Follow these instructions to start ksqlDB Server using the `ksql-server-start` script.

1. Specify your ksqlDB server configuration parameters. You can also set any property for the Kafka Streams API, the Kafka producer, or the Kafka consumer. The required parameters are `bootstrap.servers` and `listeners`. You can specify the parameters in the ksqlDB properties file or the `KSQL_OPTS` environment variable. Properties set with `KSQL_OPTS` take precedence over those specified in the properties file. A recommended approach is to configure a common set of properties using the ksqlDB configuration file and override specific properties as needed, using the `KSQL_OPTS` environment variable. Here are the default settings:

```none
bootstrap.servers=localhost:9092
listeners=http://0.0.0.0:8088
```

For more information, see [Configure ksqlDB Server](operate-and-deploy/installation/server-config.md#ksqldb-install-configure-server).

2. Start a server node with this command:

```bash
ksql-server-start ${CONFLUENT_HOME}/etc/ksqldb/ksql-server.properties
```

3. See [this page](operate-and-deploy/installation/server-config.md#ksqldb-install-configure-server-non-interactive-usage) for instructions on running ksqlDB in non-interactive (headless) mode.

## Specify ksqlDB Server configuration parameters

You can specify the configuration for your ksqlDB Server instances by using these approaches:

- **The `environment` key:** In the stack file, populate the `environment` key with your settings. By convention, the ksqlDB setting names are prepended with `KSQL_`.
- **`--env` option:** On the [docker run](https://docs.docker.com/engine/reference/commandline/run/) command line, specify your settings by using the `--env` option once for each parameter. For more information, see [Configure ksqlDB with Docker](install-ksqldb-with-docker.md#ksqldb-install-configure-with-docker).
- **ksqlDB Server config file:** Add settings to the `ksql-server.properties` file. This requires building your own Docker image for ksqlDB Server. For more information, see [Configuring ksqlDB Server](server-config.md#ksqldb-install-configure-server).

For a complete list of ksqlDB parameters, see the [Configuration Parameter Reference](../../reference/server-configuration.md#ksqldb-reference-server-configuration).
You can also set any property for the Kafka Streams API, the Kafka producer, or the Kafka consumer. A recommended approach is to configure a common set of properties using the ksqlDB Server configuration file and override specific properties as needed, using the environment variables.

ksqlDB must have access to a running Kafka cluster, which can be on your local machine, in a data center, a public cloud, or Confluent Cloud. For ksqlDB Server to connect to a Kafka cluster, the required parameters are `KSQL_LISTENERS` and `KSQL_BOOTSTRAP_SERVERS`, which have the following default values:

```yaml
environment:
  KSQL_LISTENERS: http://0.0.0.0:8088
  KSQL_BOOTSTRAP_SERVERS: localhost:9092
```

ksqlDB runs separately from your Kafka cluster, so you specify the IP addresses of the cluster's bootstrap servers when you start a container for ksqlDB Server. For more information, see [Configuring ksqlDB Server](server-config.md#ksqldb-install-configure-server). To start ksqlDB containers in different configurations, see [Configure ksqlDB with Docker](install-ksqldb-with-docker.md#ksqldb-install-configure-with-docker).

## Connecting to Confluent Cloud ksqlDB

To use the `ksql-migrations` tool with your [Confluent Cloud ksqlDB cluster](/cloud/current/ksqldb/overview.html), set the following configurations in your `ksql-migrations.properties` file, which is created as part of [setting up your migrations project](#ksqldb-manage-metadata-schemas-initial-setup).

```properties
ksql.auth.basic.username=<username>
ksql.auth.basic.password=<password>
ksql.migrations.topic.replicas=3
ssl.alpn=true
```

#### Commands and flags

To create a cluster link, use `kafka-cluster-links` along with [bootstrap-server](#bootstrap-cluster-links) and the following flags.

`--link`
: (Required) The name of the cluster link to create. Must be a unique cluster link name within the cluster.
  * Type: string

`--cluster-id`
: (Required) The ID of the source cluster to link to. You can find a cluster's ID with the CLI command `kafka-cluster cluster-id`.
  * Type: string

(Required) One of the following parameters must be provided (not both) to specify how the destination cluster should communicate with the source. The available configurations are those that would be used to configure a client, including the required `bootstrap.servers` and other necessary security and authorization properties.

`--config`
: Comma-separated configurations to be applied to the cluster link on creation, in the form `key=value`. When you use this flag, the configurations are specified directly on the command line (as opposed to in a file, as described for the next flag). You can use square brackets to group values that contain commas. For a full list of available configurations, see [Link Properties](configs.md#cluster-link-specific-configs).
  * Type: string

`--config-file`
: Property file containing [configurations](configs.md#cluster-link-specific-configs) for the cluster link. This is the recommended way to specify cluster link configurations.
* Type: string For example, if you specify the following configuration for a secure cluster link in a file named `link-config.properties`: ```bash bootstrap.servers=example-1:9092,example-2:9092,example-3:9092 sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="example-user" password="example-password" security.protocol=SASL_SSL ssl.endpoint.identification.algorithm=https ``` Then, you can create the cluster link `example-link` with the following command: ```bash kafka-cluster-links --bootstrap-server localhost:9093 --create --link example-link --config-file link-config.properties --cluster-id pz-s7W72Sdm7A11wzku9gA ``` Optional configurations: `--command-config` : Property file containing configurations to be passed to the [AdminClient](../../installation/configuration/admin-configs.md#cp-config-admin). For example, with security credentials for authorization and authentication. `--consumer-group-filters-json` : JSON string to use for configuration of `consumer.offset.group.filters`. To learn more, see [Migrating consumer groups from source to destination cluster](#cluster-link-migrate-consumer-groups). * Type: string `--consumer-group-filters-json-file` : Path to JSON file to use for configuration of `consumer.offset.group.filters`. To learn more, see [Migrating consumer groups from source to destination cluster](#cluster-link-migrate-consumer-groups). * Type: string `--acl-filters-json-file` : Path to the ACL filters JSON file to use for configuration of `acl.filters`. To learn more, see [Migrating ACLs from Source to Destination Cluster](security.md#cluster-link-acls-migrate). * Type: string `--validate-only` : If provided, validates that the cluster link can be created as specified, but does not create it. `--exclude-validate-link` : If provided, creates the link without validating that the source cluster can be reached. This is helpful only if the source cluster is not yet running or reachable. If the source cluster is running and available, using this option is not recommended, as it skips helpful validations. `--topic-filters-json` : JSON string to use for configuration of `auto.create.mirror.topics.filters`. To learn more, see [Mirror Topics](mirror-topics-cp.md#mirror-topics-concepts). `--topic-filters-json-file` : Path to JSON file to use for configuration of `auto.create.mirror.topics.filters`. To learn more, see [Mirror Topics](mirror-topics-cp.md#mirror-topics-concepts). ### Confluent Cloud Confluent Cloud prerequisites are: - A Confluent Cloud account - Permission to create a topic and schema in a cluster in Confluent Cloud - Stream Governance Package enabled - API key and secret for Confluent Cloud cluster (`$APIKEY`, `$APISECRET`) - API key and secret for Schema Registry (`$SR_APIKEY`, `$SR_APISECRET`) - Schema Registry endpoint URL (`$SCHEMA_REGISTRY_URL`) - Cluster ID (`$CLUSTER_ID`) - Schema registry cluster ID (`$SR_CLUSTER_ID`) The examples assume that API keys, secrets, cluster IDs, and API endpoints are stored in persistent environment variables wherever possible, and refer to them as such. You can store these in shell variables if your setup is temporary. If you want to return to this environment and cluster for future work, consider storing them in a profile (such as `.zsh`, `.bashrc`, or `powershell.exe` profiles). The following steps provide guidelines on these prerequisites specific to these examples. 
For more general information, see [Manage Clusters](/cloud/current/clusters/create-cluster.html#manage-clusters-in-ccloud).

1. Log in to Confluent Cloud:

```bash
confluent login
```

2. Create a Kafka cluster in Confluent Cloud:

```bash
confluent kafka cluster create [flags]
```

For example:

```bash
confluent kafka cluster create quickstart_cluster --cloud "aws" --region "us-west-2"
```

Your output will include a cluster ID (in the form of `lkc-xxxxxx`), show the cluster name and [cluster type](/cloud/current/clusters/cluster-types.html#ccloud-features-and-limits-by-cluster-type) (in this case, "Basic"), and endpoints. Take note of the cluster ID, and store it in an environment variable such as `$CLUSTER_ID`.

3. Get an API key and secret for the cluster:

```bash
confluent api-key create --resource $CLUSTER_ID
```

Store the API key and secret for your cluster in a safe place, such as shell environment variables: `$APIKEY`, `$APISECRET`.

4. View Stream Governance packages and the Schema Registry endpoint URL. A [Stream Governance package](/cloud/current/stream-governance/packages.html#how-to-enable-sr-or-upgrade-to-a-stream-governance-package) was enabled as a part of creating the environment.
   - To view governance packages, use the Confluent CLI command [confluent environment list](https://docs.confluent.io/confluent-cli/current/command-reference/environment/confluent_environment_list.html):

```bash
confluent environment list
```

   Your output will show the environment ID, name, and associated Stream Governance packages.
   - To view the Stream Governance API endpoint URL, use the command [confluent schema-registry cluster describe](https://docs.confluent.io/confluent-cli/current/command-reference/schema-registry/cluster/confluent_schema-registry_cluster_describe.html):

```bash
confluent schema-registry cluster describe
```

   Your output will show the Schema Registry cluster ID (in the form of `lsrc-xxxxxx`) and the endpoint URL, which is also available to you in Cloud Console on the right side panel under "Stream Governance API" in the environment. Store these in environment variables: `$SR_CLUSTER_ID` and `$SCHEMA_REGISTRY_URL`.

5. Create a Schema Registry API key, using the Schema Registry cluster ID (`$SR_CLUSTER_ID`) from the previous step as the resource ID.

```bash
confluent api-key create --resource $SR_CLUSTER_ID
```

Store the API key and secret for your Schema Registry in a safe place, such as shell environment variables: `$SR_APIKEY` and `$SR_APISECRET`.

### Confluent Cloud

1. Create a Kafka topic:

```bash
confluent kafka topic create transactions-avro --cluster $CLUSTER_ID
```

2. Copy the following schema and store it in a file called `schema.txt`:

```none
{
  "type": "record",
  "name": "Transaction",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "amount", "type": "double"}
  ]
}
```

3. Run the following command to create a producer with the schema created in the previous step:

```bash
confluent kafka topic produce transactions-avro \
  --cluster $CLUSTER_ID \
  --schema "/schema.txt" --schema-registry-endpoint $SCHEMA_REGISTRY_URL \
  --schema-registry-api-key $SR_APIKEY \
  --schema-registry-api-secret $SR_APISECRET \
  --api-key $APIKEY --api-secret $APISECRET \
  --value-format "avro"
```

Your output should resemble:

```bash
Successfully registered schema with ID 100001
Starting Kafka Producer. Use Ctrl-C or Ctrl-D to exit.
```

4. Type the following record in the shell, and hit return.

```none
{ "id":"1000", "amount":500 }
```
5. Open another terminal and run a consumer to read from topic `transactions-avro` and get the value of the message in JSON:

```bash
confluent kafka topic consume transactions-avro \
  --cluster $CLUSTER_ID \
  --from-beginning \
  --value-format "avro" \
  --schema-registry-endpoint $SCHEMA_REGISTRY_URL \
  --schema-registry-api-key $SR_APIKEY \
  --schema-registry-api-secret $SR_APISECRET \
  --api-key $APIKEY --api-secret $APISECRET
```

Your output should be:

```bash
{"id":"1000","amount":500}
```

6. Register a new schema version under the same subject by adding a new field, `customer_id`. Since the default subject level compatibility is BACKWARD, you must add the new field as "optional" in order for it to be compatible with the previous version. Create a new file as `schema2.txt` and copy the following schema in it:

```bash
{
  "type": "record",
  "name": "Transaction",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "customer_id", "type": "string", "default":"null"}
  ]
}
```

Open another terminal, and run the following command:

```bash
confluent kafka topic produce transactions-avro \
  --cluster $CLUSTER_ID \
  --schema "/schema2.txt" \
  --schema-registry-endpoint $SCHEMA_REGISTRY_URL \
  --schema-registry-api-key $SR_APIKEY \
  --schema-registry-api-secret $SR_APISECRET \
  --api-key $APIKEY \
  --api-secret "$APISECRET" \
  --value-format "avro"
```

Your output should resemble:

```bash
Successfully registered schema with ID 100002
Starting Kafka Producer. Use Ctrl-C or Ctrl-D to exit.
```

7. Type the following into your producer, and hit return:

```none
{ "id":"1001", "amount":500, "customer_id":"1221" }
```

8. Switch to the terminal with your running consumer to read from topic `transactions-avro` and get the new message. You should see the new output added to the original:

```none
{"id":"1000","amount":500.0}
{"id":"1001","amount":500.0,"customer_id":"1221"}
```

(If by chance you closed the original consumer, just restart it using the same command shown in step 5.)

9. View the schemas that were registered with Schema Registry as versions 1 and 2.

```none
confluent schema-registry schema describe --subject transactions-avro-value --version 1
```

Your output should be similar to the following, showing the `id` and `amount` fields added in version 1 of the schema:

```none
Schema ID: 100001
Schema: {"type":"record","name":"Transaction","fields":[{"name":"id","type":"string"},{"name":"amount","type":"double"}]}
```

To view version 2:

```none
confluent schema-registry schema describe --subject transactions-avro-value --version 2
```

Output for version 2 will include the `customer_id` field:

```none
Schema ID: 100002
Schema: {"type":"record","name":"Transaction","fields":[{"name":"id","type":"string"},{"name":"amount","type":"double"},{"name":"customer_id","type":"string","default":"null"}]}
```

10. Use the Confluent Cloud Console to examine schemas and messages. Messages that were successfully produced also show on the Confluent Cloud Console ([https://confluent.cloud/](https://confluent.cloud/)) in **Topics > (topic name) > Messages**. You may have to select a partition or jump to a timestamp to see messages sent earlier. (For timestamp, type in a number, which will default to partition `1/Partition: 0`, and press return. To get the message view shown here, select the **cards** icon on the upper right.)

![image](images/serdes-avro-cloud-ui-messages.png)

Schemas you create are available on the **Schemas** tab for the selected topic.

![image](images/serdes-avro-cloud-ui-schema.png)
11. Run shutdown and cleanup tasks.
    - You can stop the consumer and producer with Ctrl-C in their respective command windows.
    - If you were using shell environment variables and want to keep them for later, remember to store them in a safe, persistent location.
    - You can remove topics, clusters, and environments from the [command line](https://docs.confluent.io/confluent-cli/current/command-reference/overview.html) or from the [Confluent Cloud Console](https://confluent.cloud/).

### Confluent Cloud

1. Create a Kafka topic:

```bash
confluent kafka topic create transactions-json --cluster $CLUSTER_ID
```

2. Copy the following schema and store it in a file called `schema.txt`:

```bash
{
  "type":"object",
  "properties":{
    "id":{"type":"string"},
    "amount":{"type":"number"}
  }
}
```

3. Run the following command to create a producer with the schema created in the previous step:

```bash
confluent kafka topic produce transactions-json \
  --cluster $CLUSTER_ID \
  --schema "/schema.txt" --schema-registry-endpoint $SCHEMA_REGISTRY_URL \
  --schema-registry-api-key $SR_APIKEY \
  --schema-registry-api-secret $SR_APISECRET \
  --api-key $APIKEY --api-secret $APISECRET \
  --value-format "jsonschema"
```

Your output should resemble:

```bash
Successfully registered schema with ID 100001
Starting Kafka Producer. Use Ctrl-C or Ctrl-D to exit.
```

4. Type the following record in the shell, and hit return.

```none
{ "id":"1000", "amount":500 }
```

5. Open another terminal and run a consumer to read from topic `transactions-json` and get the value of the message in JSON:

```bash
confluent kafka topic consume transactions-json \
  --cluster $CLUSTER_ID \
  --from-beginning \
  --value-format "jsonschema" \
  --schema-registry-endpoint $SCHEMA_REGISTRY_URL \
  --schema-registry-api-key $SR_APIKEY \
  --schema-registry-api-secret $SR_APISECRET \
  --api-key $APIKEY --api-secret $APISECRET
```

Your output should be:

```bash
{"id":"1000","amount":500}
```

6. Use the producer to send another record as the message value, which includes a new property not explicitly declared in the schema. JSON Schema has an open content model, which allows any number of additional properties to appear in a JSON document without being specified in the JSON schema. This is achieved with `additionalProperties` set to `true`, which is the default. If you do not explicitly disable `additionalProperties` (by setting it to `false`), undeclared properties are allowed in records. These next few steps demonstrate this unique aspect of JSON Schema. Return to the producer session that is already running and send the following message, which includes a new property `"customer_id"` that is not declared in the schema with which we started this producer. (Hit return to send the message.)

```none
{"id":"1000","amount":500,"customer_id":"1221"}
```

7. Return to your running consumer to read from topic `transactions-json` and get the new message. You should see the new output added to the original.

```none
{"id":"1000","amount":500}
{"id":"1000","amount":500,"customer_id":"1221"}
```

The message with the new property (`customer_id`) is successfully produced and read. If you try this with the other schema formats (Avro, Protobuf), it will fail at the producer command because those specifications require that all properties be explicitly declared in the schemas. Keep this consumer running.

8. Update the compatibility requirement for the subject `transactions-json-value`.
```none confluent schema-registry subject update transactions-json-value --compatibility "none" ``` The output message is ```none Successfully updated Subject Level compatibility to "none" for subject "transactions-json-value" ``` 9. Store the following schema in a file called `schema2.txt`: ```bash { "type":"object", "properties":{ "id":{"type":"string"}, "amount":{"type":"number"} }, "additionalProperties": false } ``` Note that this schema is almost the same as the original in `schema.txt`, except that in this schema `additionalProperties` is explicitly set to false. 10. Run another producer to register the new schema. Use Ctl-C to shut down the running producer, and start a new one to register the new schema. ```bash confluent kafka topic produce transactions-json \ --cluster $CLUSTER_ID \ --schema "/schema2.txt" --schema-registry-endpoint $SCHEMA_REGISTRY_URL \ --schema-registry-api-key $SR_APIKEY \ --schema-registry-api-secret $SR_APISECRET \ --api-key $APIKEY --api-secret $APISECRET \ --value-format "jsonschema" ``` 11. Attempt to use this producer to register a new schema, and send another record as the message value, which includes a new property not explicitly declared in the schema. ```none { "id":"1001","amount":500,"customer_id":"this-will-break"} ``` This will break. You will get the following error: ```none Error: the JSON document is invalid ``` The consumer will continue running, but no new messages will be displayed. This is the same behavior you would see by default if using Avro or Protobuf in this scenario. 12. Rerun the producer in default mode as before (by using `schema.txt`) and send a follow-on message with an undeclared property. In the producer command window, stop the producer with Ctl+C. Run the original producer command. Note that there is no need to explicitly declare `additionalProperties` as `true` in the schema (although you could), as this is the default. ```bash confluent kafka topic produce transactions-json \ --cluster $CLUSTER_ID \ --schema "/schema.txt" --schema-registry-endpoint $SCHEMA_REGISTRY_URL \ --schema-registry-api-key $SR_APIKEY \ --schema-registry-api-secret $SR_APISECRET \ --api-key $APIKEY --api-secret $APISECRET \ --value-format "jsonschema" ``` 13. Use the producer to send another record as the message value, which again includes a new property not explicitly declared in the schema. ```none {"id":"1001","amount":500,"customer_id":"this-will-work-again"} ``` 14. Return to the consumer session to read the new message. The consumer should still be running and reading from topic `transactions-json`. You will see following new message in the console. ```none {"id":"1001","amount":500,"customer_id":"this-will-work-again"} ``` More specifically, if you followed all steps in order and started the consumer with the `--from-beginning` flag as mentioned earlier, the consumer shows a history of all messages sent: ```none {"id":"1000","amount":500} {"id":"1000","amount":500,"customer_id":"1221"} {"id":"1001","amount":500,"customer_id":"this-will-work-again"} ``` 15. View the schemas that were registered with Schema Registry as versions 1 and 2. 
   ```none
   confluent schema-registry schema describe --subject transactions-json-value --version 1
   ```

   Your output should be similar to the following, showing the `id` and `amount` fields added in version 1 of the schema:

   ```none
   Schema ID: 100001
   Type: JSON
   Schema: {"type":"object","properties":{"id":{"type":"string"},"amount":{"type":"number"}}}
   ```

   To view version 2:

   ```none
   confluent schema-registry schema describe --subject transactions-json-value --version 2
   ```

   Output for version 2 will include the same fields, but with the `additionalProperties` flag set to `false`:

   ```none
   Schema ID: 100002
   Type: JSON
   Schema: {"type":"object","properties":{"id":{"type":"string"},"amount":{"type":"number"}},"additionalProperties":false}
   ```

16. Use the Confluent Cloud Console to examine schemas and messages. Messages that were successfully produced also show on the Confluent Cloud Console ([https://confluent.cloud/](https://confluent.cloud/)) in **Topics > transactions-json > Messages**. You may have to select a partition or jump to a timestamp to see messages sent earlier. (For timestamp, type in a number, which will default to partition `1/Partition: 0`, and press return. To get the message view shown here, select the **cards** icon on the upper right.)

    ![image](images/serdes-json-cloud-ui-messages.png)

    Schemas you create are available on the **Schemas** tab for the selected topic.

    ![image](images/serdes-json-cloud-ui-schema.png)

17. Run shutdown and cleanup tasks.

    - You can stop the consumer and producer with Ctl-C in their respective command windows.
    - If you were using shell environment variables and want to keep them for later, remember to store them in a safe, persistent location.
    - You can remove topics, clusters, and environments from the [command line](https://docs.confluent.io/confluent-cli/current/command-reference/overview.html) or from the [Confluent Cloud Console](https://confluent.cloud/).

### Confluent Cloud

1. Create a Kafka topic:

   ```bash
   confluent kafka topic create transactions-protobuf --cluster $CLUSTER_ID
   ```

2. Copy the following schema and store it in a file called `schema.txt`:

   ```bash
   syntax = "proto3";
   message MyRecord {
     string id = 1;
     float amount = 2;
   }
   ```

3. Run the following command to create a producer with the schema created in the previous step:

   ```bash
   confluent kafka topic produce transactions-protobuf \
     --cluster $CLUSTER_ID \
     --schema "/schema.txt" \
     --schema-registry-endpoint $SCHEMA_REGISTRY_URL \
     --schema-registry-api-key $SR_APIKEY \
     --schema-registry-api-secret $SR_APISECRET \
     --api-key $APIKEY --api-secret $APISECRET \
     --value-format "protobuf"
   ```

   Your output should resemble:

   ```bash
   Successfully registered schema with ID 100001
   Starting Kafka Producer. Use Ctrl-C or Ctrl-D to exit.
   ```

4. Type the following message value into the producer, and hit return:

   ```none
   { "id":"1000", "amount":500 }
   ```

5. Open another terminal and run a consumer to read from topic `transactions-protobuf` and get the value of the message in JSON:

   ```bash
   confluent kafka topic consume transactions-protobuf \
     --cluster $CLUSTER_ID \
     --from-beginning \
     --value-format "protobuf" \
     --schema-registry-endpoint $SCHEMA_REGISTRY_URL \
     --schema-registry-api-key $SR_APIKEY \
     --schema-registry-api-secret $SR_APISECRET \
     --api-key $APIKEY --api-secret $APISECRET
   ```

   Your output should be:

   ```bash
   {"id":"1000","amount":500}
   ```

6. Register a new schema version under the same subject by adding a new field, `customer_id`. Update the schema in `schema.txt` to include the new field:
   ```bash
   syntax = "proto3";
   message MyRecord {
     string id = 1;
     float amount = 2;
     string customer_id = 3;
   }
   ```

   Open another terminal, and run the following command:

   ```bash
   confluent kafka topic produce transactions-protobuf \
     --cluster $CLUSTER_ID \
     --schema "/schema.txt" \
     --schema-registry-endpoint $SCHEMA_REGISTRY_URL \
     --schema-registry-api-key $SR_APIKEY \
     --schema-registry-api-secret $SR_APISECRET \
     --api-key $APIKEY --api-secret $APISECRET \
     --value-format "protobuf"
   ```

   Your output should resemble:

   ```bash
   Successfully registered schema with ID 100002
   Starting Kafka Producer. Use Ctrl-C or Ctrl-D to exit.
   ```

7. Type the following into your producer, and hit return:

   ```none
   { "id":"1001", "amount":700, "customer_id":"1221"}
   ```

8. Switch to the terminal with your running consumer to read from topic `transactions-protobuf` and get the new message. You should see the new output added to the original:

   ```none
   {"id":"1000","amount":500}
   {"id":"1001","amount":700,"customerId":"1221"}
   ```

   (If by chance you closed the original consumer, just restart it using the same command shown in step 5.)

9. View the schemas that were registered with Schema Registry as versions 1 and 2.

   ```none
   confluent schema-registry schema describe --subject transactions-protobuf-value --version 1
   ```

   Your output should be similar to the following, showing the `id` and `amount` fields added in version 1 of the schema:

   ```none
   Schema ID: 100001
   Type: PROTOBUF
   Schema: syntax = "proto3";
   message MyRecord {
     string id = 1;
     float amount = 2;
   }
   ```

   To view version 2:

   ```none
   confluent schema-registry schema describe --subject transactions-protobuf-value --version 2
   ```

   Output for version 2 will include the `customer_id` field:

   ```none
   Schema ID: 100002
   Type: PROTOBUF
   Schema: syntax = "proto3";
   message MyRecord {
     string id = 1;
     float amount = 2;
     string customer_id = 3;
   }
   ```

10. Run shutdown and cleanup tasks.

    - You can stop the consumer and producer with Ctl-C in their respective command windows.
    - If you were using shell environment variables and want to keep them for later, remember to store them in a safe, persistent location.
    - You can remove topics, clusters, and environments from the [command line](https://docs.confluent.io/confluent-cli/current/command-reference/overview.html) or from the [Confluent Cloud Console](https://confluent.cloud/).

### Confluent Replicator

Confluent Replicator is a type of Kafka source connector that replicates data from a source to a destination Kafka cluster. An embedded consumer inside Replicator consumes data from the source cluster, and an embedded producer inside the Kafka Connect worker produces data to the destination cluster.

Replicator version 4.0 and earlier requires a connection to ZooKeeper in the origin and destination Kafka clusters. If ZooKeeper is configured for authentication, the client configures the ZooKeeper security credentials via the global JAAS configuration setting `-Djava.security.auth.login.config` on the Connect workers, and the ZooKeeper security credentials in the origin and destination clusters must be the same.
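For Replicator 4.0 and earlier, those ZooKeeper credentials are typically supplied through a JAAS file that the Connect worker JVM loads via `-Djava.security.auth.login.config`. The following is a minimal sketch, assuming Kerberos with a keytab; the JAAS file path is a placeholder, and the keytab and principal reuse the example values from this page:

```text
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  keyTab="/etc/security/keytabs/kafka_client.keytab"
  principal="replicator@EXAMPLE.COM";
};
```

```bash
# Point the Connect worker JVM at the JAAS file (placeholder path) before starting the worker
export KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/connect_jaas.conf"
```

The login context is named `Client` because that is the section the ZooKeeper client reads by default, as opposed to the `KafkaClient` section used for broker connections.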
```bash { "name":"replicator", "config":{ .... "src.kafka.security.protocol" : "SASL_SSL", "src.kafka.sasl.mechanism" : "GSSAPI", "src.kafka.sasl.kerberos.service.name" : "kafka", "src.kafka.sasl.jaas.config" : "com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab=\"/etc/security/keytabs/kafka_client.keytab\" principal=\"replicator@EXAMPLE.COM\";", .... } } } ``` #### SEE ALSO To see an example Confluent Replicator configuration, see the [SASL source authentication demo script](https://github.com/confluentinc/examples/tree/latest//replicator-security/scripts/submit_replicator_source_sasl_plain_auth.sh). For demos of common security configurations see: [Replicator security demos](https://github.com/confluentinc/examples/tree/latest//replicator-security) To configure Confluent Replicator for a destination cluster with SASL/GSSAPI authentication, modify the Replicator JSON configuration to include the following: ```bash { "name":"replicator", "config":{ .... "dest.kafka.security.protocol" : "SASL_SSL", "dest.kafka.sasl.mechanism" : "GSSAPI", "dest.kafka.sasl.kerberos.service.name" : "kafka", "dest.kafka.sasl.jaas.config" : "com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab=\"/etc/security/keytabs/kafka_client.keytab\" principal=\"replicator@EXAMPLE.COM\";", .... } } } ``` Additionally, you can configure the following properties on the Connect worker: ```bash sasl.mechanism=GSSAPI security.protocol=SASL_SSL sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="/etc/security/keytabs/kafka_client.keytab" principal="replicator@EXAMPLE.COM"; sasl.kerberos.service.name=kafka producer.sasl.mechanism=GSSAPI producer.security.protocol=SASL_SSL producer.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="/etc/security/keytabs/kafka_client.keytab" principal="replicator@EXAMPLE.COM"; producer.sasl.kerberos.service.name=kafka ``` For more information, see the general security configuration for Connect workers in [Kafka Connect Security Basics](../../../../connect/security.md#connect-security), and [Replicator Security Overview](../../../../multi-dc-deployments/replicator/index.md#replicator-security-overview). ### Schema Registry Schema Registry uses Kafka to persist schemas, and so it acts as a client to write data to the Kafka cluster. Therefore, if the Kafka brokers are configured for security, you should also configure Schema Registry to use security. You may also refer to the complete list of [Schema Registry configuration options](../../../../schema-registry/installation/config.md#schemaregistry-config). 1. Here is an example subset of `schema-registry.properties` configuration parameters to add for SASL authentication: ```bash kafkastore.bootstrap.servers=kafka1:9093 # Configure SASL_SSL if TLS/SSL encryption is enabled, otherwise configure SASL_PLAINTEXT kafkastore.security.protocol=SASL_SSL kafkastore.sasl.mechanism=GSSAPI ``` 2. Since you are using GSSAPI, configure a service name that matches the primary name of the Kafka server configured in the broker JAAS file. ```bash kafkastore.sasl.kerberos.service.name=kafka ``` 3. Configure the JAAS configuration property with a unique principal, i.e., usually the same name as the user running Schema Registry, and keytab, i.e., secret key. 
   ```bash
   kafkastore.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \
      useKeyTab=true \
      storeKey=true \
      keyTab="/etc/security/keytabs/kafka_client.keytab" \
      principal="schemaregistry@EXAMPLE.COM";
   ```

If the validation is successful, brokers authenticate the incoming client.

1. After the client authentication is successful, the client principal is extracted from the token and used for authorization using RBAC or ACLs specified on the Confluent Platform cluster.
2. If the client principal extracted from the OAuth token has access policies specified for requested topic A, then the client request for access to topic A is successfully authorized.
3. After successful authentication and authorization, the Kafka client can proceed with Kafka operations, such as producing or consuming messages.

The same flow as described above (named the client credentials grant flow in the OpenID Connect specification and RFC 6749) is also triggered when OAuth/OIDC is enabled for authentication between clients (producers or consumers) and other Confluent Platform services, such as Schema Registry or REST Proxy, and when OAuth/OIDC is used to secure service-to-service communication between Confluent Platform services, for example, between Schema Registry and Confluent Server brokers.

The SASL/OAUTHBEARER authentication flow provides the following benefits:

* The client identities are hosted on an OIDC-compliant identity provider, enabling centralized identity management and streamlined authentication.
* The use of short-lived tokens enhances security.
* [Fine-grained](../../../../_glossary.md#term-granularity) access control, using the tokens to carry specific permissions.

#### Use the configuration properties file

If you have used the producer API, consumer API, or Streams API with Kafka clusters before, you know that the connectivity details for a cluster are specified with configuration properties. The administration tools that come with Kafka work the same way: after you define the configuration properties (often in a `config.properties` file), both applications and tools can use them to connect to clusters. When you create a configuration properties file in your user home directory, any subsequent command that you issue (be sure to include the path for the configuration file) reads that file and uses it to establish connectivity to the Kafka cluster.

The first thing you must do to interact with your Kafka clusters using the native Kafka tools is to generate a configuration properties file. The `--command-config` argument supplies these tools with the configuration properties they require to connect to the Kafka cluster, in the `.properties` file format. Typically, this includes the `security.protocol` that the cluster uses and any information necessary to authenticate to the cluster. For example:

```text
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="alice" \
  password="s3cr3t";
```
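To use the file, pass it to one of the native Kafka tools with `--command-config`. A minimal sketch, with the broker address and file path as placeholders:

```bash
# Uses the security settings from config.properties to connect and list topics
kafka-topics --bootstrap-server broker1:9092 \
  --command-config ~/config.properties \
  --list
```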
#### IMPORTANT

There is no guarantee that this naming pattern will continue in future releases, because it’s not part of the public API.

For best security, set only the minimum ACL operations. Allow only the following operations for the Kafka Streams principal:

- Topic resource (for internal topics): READ, DELETE, WRITE, CREATE
- Consumer Group resource: READ, DESCRIBE
- Topic resource (for input topics): READ
- Topic resource (for output topics): WRITE

For example, given the following setup of your Kafka Streams application:

- The `application.id` configuration value is `team1-streams-app1`.
- The application authenticates with the Kafka cluster as the `team1` user.
- The application’s coded topology reads from input topics `input-topic1` and `input-topic2`.
- The application’s topology writes to output topics `output-topic1` and `output-topic2`.
- The application has the exactly-once processing guarantee enabled (`processing.guarantee=exactly_once`).

The following commands create the necessary ACLs in the Kafka cluster to allow your application to operate.
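Here is a hedged sketch of such commands using the standard `kafka-acls` tool; the broker address and admin client configuration file are placeholders, and the prefixed patterns cover the internal topics and consumer groups that Kafka Streams derives from the `application.id`:

```bash
# Internal topics (prefixed with the application.id): READ, DELETE, WRITE, CREATE
kafka-acls --bootstrap-server broker1:9092 --command-config admin.properties \
  --add --allow-principal User:team1 \
  --operation Read --operation Delete --operation Write --operation Create \
  --resource-pattern-type prefixed --topic team1-streams-app1

# Consumer groups (also prefixed with the application.id): READ, DESCRIBE
kafka-acls --bootstrap-server broker1:9092 --command-config admin.properties \
  --add --allow-principal User:team1 \
  --operation Read --operation Describe \
  --resource-pattern-type prefixed --group team1-streams-app1

# Input topics: READ only
kafka-acls --bootstrap-server broker1:9092 --command-config admin.properties \
  --add --allow-principal User:team1 \
  --operation Read --topic input-topic1 --topic input-topic2

# Output topics: WRITE only
kafka-acls --bootstrap-server broker1:9092 --command-config admin.properties \
  --add --allow-principal User:team1 \
  --operation Write --topic output-topic1 --topic output-topic2

# With processing.guarantee=exactly_once, WRITE and DESCRIBE on a TransactionalId
# prefixed with the application.id are typically required as well.
```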
#### Steps

1. Log in to Confluent Cloud with the command `confluent login`, and use your Confluent Cloud username and password. To prevent being logged out, use the `--save` argument to save your Confluent Cloud user login credentials or refresh token (in the case of SSO) to your home profile.

   ```bash
   confluent login --save
   ```

2. Start the end-to-end example by running the provided script. This example uses the [ccloud-stack utility for Confluent Cloud](/cloud/current/examples/ccloud/docs/ccloud-stack.html) to automatically create a stack of fully managed services in Confluent Cloud. By default, the `ccloud-stack` utility creates resources in a new Confluent Cloud environment in cloud provider `aws` in region `us-west-2`. If you want to reuse an existing Confluent Cloud environment, or if `aws` and `us-west-2` are not the target provider and region, you may configure [other ccloud-stack options](/cloud/current/examples/ccloud/docs/ccloud-stack.html#ccloud-stack-options) before you run this example.

   ```bash
   ./start-ccloud.sh
   ```

3. After starting the example, the microservices applications will be running locally and your Confluent Cloud instance will have Kafka topics with data in them.

   ![image](images/microservices-exercises-combined.png)

4. Sample topic data by running the following command, substituting the name of your configuration file in the `stack-configs` folder (for example, `java-service-account-12345.config`).

   ```bash
   source delta_configs/env.delta; CONFIG_FILE=/opt/docker/stack-configs/java-service-account-.config ./read-topics-ccloud.sh
   ```

5. Explore the data with Elasticsearch and Kibana.

   ![image](images/elastic-search-kafka.png)

   Full-text search is added via an Elasticsearch database connected through Kafka’s Connect API ([source](https://www.confluent.io/designing-event-driven-systems)). View the Kibana dashboard at [http://localhost:5601/app/kibana#/dashboard/Microservices](http://localhost:5601/app/kibana#/dashboard/Microservices)

   ![image](images/kibana_microservices.png)

6. View and monitor the streaming applications. Use the [Confluent Cloud Console](http://confluent.cloud) to explore topics, consumers, Stream Lineage, and the ksqlDB application.

   ![image](images/stream-lineage.png)

7. View the ksqlDB flow screen for the `ORDERS_ENRICHED` stream to observe events occurring and examine the stream’s schema.

   ![image](images/ksqldb-orders-flow.png)

8. When you are done, make sure to stop the example before proceeding to the exercises. Run the command below, where the `java-service-account-.config` file matches the file in your `stack-configs` folder.

   ```bash
   ./stop-ccloud.sh stack-configs/java-service-account-sa-123456.config
   ```

## Change cluster settings for Dedicated clusters

The following table lists editable cluster settings for Dedicated clusters and their default parameter values.

| Parameter Name | Default | Editable | More Info |
|----------------|---------|----------|-----------|
| [auto.create.topics.enable](#topic-creation) | false | Yes | |
| [ssl.enabled.protocols](#manage-tls-protocols) | TLSv1.2 | Yes | Options: `TLSv1.2`, `TLSv1.3`, or both. |
| [ssl.cipher.suites](#restrict-ciphers) | “” | Yes | |
| [num.partitions](#default-partitions) | 6 | Yes | Limits vary, see: [Kafka Cluster Types in Confluent Cloud](cluster-types.md#cloud-cluster-types) |
| [log.cleaner.max.compaction.lag.ms](#lag-compaction) | 9223372036854775807 | Yes | Min: `21600000` ms |
| [log.retention.ms](#log-retention) | 604800000 | Yes | Set to -1 for Infinite Storage |

To modify these settings, use the [CLI](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/cluster/configuration/index.html#confluent-kafka-cluster-configuration) or the [Kafka REST APIs](https://docs.confluent.io/cloud/current/api.html#operation/updateKafkaClusterConfig). For more information, see [Get Started with Confluent CLI](https://docs.confluent.io/confluent-cli/current/overview.html) or [Kafka REST API Quick Start for Confluent Cloud](../kafka-rest/krest-qs.md#cloud-rest-api-quickstart). Changes to the settings are applied to your Confluent Cloud cluster without additional action on your part and are persistent until the setting is explicitly changed again.
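With the Confluent CLI, an update might look like the following sketch; the flag names are assumed from the `confluent kafka cluster configuration` command group linked above, and the cluster ID and retention value are only illustrative:

```bash
# Target the cluster, then change an editable setting (cluster ID and value are placeholders)
confluent kafka cluster use $CLUSTER_ID
confluent kafka cluster configuration update --config "log.retention.ms=259200000"

# Confirm the new value
confluent kafka cluster configuration list
```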
## Related content

- Learn how to use Confluent Cloud to create topics and produce and consume messages on a Kafka cluster with the [Quick Start for Confluent Cloud](../get-started/index.md#cloud-quickstart)
- For more information about Confluent CLI, see [confluent kafka cluster](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/cluster/index.html) in the Confluent CLI Command Reference
- For more information about Confluent Cloud APIs, see [Cluster API reference](https://docs.confluent.io/cloud/current/api.html#tag/Clusters-(cmkv2))
- For more information about creating a network using an API, see the [Networking API reference](https://docs.confluent.io/cloud/current/api.html#tag/Networks-(networkingv1))
- For more information about supported providers and regions, see [Cloud Providers and Regions for Confluent Cloud](regions.md#providers-regions)
- For more information about BYOK encrypted clusters, see [Protect Data at Rest Using Self-Managed Encryption Keys on Confluent Cloud](../security/encrypt/byok/overview.md#byok-encrypted-clusters)
- [How to use kafka-consumer-groups command with Confluent Cloud](https://support.confluent.io/hc/en-us/articles/360022562212) (Confluent Support)
- For cost estimates, see [Confluent Cost Estimator](https://www.confluent.io/pricing/cost-estimator/)
- For information about migrating from open source Kafka to Confluent Cloud, see the [Migrating from Kafka services to Confluent](https://assets.confluent.io/m/2745775bbd1fa224/original/20240425-EB-Migrating_From_Kafka_To_Confluent.pdf) PDF

## Quick Start

Use this quick start to get up and running with the Confluent Cloud AlloyDB sink connector. The quick start provides the basics of selecting the connector and configuring it to stream events to an AlloyDB database.

Prerequisites:

- Authorized access to a [Confluent Cloud](https://www.confluent.io/confluent-cloud/) cluster on Google Cloud.
- Authorized access to an AlloyDB database via [AlloyDB Auth Proxy](https://cloud.google.com/alloydb/docs/auth-proxy/connect) running on an intermediary VM accessible over a public IP.
- The database and Kafka cluster should be in the same region.
- For networking considerations, see [Networking and DNS](overview.md#connect-internet-access-resources). To use a set of public egress IP addresses, see [Public Egress IP Addresses for Confluent Cloud Connectors](static-egress-ip.md#cc-static-egress-ips).
- The Confluent CLI installed and configured for the cluster. See [Install the Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html).
- [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information.
- Kafka cluster credentials. The following lists the different ways you can provide credentials.
  - Enter an existing [service account](service-account.md#s3-cloud-service-account) resource ID.
  - Create a Confluent Cloud [service account](service-account.md#s3-cloud-service-account) for the connector. Make sure to review the ACL entries required in the [service account documentation](service-account.md#s3-cloud-service-account). Some connectors have specific ACL requirements.
  - Create a Confluent Cloud API key and secret. To create a key and secret, you can use [confluent api-key create](https://docs.confluent.io/confluent-cli/current/command-reference/api-key/confluent_api-key_create.html) *or* you can autogenerate the API key and secret directly in the Cloud Console when setting up the connector.

## Quick Start

Use this quick start to get up and running with the Confluent Cloud Amazon DynamoDB Sink connector. The quick start provides the basics of selecting the connector and configuring it to stream events to an Amazon DynamoDB database.

Prerequisites:

- Authorized access to a [Confluent Cloud](https://www.confluent.io/confluent-cloud/) cluster on Amazon Web Services.
- The Confluent CLI installed and configured for the cluster. See [Install the Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html).
- [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information.
- Authorized access to AWS and the Amazon DynamoDB database. For more information, see [DynamoDB IAM policy](#cc-dynamodb-policy).
- The database must be in the same region as your Confluent Cloud cluster.
- For networking considerations, see [Networking and DNS](overview.md#connect-internet-access-resources). To use a set of public egress IP addresses, see [Public Egress IP Addresses for Confluent Cloud Connectors](static-egress-ip.md#cc-static-egress-ips).
- Kafka cluster credentials. The following lists the different ways you can provide credentials.
  - Enter an existing [service account](service-account.md#s3-cloud-service-account) resource ID.
  - Create a Confluent Cloud [service account](service-account.md#s3-cloud-service-account) for the connector.
Make sure to review the ACL entries required in the [service account documentation](service-account.md#s3-cloud-service-account). Some connectors have specific ACL requirements. - Create a Confluent Cloud API key and secret. To create a key and secret, you can use [confluent api-key create](https://docs.confluent.io/confluent-cli/current/command-reference/api-key/confluent_api-key_create.html) *or* you can autogenerate the API key and secret directly in the Cloud Console when setting up the connector. # Process Data with Confluent Cloud for Apache Flink * [Overview](overview.md) * [Get Started](get-started/index.md) * [Overview](get-started/overview.md) * [Quick Start with Cloud Console](get-started/quick-start-cloud-console.md) * [Quick Start with SQL Shell in Confluent CLI](get-started/quick-start-shell.md) * [Quick Start with Java Table API](get-started/quick-start-java-table-api.md) * [Quick Start with Python Table API](get-started/quick-start-python-table-api.md) * [Concepts](concepts/index.md) * [Overview](concepts/overview.md) * [Autopilot](concepts/autopilot.md) * [Batch and Stream Processing](concepts/batch-and-stream-processing.md) * [Billing](concepts/flink-billing.md) * [Comparison with Apache Flink](concepts/comparison-with-apache-flink.md) * [Compute Pools](concepts/compute-pools.md) * [Delivery Guarantees and Latency](concepts/delivery-guarantees.md) * [Determinism](concepts/determinism.md) * [Private Networking](concepts/flink-private-networking.md) * [Schema and Statement Evolution](concepts/schema-statement-evolution.md) * [Snapshot Queries](concepts/snapshot-queries.md) * [Statements](concepts/statements.md) * [Statement CFU Metrics](concepts/statement-cfu-metrics.md) * [Tables and Topics](concepts/dynamic-tables.md) * [Time and Watermarks](concepts/timely-stream-processing.md) * [User-defined Functions](concepts/user-defined-functions.md) * [How-To Guides](how-to-guides/index.md) * [Overview](how-to-guides/overview.md) * [Aggregate a Stream in a Tumbling Window](how-to-guides/aggregate-tumbling-window.md) * [Combine Streams and Track Most Recent Records](how-to-guides/combine-and-track-most-recent-records.md) * [Compare Current and Previous Values in a Stream](how-to-guides/compare-current-and-previous-values.md) * [Convert the Serialization Format of a Topic](how-to-guides/convert-serialization-format.md) * [Create a UDF](how-to-guides/create-udf.md) * [Deduplicate Rows in a Table](how-to-guides/deduplicate-rows.md) * [Generate Custom Sample Data](how-to-guides/custom-sample-data.md) * [Handle Multiple Event Types](how-to-guides/multiple-event-types.md) * [Log Debug Messages in UDFs](how-to-guides/enable-udf-logging.md) * [Mask Fields in a Table](how-to-guides/mask-fields.md) * [Process Schemaless Events](how-to-guides/process-schemaless-events.md) * [Profile a Query](how-to-guides/profile-query.md) * [Resolve Statement Issues](how-to-guides/resolve-common-query-problems.md) * [Scan and Summarize Tables](how-to-guides/scan-and-summarize-tables.md) * [Run a Snapshot Query](how-to-guides/run-snapshot-query.md) * [Transform a Topic](how-to-guides/transform-topic.md) * [View Time Series Data](how-to-guides/view-time-series-data.md) * [Operate and Deploy](operate-and-deploy/index.md) * [Overview](operate-and-deploy/overview.md) * [Carry-over Offsets](operate-and-deploy/carry-over-offsets.md) * [Deploy a Statement with CI/CD](operate-and-deploy/deploy-flink-sql-statement.md) * [Enable Private Networking](operate-and-deploy/private-networking.md) * [Generate a Flink API 
Key](operate-and-deploy/generate-api-key-for-flink.md) * [Grant Role-Based Access](operate-and-deploy/flink-rbac.md) * [Manage Compute Pools](operate-and-deploy/create-compute-pool.md) * [Manage Connections](operate-and-deploy/manage-connections.md) * [Monitor and Manage Statements](operate-and-deploy/monitor-statements.md) * [Move SQL Statements to Production](operate-and-deploy/best-practices.md) * [Profile Queries](operate-and-deploy/query-profiler.md) * [REST API](operate-and-deploy/flink-rest-api.md) * [Flink Reference](reference/index.md) * [Overview](reference/overview.md) * [SQL Syntax](reference/sql-syntax.md) * [DDL Statements](reference/statements/index.md) * [DML Statements](reference/queries/index.md) * [Functions](reference/functions/index.md) * [Data Types](reference/datatypes.md) * [Data Type Mappings](reference/serialization.md) * [Time Zone](reference/timezone.md) * [Keywords](reference/keywords.md) * [Information Schema](reference/flink-sql-information-schema.md) * [Example Streams](reference/example-data.md) * [Supported Cloud Regions](reference/cloud-regions.md) * [SQL Examples](reference/sql-examples.md) * [Table API](reference/table-api.md) * [CLI Reference](reference/flink-sql-cli.md) * [Get Help](get-help.md) * [FAQ](flink-faq.md) ## [1/25/2023] Confluent CLI v3.0.0 Release Notes Confluent CLI is now [source-available](https://github.com/confluentinc/cli) under the Confluent Community License. For more details, check out the [Announcing the Source Available Confluent CLI](https://www.confluent.io/blog/announcing-the-source-available-confluent-cli/) blog post. **Breaking Changes** : - Default to HTTPS for on-prem login with `confluent login` - Require acknowledgment before deleting any resource, and use `--force` flag to skip acknowledgment - Use correct and consistent JSON and YAML formatting across all commands - Remove leading “v” from archive names, so the format matches binary names - Place asterisk in “Current” column in `confluent api-key list`, `confluent environment list`, and `confluent kafka list` - Delete `confluent ksql app` commands in favor of `confluent ksql cluster` commands - In `confluent iam rbac role-binding list`, combine “Service Name” and “Pool Name” into “Name” - In `confluent iam rbac role-binding list`, require the `--inclusive` flag to list role bindings in nested scopes - Move Connect cluster management commands under `confluent connect cluster` - Prevent using numeric IDs for `confluent kafka acl` commands - Print a table instead of a list in `confluent schema-registry compatibility validate` and `confluent schema-registry config describe` - Remove the “KAPI” field, “API Endpoint” field, and the corresponding `--all` flag from `confluent kafka cluster describe` - Remove shorthand flags: `-D`, `-P`, `-r`, `-S`, and `-V` - Rename “Exporter” to “exporter” in serialized output for `confluent schema-registry exporter list` - Rename “task” to “tasks” in `confluent connect cluster describe` - Rename `--current-env` to `--current-environment` - Rename `--no-auth` to `--no-authentication` - Rename `--operation` to `--operations` where appropriate - Rename `--refs` to `--references` - Rename `--show-refs` to `--show-references` - Rename `--sr-apikey` and `--sr-api-key` to `--schema-registry-api-key` - Rename `--sr-apisecret` and `--sr-api-secret` to `--schema-registry-api-secret` - Rename `--sr-endpoint` to `--schema-registry-endpoint` - Rename `confluent audit-log migrate config` to `confluent audit-log config migrate` - Rename `confluent kafka 
link describe` to `confluent kafka link configuration list` - Rename `confluent kafka partition get-reassignments` to `confluent kafka partition reassigments list` - Rename values for `confluent price list --cluster-type` - Replace all instances of “First Name” and “Last Name” with “Full Name” - Require `--environment` in `confluent schema-registry cluster delete` **New Features** : - Use Kafka REST for all `confluent kafka` commands - Remove login requirement for `confluent secret` commands - Add “Read Only” column to `confluent kafka topic describe` - Add detailed Kafka REST examples for `confluent kafka acl` commands # Get Started with Confluent Cloud for Government Confluent Cloud for Government is a data streaming service based on Apache Kafka® and delivered as a fully-managed, cloud-native service. Use Confluent Cloud for Government to collect real-time data from multiple sources and put it in motion. Confluent Cloud for Government provides multiple interfaces to manage your data streams, including a web interface, a command-line interface, and APIs. Use this quick start to create a Kafka [cluster](../_glossary.md#term-Kafka-cluster) and a [topic](../_glossary.md#term-Kafka-topic). Connect a self-managed data source to the cluster and [produce](../_glossary.md#term-producer) data to the topic. Then use the Cloud Console to review the incoming data, similar to how a self-managed client would [consume](../_glossary.md#term-consumer) the data you’re producing. This quick start lists all the interfaces available to complete a step, but you should only use the interface that best fits your needs. ## Control Center resource access by role All of the other components besides Control Center (Kafka, Schema Registry, Connect, ksqlDB) are being enforced with RBAC by those components themselves. The only resources that Control Center directly enforces with RBAC are: - [License management](../installation/license.md#controlcenter-licenses) (global operation on Confluent Platform). - [Broker metrics](../brokers.md#controlcenter-userguide-brokers) (system health per cluster). - [Alerts](#c3-rbac-alerts-access) (global view for all clusters in a Control Center instance). | Role Scope | License management | Broker metrics | Alerts | |---------------|--------------------------|------------------|--------------------------| | SystemAdmin | Yes [1](#id3) | Yes | Yes | | ClusterAdmin | No | Yes | Yes | | Operator | No | Yes | Yes | | ResourceOwner | No | No | Yes [2](#id4) | * **[1]** To access license management, the `SystemAdmin` role must be granted on the Kafka cluster running MDS. * **[2]** Resource owners on a topic or consumer group can create a trigger or an action for that resource. They can also view the fired alert in the Alerts History and the Alerts REST API pages. #### IMPORTANT Starting with Confluent Platform version 8.0, ZooKeeper is no longer part of Confluent Platform. 
| Confluent Component | Management Aspect | Declarative API (CRD) | Confluent CLI | Confluent REST API | |---------------------------------------------------------------------------|-------------------------------|------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------| | Kafka | Create, update, delete topics | kafkatopic CRD | [kafka topic](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/topic/index.html), kafka-topics.sh, kafka-configs.sh | [Topic](https://docs.confluent.io/platform/current/kafka-rest/api.html#topic-v3) | | Kafka | Simple ACLs | N/A | [kafka acl](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/acl/index.html) | [ACL](https://docs.confluent.io/platform/current/kafka-rest/api.html#acl-v3) | | Kafka | Delete, config update brokers | kafka CRD | [kafka broker](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/broker/index.html) | [Configs](https://docs.confluent.io/platform/current/kafka-rest/api.html#configs-v3) | | Kafka | Cluster Linking | clusterlink CRD | [kafka link](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/broker/index.html) | [Cluster Linking](https://docs.confluent.io/platform/current/kafka-rest/api.html#cluster-linking-v3) | | Kafka | Mirror topics | clusterlink CRD | [kafka mirror](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/mirror/index.html) | [Create a cluster link](https://docs.confluent.io/cloud/current/multi-cloud/cluster-linking/cluster-links-cc.html#creating-a-cluster-link-through-the-rest-api) | | Kafka | View partitions | N/A | [kafka partition](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/partition/index.html) | [Partition](https://docs.confluent.io/platform/current/kafka-rest/api.html#partition-v3) | | Kafka | Manage partition assignment | Enable Self-Balancing Cluster capability | [kafka-reassign-partitions.sh](https://cwiki.apache.org/confluence/display/kafka/replication+tools#Replicationtools-4.ReassignPartitionsTool) | [Partition](https://docs.confluent.io/platform/current/kafka-rest/api.html#partition-v3) | | Kafka | Manage consumer groups | N/A | [kafka-consumer-groups.sh](https://docs.confluent.io/platform/current/clients/consumer.html#ak-consumer-group-command-tool) | [Consumer Group](https://docs.confluent.io/platform/current/kafka-rest/api.html#consumer-group-v3) | | Schema Registry | Schemas | Schema CRD | [schema-registry schema](https://docs.confluent.io/confluent-cli/current/command-reference/schema-registry/schema/) | [Schema Registry](https://docs.confluent.io/platform/current/schema-registry/develop/api.html) | | Schema Registry | Link schemas | SchemaExporter CRD | [schema-registry exporter](https://docs.confluent.io/confluent-cli/current/command-reference/schema-registry/exporter/) | [Schema Registry](https://docs.confluent.io/platform/current/schema-registry/schema-linking-cp.html#rest-apis) | | Connect | Connectors | Connector CRD | [connect plugin](https://docs.confluent.io/confluent-cli/current/command-reference/connect/plugin/) | [Kafka Connect](https://docs.confluent.io/platform/current/connect/references/restapi.html) | | Kafka / MDS | Manage RBAC | ConfluentRoleBinding CRD | [iam 
rbac](https://docs.confluent.io/confluent-cli/current/command-reference/iam/rbac/) | [Confluent Metadata](https://docs.confluent.io/platform/current/security/rbac/mds-api.html) | | ZooKeeper, KRaft, Kafka, Schema Registry, ksqlDB, Connect, Control Center | Restart components | Component CRDs | N/A | N/A | | ZooKeeper, KRaft, Kafka, Schema Registry, ksqlDB, Connect, Control Center | Delete deployments | Component CRDs | N/A | N/A | # Confluent APIs for Confluent Platform The Confluent APIs listed in this topic are a set of software interfaces that enable you to interact with the enterprise features of Confluent Platform. You can use these APIs to build applications that can consume, produce, and process data in real-time. Following are the APIs for Confluent Platform: - [Confluent REST Proxy](../kafka-rest/index.md#kafkarest-intro) - [Connect REST API](../connect/references/restapi.md#connect-userguide-rest) - [Flink REST API](../flink/clients-api/rest.md#af-rest-api) - [Schema Registry API](../schema-registry/develop/api.md#schemaregistry-api) - [ksqlDB REST API](../ksqldb/developer-guide/ksqldb-rest-api/overview.md#ksqldb-rest-api) - [Metadata API](../security/authorization/rbac/mds-api.md#mds-api) Note that before using these APIs, you should review the [Confluent Public API Terms of Use](https://www.confluent.io/legal/confluent-public-api-terms-of-use/). # Manage Self-Balancing Clusters * [Overview](index.md) * [How Self-Balancing simplifies Kafka operations](index.md#how-sbc-simplifies-ak-operations) * [Self-Balancing vs. Auto Data Balancer](index.md#sbc-vs-adb) * [How it works](index.md#how-it-works) * [Architecture of a Self-Balancing cluster](index.md#architecture-of-a-sbc-cluster) * [Enabling Self-Balancing Clusters](index.md#enabling-sbc-long) * [What defines a “balanced” cluster and what triggers a rebalance?](index.md#what-defines-a-balanced-cluster-and-what-triggers-a-rebalance) * [What happens if the lead broker (controller) is removed or lost?](index.md#what-happens-if-the-lead-broker-controller-is-removed-or-lost) * [How do the brokers leverage Cruise Control?](index.md#how-do-the-brokers-leverage-cruise-control) * [What internal topics does the Self-Balancing Clusters feature create and use?](index.md#what-internal-topics-does-the-sbc-long-feature-create-and-use) * [Limitations](index.md#limitations) * [Configuration and monitoring](index.md#configuration-and-monitoring) * [Getting status on the balancer](index.md#getting-status-on-the-balancer) * [Using Control Center](index.md#using-c3-short) * [Kafka server properties and commands](index.md#ak-server-properties-and-commands) * [Metrics for monitoring a rebalance](index.md#metrics-for-monitoring-a-rebalance) * [Replica placement and rack configurations](index.md#replica-placement-and-rack-configurations) * [Racks](index.md#racks) * [Replica placement and multi-region clusters](index.md#replica-placement-and-multi-region-clusters) * [Capacity](index.md#capacity) * [Distribution](index.md#distribution) * [Debugging rebalance failures](index.md#debugging-rebalance-failures) * [Security considerations](index.md#security-considerations) * [Troubleshooting](index.md#troubleshooting) * [Self-Balancing options do not show up on Control Center](index.md#sbc-options-do-not-show-up-on-c3-short) * [Broker metrics are not displayed on Control Center](index.md#broker-metrics-are-not-displayed-on-c3-short) * [Consumer lag reflected on Control Center](index.md#consumer-lag-reflected-on-c3-short) * [Broker removal attempt fails during 
Self-Balancing initialization](index.md#broker-removal-attempt-fails-during-sbc-initialization) * [Broker removal cannot complete due to offline partitions](index.md#broker-removal-cannot-complete-due-to-offline-partitions) * [Too many excluded topics causes problems with Self-Balancing](index.md#too-many-excluded-topics-causes-problems-with-sbc) * [The balancer status for a KRaft controller hangs](index.md#the-balancer-status-for-a-kraft-controller-hangs) * [Related content](index.md#related-content) * [Tutorial: Adding and Remove Brokers](sbc-tutorial.md) * [Configuring and starting controllers and brokers in KRaft mode](sbc-tutorial.md#configuring-and-starting-controllers-and-brokers-in-kraft-mode) * [Prerequisites](sbc-tutorial.md#prerequisites) * [Environment variables](sbc-tutorial.md#environment-variables) * [Configure Kafka brokers](sbc-tutorial.md#configure-ak-brokers) * [Create broker-0 to use a template for the other brokers](sbc-tutorial.md#create-broker-0-to-use-a-template-for-the-other-brokers) * [Enable the Metrics Reporter for Control Center](sbc-tutorial.md#enable-the-cmetric-for-c3-short) * [Configure replication factors for Self-Balancing](sbc-tutorial.md#configure-replication-factors-for-sbc) * [Verify that Self-Balancing is enabled](sbc-tutorial.md#verify-that-sbc-is-enabled) * [Save the file](sbc-tutorial.md#save-the-file) * [Create a basic configuration for a five-broker cluster](sbc-tutorial.md#create-a-basic-configuration-for-a-five-broker-cluster) * [Start Confluent Platform, create topics, and generate test data](sbc-tutorial.md#start-cp-create-topics-and-generate-test-data) * [Start the controller and brokers](sbc-tutorial.md#start-the-controller-and-brokers) * [Create a topic and test the cluster](sbc-tutorial.md#create-a-topic-and-test-the-cluster) * [What’s next](sbc-tutorial.md#what-s-next) * [(Optional) Install and configure Confluent Control Center](sbc-tutorial.md#optional-install-and-configure-c3) * [1. Download, extract, and configure Control Center](sbc-tutorial.md#download-extract-and-configure-c3-short) * [2. Configure Control Center with REST endpoints and advertised listeners](sbc-tutorial.md#configure-c3-short-with-rest-endpoints-and-advertised-listeners) * [3. Start Prometheus and (Control Center)](sbc-tutorial.md#start-prometheus-and-c3-short) * [4. Configure the controller and brokers to send metrics to Control Center with Prometheus](sbc-tutorial.md#configure-the-controller-and-brokers-to-send-metrics-to-c3-short-with-prometheus) * [5. 
Restart the controller and brokers](sbc-tutorial.md#restart-the-controller-and-brokers) * [Use the command line to test rebalancing](sbc-tutorial.md#use-the-command-line-to-test-rebalancing) * [List topics and generate data to your test topic](sbc-tutorial.md#list-topics-and-generate-data-to-your-test-topic) * [Verify status of brokers and topic data](sbc-tutorial.md#verify-status-of-brokers-and-topic-data) * [Remove a broker](sbc-tutorial.md#remove-a-broker) * [Add a broker (restart)](sbc-tutorial.md#add-a-broker-restart) * [Use Control Center to test rebalancing](sbc-tutorial.md#use-c3-short-to-test-rebalancing) * [Verify status of brokers and topic data](sbc-tutorial.md#id1) * [Remove a broker](sbc-tutorial.md#sbc-tutorial-c3-remove-broker) * [Add a broker (restart)](sbc-tutorial.md#id3) * [Shutdown and cleanup tasks](sbc-tutorial.md#shutdown-and-cleanup-tasks) * [(Optional) Running the other components](sbc-tutorial.md#optional-running-the-other-components) * [Related content](sbc-tutorial.md#related-content) * [Configure](configuration-options.md) * [Self-Balancing configuration](configuration-options.md#sbc-configuration) * [confluent.balancer.enable](configuration-options.md#confluent-balancer-enable) * [confluent.balancer.heal.uneven.load.trigger](configuration-options.md#confluent-balancer-heal-uneven-load-trigger) * [confluent.balancer.heal.broker.failure.threshold.ms](configuration-options.md#confluent-balancer-heal-broker-failure-threshold-ms) * [confluent.balancer.throttle.bytes.per.second](configuration-options.md#confluent-balancer-throttle-bytes-per-second) * [confluent.balancer.disk.max.load](configuration-options.md#confluent-balancer-disk-max-load) * [confluent.balancer.max.replicas](configuration-options.md#confluent-balancer-max-replicas) * [confluent.balancer.exclude.topic.names](configuration-options.md#confluent-balancer-exclude-topic-names) * [confluent.balancer.exclude.topic.prefixes](configuration-options.md#confluent-balancer-exclude-topic-prefixes) * [confluent.balancer.topic.replication.factor](configuration-options.md#confluent-balancer-topic-replication-factor) * [Self-Balancing internal topics](configuration-options.md#sbc-internal-topics) * [Required Configurations for Control Center](configuration-options.md#required-configurations-for-c3-short) * [Configure REST Endpoints in the Control Center properties file](configuration-options.md#configure-rest-endpoints-in-the-c3-short-properties-file) * [Configure authentication for REST endpoints on Kafka brokers (Secure Setup)](configuration-options.md#configure-authentication-for-rest-endpoints-on-ak-brokers-secure-setup) * [Examples: Update broker configurations on the fly](configuration-options.md#examples-update-broker-configurations-on-the-fly) * [Enable or disable Self-Balancing](configuration-options.md#enable-or-disable-sbc) * [Set trigger condition for rebalance](configuration-options.md#set-trigger-condition-for-rebalance) * [Set or remove a custom throttle](configuration-options.md#set-or-remove-a-custom-throttle) * [Monitoring the balancer with kafka-rebalance-cluster](configuration-options.md#monitoring-the-balancer-with-kafka-rebalance-cluster) * [Get the balancer status](configuration-options.md#get-the-balancer-status) * [Get the workload optimization status (AnyUnevenLoad)](configuration-options.md#get-the-workload-optimization-status-anyunevenload) * [kafka-remove-brokers](configuration-options.md#kafka-remove-brokers) * [Flags](configuration-options.md#flags) * 
[Examples](configuration-options.md#examples) * [Broker removal phases](configuration-options.md#broker-removal-phases) * [Self-Balancing initialization](configuration-options.md#sbc-initialization) * [Broker removal task priority](configuration-options.md#broker-removal-task-priority) * [Related content](configuration-options.md#related-content) * [Performance and Resource Usage](performance.md) * [Add brokers to expand a small cluster with a high partition count](performance.md#add-brokers-to-expand-a-small-cluster-with-a-high-partition-count) * [Test Description](performance.md#test-description) * [Cluster Configurations](performance.md#cluster-configurations) * [Performance Results](performance.md#performance-results) * [Test scalability of a large cluster with many partitions](performance.md#test-scalability-of-a-large-cluster-with-many-partitions) * [Test Description](performance.md#id1) * [Cluster Configurations](performance.md#id2) * [Performance Results](performance.md#id3) * [Repeatedly bounce the controller](performance.md#repeatedly-bounce-the-controller) * [Test Description](performance.md#id4) * [Cluster Configurations](performance.md#id5) * [Performance Results](performance.md#id6) * [Related content](performance.md#related-content) ## License types The following table lists the Kafka and Confluent features and whether they are covered under the [Enterprise license](#cp-enterprise-subs-license) , a [Community license](https://www.confluent.io/confluent-community-license/) or an [Apache Kafka 2.0 license](https://github.com/apache/kafka/blob/trunk/LICENSE). For more information, see the [Community license FAQ](https://www.confluent.io/confluent-community-license-faq/).
| License | Features |
|---------|----------|
| Confluent Enterprise License for Confluent Platform subscription | Auto Data Balancer, Confluent for Kubernetes, Confluent Replicator, Confluent Server Cluster Linking, Multi-Region Clusters, Role-based Access Control, Schema Registry Security Plug-in, Schema Validation, Secrets Protection, Self-Balancing Clusters, Structured Audit Logs, Tiered Storage, Control Center, Health+, Kafka Connect Commercial Connectors, Premium Connectors, MQTT Proxy, Schema Linking, Confluent Platform for Apache Flink |
| Confluent Enterprise License for Customer-managed Confluent Platform for Confluent Cloud subscription | Kafka Connect Worker Commercial Connectors, Control Center, Confluent for Kubernetes, Replicator |
| Confluent Community License | Admin REST API, Confluent CLI, Kafka Connect Community-licensed Connectors, ksqlDB, REST Proxy, Schema Registry |
| Apache 2.0 License | Apache Kafka (with Connect and Streams), Ansible Playbooks, Apache ZooKeeper, Confluent Clients, Open Source Connectors |
## Stream `bootstrap.servers` : A list of host/port pairs to use for establishing the initial connection to the Apache Kafka® cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form host1:port1,host2:port2,…. Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down). * Type: string * Importance: high `topic.regex.list` : A comma-separated list of pairs of type `:` that is used to map MQTT topics to Kafka topics. * Type: list * Valid Values: A list of pairs in the form `:, :, ...` * Importance: high `stream.threads.num` : Number of threads publishing records to Kafka * Type: int * Default: 1 * Valid Values: [1,…] * Importance: high `producer.buffer.memory` : The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server the producer will block for max.block.ms after which it will throw an exception.This setting should correspond roughly to the total memory the producer will use, but is not a hard bound since not all memory the producer uses is used for buffering. Some additional memory will be used for compression (if compression is enabled) as well as for maintaining in-flight requests. * Type: long * Default: 33554432 * Valid Values: [0,…] * Importance: high `producer.compression.type` : The compression type for all data generated by the producer. The default is none (i.e. no compression). Valid values are none, gzip, snappy, or lz4. Compression is of full batches of data, so the efficacy of batching will also impact the compression ratio (more batching means better compression). * Type: string * Default: none * Importance: high `producer.batch.size` : The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This helps performance on both the client and the server. This configuration controls the default batch size in bytes. No attempt will be made to batch records larger than this size. Requests sent to brokers will contain multiple batches, one for each partition with data available to be sent. A small batch size will make batching less common and may reduce throughput (a batch size of zero will disable batching entirely). A very large batch size may use memory a bit more wastefully as we will always allocate a buffer of the specified batch size in anticipation of additional records. * Type: int * Default: 16384 * Valid Values: [0,…] * Importance: medium `producer.linger.ms` : The producer groups together any records that arrive in between request transmissions into a single batched request. Normally this occurs only under load when records arrive faster than they can be sent out. However in some circumstances the client may want to reduce the number of requests even under moderate load. This setting accomplishes this by adding a small amount of artificial delay—that is, rather than immediately sending out a record the producer will wait for up to the given delay to allow other records to be sent so that the sends can be batched together. This can be thought of as analogous to Nagle’s algorithm in TCP. 
This setting gives the upper bound on the delay for batching: once we get `batch.size` worth of records for a partition it will be sent immediately regardless of this setting, however if we have fewer than this many bytes accumulated for this partition we will ‘linger’ for the specified time waiting for more records to show up. This setting defaults to 0 (i.e. no delay). Setting linger.ms=5, for example, would have the effect of reducing the number of requests sent but would add up to 5ms of latency to records sent in the absence of load. * Type: long * Default: 0 * Valid Values: [0,…] * Importance: medium `producer.client.id` : An id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging. * Type: string * Default: “” * Importance: medium `producer.send.buffer.bytes` : The size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used. * Type: int * Default: 131072 * Valid Values: [-1,…] * Importance: medium `producer.receive.buffer.bytes` : The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used. * Type: int * Default: 32768 * Valid Values: [-1,…] * Importance: medium `producer.max.request.size` : The maximum size of a request in bytes. This setting will limit the number of record batches the producer will send in a single request to avoid sending huge requests. This is also effectively a cap on the maximum record batch size. Note that the server has its own cap on record batch size which may be different from this. * Type: int * Default: 1048576 * Valid Values: [0,…] * Importance: medium `producer.reconnect.backoff.ms` : The base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker. * Type: long * Default: 50 * Valid Values: [0,…] * Importance: low `producer.reconnect.backoff.max.ms` : The maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms. * Type: long * Default: 1000 * Valid Values: [0,…] * Importance: low `producer.max.block.ms` : The configuration controls how long KafkaProducer.send() and KafkaProducer.partitionsFor() will block. These methods can be blocked either because the buffer is full or metadata unavailable. Blocking in the user-supplied serializers or partitioner will not be counted against this timeout. * Type: long * Default: 60000 * Valid Values: [0,…] * Importance: medium `producer.request.timeout.ms` : The configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted. This should be larger than replica.lag.time.max.ms (a broker configuration) to reduce the possibility of message duplication due to unnecessary producer retries.
* Type: int * Default: 30000 * Valid Values: [0,…] * Importance: medium `producer.metadata.max.age.ms` : The period of time in milliseconds after which we force a refresh of metadata even if we haven’t seen any partition leadership changes to proactively discover any new brokers or partitions. * Type: long * Default: 300000 * Valid Values: [0,…] * Importance: low `producer.metrics.sample.window.ms` : The window of time a metrics sample is computed over. * Type: long * Default: 30000 * Valid Values: [0,…] * Importance: low `producer.metrics.num.samples` : The number of samples maintained to compute metrics. * Type: int * Default: 2 * Valid Values: [1,…] * Importance: low `producer.metrics.recording.level` : The highest recording level for metrics. * Type: string * Default: INFO * Valid Values: [INFO, DEBUG] * Importance: low `producer.metric.reporters` : A list of classes to use as metrics reporters. Implementing the org.apache.kafka.common.metrics.MetricsReporter interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics. * Type: list * Default: “” * Valid Values: non-null list * Importance: low `producer.max.in.flight.requests.per.connection` : The maximum number of unacknowledged requests the client will send on a single connection before blocking. Note that if this setting is set to be greater than 1 and there are failed sends, there is a risk of message re-ordering due to retries (i.e., if retries are enabled). * Type: int * Default: 5 * Valid Values: [1,…] * Importance: low `producer.connections.max.idle.ms` : Close idle connections after the number of milliseconds specified by this config. * Type: long * Default: 540000 * Importance: medium `producer.partitioner.class` : Partitioner class that implements the org.apache.kafka.clients.producer.Partitioner interface. * Type: class * Default: org.apache.kafka.clients.producer.internals.DefaultPartitioner * Importance: medium `producer.interceptor.classes` : A list of classes to use as interceptors. Implementing the org.apache.kafka.clients.producer.ProducerInterceptor interface allows you to intercept (and possibly mutate) the records received by the producer before they are published to the Kafka cluster. By default, there are no interceptors. * Type: list * Default: “” * Valid Values: non-null list * Importance: low `producer.security.protocol` : Protocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL. * Type: string * Default: PLAINTEXT * Importance: medium `producer.ssl.protocol` : The TLS protocol used to generate the SSLContext. The default is `TLSv1.3` when running with Java 11 or newer, `TLSv1.2` otherwise. This value should be fine for most use cases. Allowed values in recent JVMs are `TLSv1.2` and `TLSv1.3`. `TLS`, `TLSv1.1`, `SSL`, `SSLv2` and `SSLv3` might be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities. With the default value for this configuration and `ssl.enabled.protocols`, clients downgrade to `TLSv1.2` if the server does not support `TLSv1.3`.
If this configuration is set to `TLSv1.2`, clients do not use `TLSv1.3`, even if it is one of the values in `ssl.enabled.protocols` and the server only supports `TLSv1.3`. * Type: string * Default: `TLSv1.3` * Importance: medium `producer.ssl.provider` : The name of the security provider used for TLS connections. Default value is the default security provider of the JVM. * Type: string * Default: null * Importance: medium `producer.ssl.cipher.suites` : A list of cipher suites. This is a named combination of authentication, encryption, MAC, and key exchange algorithms used to negotiate the security settings for a network connection using TLS. By default, all the available cipher suites are supported. * Type: list * Default: null * Importance: low `producer.ssl.enabled.protocols` : The comma-separated list of protocols enabled for TLS connections. The default value is `TLSv1.2,TLSv1.3` when running with Java 11 or later, `TLSv1.2` otherwise. With the default value for Java 11 (`TLSv1.2,TLSv1.3`), Kafka clients and brokers prefer TLSv1.3 if both support it, and falls back to TLSv1.2 otherwise (assuming both support at least TLSv1.2). * Type: list * Default: `TLSv1.2,TLSv1.3` * Importance: medium `producer.ssl.keystore.type` : The file format of the key store file. This is optional for client. * Type: string * Default: JKS * Importance: medium `producer.ssl.keystore.location` : The location of the key store file. This is optional for client and can be used for two-way client authentication. * Type: string * Default: null * Importance: high `producer.ssl.keystore.password` : The store password for the key store file. This is optional for client and only needed if ssl.keystore.location is configured. * Type: password * Default: null * Importance: high `producer.ssl.key.password` : The password of the private key in the key store file. This is optional for client. * Type: password * Default: null * Importance: high `producer.ssl.truststore.type` : The file format of the trust store file. * Type: string * Default: JKS * Importance: medium `producer.ssl.truststore.location` : The location of the trust store file. * Type: string * Default: null * Importance: high `producer.ssl.truststore.password` : The password for the trust store file. If a password is not set access to the truststore is still available, but integrity checking is disabled. * Type: password * Default: null * Importance: high `producer.ssl.keymanager.algorithm` : The algorithm used by key manager factory for TLS connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine. * Type: string * Default: SunX509 * Importance: low `producer.ssl.trustmanager.algorithm` : The algorithm used by trust manager factory for TLS connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine. * Type: string * Default: PKIX * Importance: low `producer.ssl.endpoint.identification.algorithm` : The endpoint identification algorithm to validate server hostname using server certificate. * Type: string * Default: https * Importance: low `producer.ssl.secure.random.implementation` : The SecureRandom PRNG implementation to use for TLS cryptography operations. * Type: string * Default: null * Importance: low `producer.sasl.kerberos.service.name` : The Kerberos principal name that Kafka runs as. This can be defined either in Kafka’s JAAS config or in Kafka’s config. 
* Type: string * Default: null * Importance: medium `producer.sasl.kerberos.kinit.cmd` : Kerberos kinit command path. * Type: string * Default: /usr/bin/kinit * Importance: low `producer.sasl.kerberos.ticket.renew.window.factor` : Login thread will sleep until the specified window factor of time from last refresh to ticket’s expiry has been reached, at which time it will try to renew the ticket. * Type: double * Default: 0.8 * Importance: low `producer.sasl.kerberos.ticket.renew.jitter` : Percentage of random jitter added to the renewal time. * Type: double * Default: 0.05 * Importance: low `producer.sasl.kerberos.min.time.before.relogin` : Login thread sleep time between refresh attempts. * Type: long * Default: 60000 * Importance: low `producer.sasl.login.refresh.window.factor` : Login refresh thread will sleep until the specified window factor relative to the credential’s lifetime has been reached, at which time it will try to refresh the credential. Legal values are between 0.5 (50%) and 1.0 (100%) inclusive; a default value of 0.8 (80%) is used if no value is specified. Currently applies only to OAUTHBEARER. * Type: double * Default: 0.8 * Valid Values: [0.5,…,1.0] * Importance: low `producer.sasl.login.refresh.window.jitter` : The maximum amount of random jitter relative to the credential’s lifetime that is added to the login refresh thread’s sleep time. Legal values are between 0 and 0.25 (25%) inclusive; a default value of 0.05 (5%) is used if no value is specified. Currently applies only to OAUTHBEARER. * Type: double * Default: 0.05 * Valid Values: [0.0,…,0.25] * Importance: low `producer.sasl.login.refresh.min.period.seconds` : The desired minimum time for the login refresh thread to wait before refreshing a credential, in seconds. Legal values are between 0 and 900 (15 minutes); a default value of 60 (1 minute) is used if no value is specified. This value and sasl.login.refresh.buffer.seconds are both ignored if their sum exceeds the remaining lifetime of a credential. Currently applies only to OAUTHBEARER. * Type: short * Default: 60 * Valid Values: [0,…,900] * Importance: low `producer.sasl.login.refresh.buffer.seconds` : The amount of buffer time before credential expiration to maintain when refreshing a credential, in seconds. If a refresh would otherwise occur closer to expiration than the number of buffer seconds then the refresh will be moved up to maintain as much of the buffer time as possible. Legal values are between 0 and 3600 (1 hour); a default value of 300 (5 minutes) is used if no value is specified. This value and sasl.login.refresh.min.period.seconds are both ignored if their sum exceeds the remaining lifetime of a credential. Currently applies only to OAUTHBEARER. * Type: short * Default: 300 * Valid Values: [0,…,3600] * Importance: low `producer.sasl.mechanism` : SASL mechanism used for client connections. This may be any mechanism for which a security provider is available. GSSAPI is the default mechanism. * Type: string * Default: GSSAPI * Importance: medium `producer.sasl.jaas.config` : JAAS login context parameters for SASL connections in the format used by JAAS configuration files. JAAS configuration file format is described here. The format for the value is: ‘loginModuleClass controlFlag (optionName=optionValue)\*;’. For brokers, the config must be prefixed with listener prefix and SASL mechanism name in lower-case. 
For example, listener.name.sasl_ssl.scram-sha-256.sasl.jaas.config=com.example.ScramLoginModule required; * Type: password * Default: null * Importance: medium `producer.sasl.client.callback.handler.class` : The fully qualified name of a SASL client callback handler class that implements the AuthenticateCallbackHandler interface. * Type: class * Default: null * Importance: medium `producer.sasl.login.callback.handler.class` : The fully qualified name of a SASL login callback handler class that implements the AuthenticateCallbackHandler interface. For brokers, login callback handler config must be prefixed with listener prefix and SASL mechanism name in lower-case. For example, listener.name.sasl_ssl.scram-sha-256.sasl.login.callback.handler.class=com.example.CustomScramLoginCallbackHandler * Type: class * Default: null * Importance: medium `producer.sasl.login.class` : The fully qualified name of a class that implements the Login interface. For brokers, login config must be prefixed with listener prefix and SASL mechanism name in lower-case. For example, listener.name.sasl_ssl.scram-sha-256.sasl.login.class=com.example.CustomScramLogin * Type: class * Default: null * Importance: medium

## Features

The following functionality is currently exposed and available through Confluent REST APIs. * **Metadata** - Most metadata about the cluster – brokers, topics, partitions, and configs – can be read using `GET` requests for the corresponding URLs. * **Producers** - Instead of exposing producer objects, the API accepts produce requests targeted at specific topics or partitions and routes them all through a small pool of producers. * Producer configuration - Producer instances are shared, so configs cannot be set on a per-request basis. However, you can adjust settings globally by passing new producer settings in the REST Proxy configuration. For example, you might pass in the `compression.type` option to enable site-wide compression to reduce storage and network overhead. * **Consumers** - Consumers are stateful and therefore tied to specific REST Proxy instances. Offset commit can be either automatic or explicitly requested by the user. Currently limited to one thread per consumer; use multiple consumers for higher throughput. The REST Proxy uses either the high level consumer (v1 api) or the new 0.9 consumer (v2 api) to implement consumer-groups that can read from topics. Note: the v1 API has been marked for deprecation. * Consumer configuration - Although consumer instances are not shared, they do share the underlying server resources. Therefore, limited configuration options are exposed via the API. However, you can adjust settings globally by passing consumer settings in the REST Proxy configuration. * **Data Formats** - The REST Proxy can read and write data using JSON, raw bytes encoded with base64 or using JSON-encoded Avro, Protobuf, or JSON Schema. With Avro, Protobuf, or JSON Schema, schemas are registered and validated against Schema Registry. * **REST Proxy Clusters and Load Balancing** - The REST Proxy is designed to support multiple instances running together to spread load and can safely be run behind various load balancing mechanisms (e.g. round robin DNS, discovery services, load balancers) as long as instances are [configured correctly](production-deployment/rest-proxy/index.md#kafkarest-deployment). * **Simple Consumer** - The high-level consumer should generally be preferred.
However, it is occasionally useful to use low-level read operations, for example to retrieve messages at specific offsets.

# ksqlDB for Confluent Platform Java Client

ksqlDB ships with a lightweight Java client that enables sending requests easily to a ksqlDB server from within your Java application, as an alternative to using the [REST API](../ksqldb-rest-api/rest-api-reference.md#ksqldb-rest-api-reference). The client supports pull and push queries; inserting new rows of data into existing ksqlDB streams; creation and management of new streams, tables, and persistent queries; and also admin operations such as listing streams, tables, and topics. The client sends requests to the HTTP2 server endpoints. Pull and push queries are served by the [/query-stream endpoint](../ksqldb-rest-api/streaming-endpoint.md#ksqldb-rest-api-query-stream-endpoint), and inserts are served by the [/inserts-stream endpoint](../ksqldb-rest-api/streaming-endpoint.md#ksqldb-rest-api-query-stream-endpoint-inserting-rows). All other requests are served by the [/ksql endpoint](../ksqldb-rest-api/ksql-endpoint.md#ksqldb-rest-api-ksql-endpoint). The client is compatible only with ksqlDB deployments that are on version 0.10.0 or later. Use the Java client to: - [Receive query results one row at a time (streamQuery())](#ksqldb-developer-guide-java-client-streamquery) - [Receive query results in a single batch (executeQuery())](#ksqldb-developer-guide-java-client-executequery) - [Terminate a push query (terminatePushQuery())](#ksqldb-developer-guide-java-client-terminatepushquery) - [Insert a new row into a stream (insertInto())](#ksqldb-developer-guide-java-client-insertinto) - [Insert new rows in a streaming fashion (streamInserts())](#ksqldb-developer-guide-java-client-streaminserts) - [Create and manage new streams, tables, and persistent queries (executeStatement())](#ksqldb-developer-guide-java-client-executestatement) - [List streams, tables, topics, and queries](#ksqldb-developer-guide-java-client-admin-operations) - [Describe specific streams and tables](#ksqldb-developer-guide-java-client-describe-source) - [Get metadata about the ksqlDB cluster](#ksqldb-developer-guide-java-client-server-info) - [Manage, list and describe connectors](#ksqldb-developer-guide-java-client-connector-operations) - [Define variables for substitution](#ksqldb-developer-guide-java-client-variable-substitution) - [Execute Direct HTTP Requests](#ksqldb-developer-guide-java-client-direct-http-requests) - [Assert the existence of a topic or schema](#ksqldb-developer-guide-java-client-assert-topics-schemas) Get started below or skip to the end for full [examples](#ksqldb-developer-guide-java-client-tutorials).

#### IMPORTANT

- You can revert a promoted or failed-over topic back into a mirror topic by using `truncate-and-restore`. This command is available only on [“bidirectional” links](#bidirectional-linking-cp), and only in KRaft mode. To learn more about running Kafka in KRaft mode, see [KRaft Overview for Confluent Platform](../../kafka-metadata/kraft.md#kraft-overview), [KRaft Configuration for Confluent Platform](../../kafka-metadata/config-kraft.md#configure-kraft), and the [Platform Quick Start](../../get-started/platform-quickstart.md#cp-quickstart-step-1). Also, the [basic Cluster Linking tutorial](topic-data-sharing.md#tutorial-topic-data-sharing) includes a full walkthrough of how to run Cluster Linking in KRaft mode.
- You can run `mirror describe` (`confluent kafka mirror describe --link `) on a promoted or failed-over mirror topic, if you do not delete the cluster link. If you delete the cluster link, you will lose the history and, therefore, `mirror describe` will not find data on promoted or failed-over topics. (See the command sketch after the related content below.) - There is no way to change a mirror topic to use a different cluster link or make changes to the link itself, other than to recreate the mirror topic on a different link. - You cannot delete a cluster link that still has mirror topics on it (the delete operation will fail). - If you are using Confluent for Kubernetes (CFK), and you delete your cluster link resource, any mirror topics still attached to that cluster link will be forcibly converted to regular topics by use of the `failover` API. To learn more, see [Modify a mirror topic](https://docs.confluent.io/operator/current/co-link-clusters.html#modify-a-mirror-topic) in [Cluster Linking using Confluent for Kubernetes](https://docs.confluent.io/operator/current/co-link-clusters.html).

## Related Content

Schema Linking is the recommended way to migrate schemas on Confluent Platform 7.0.0 or newer releases. [Schema Linking on Confluent Platform](../schema-linking-cp.md#schema-linking-cp-overview) These more general topics are helpful for understanding how Schema Registry and schemas are managed on multi data center deployments. - [Confluent Cloud](/cloud/current/index.html) - [Replicator Schema Translation Example for Confluent Platform](../../multi-dc-deployments/replicator/replicator-schema-translation.md#quickstart-demos-replicator-schema-translation) - [Quick Start for Schema Management on Confluent Cloud](/cloud/current/get-started/schema-registry.html) - [Schema Registry Configuration Reference for Confluent Platform](config.md#schemaregistry-config) - [Overview of Multi-Datacenter Deployment Solutions on Confluent Platform](../../multi-dc-deployments/index.md#multi-dc) - [Schema Registry API Reference](../develop/api.md#schemaregistry-api) To learn more, see these sections in [Replicator Configuration Reference for Confluent Platform](../../multi-dc-deployments/replicator/configuration_options.md#replicator-config-options): - [Source Topics](../../multi-dc-deployments/replicator/configuration_options.md#rep-source-topics) - [Destination Topics](../../multi-dc-deployments/replicator/configuration_options.md#rep-destination-topics) - [Schema Translation](../../multi-dc-deployments/replicator/configuration_options.md#schema-translation) Finally, [Replicator Schema Translation Example for Confluent Platform](../../multi-dc-deployments/replicator/replicator-schema-translation.md#quickstart-demos-replicator-schema-translation) shows a demo of migrating schemas across self-managed, on-premises clusters, using the legacy Replicator methods.
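To make the mirror topic notes above concrete, the following is a minimal command sketch using the Confluent CLI. The cluster link name `my-link` and mirror topic name `orders` are placeholders, and the exact connection flags depend on how your CLI context is configured for the Confluent Platform cluster.

```bash
# List the mirror topics attached to a cluster link (placeholder link name).
confluent kafka mirror list --link my-link

# Inspect a single mirror topic; this continues to return history after a
# promote or failover, as long as the cluster link itself is not deleted.
confluent kafka mirror describe orders --link my-link
```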
# Authorization in Confluent Platform

* [Overview](overview.md) * [Access Control Lists](acls/index.md) * [Overview](acls/overview.md) * [Manage ACLs](acls/manage-acls.md) * [Role-Based Access Control](rbac/index.md) * [Overview](rbac/overview.md) * [Quick Start](rbac/rbac-cli-quickstart.md) * [Predefined RBAC Roles](rbac/rbac-predefined-roles.md) * [Cluster Identifiers](rbac/rbac-get-cluster-ids.md) * [Example of Enabling RBAC](rbac/cp-rbac-example.md) * [Enable RBAC on Running Cluster](rbac/enable-rbac-running-cluster.md) * [Use mTLS with RBAC](rbac/mtls-rbac.md) * [Configure mTLS with RBAC](rbac/configure-mtls-rbac.md) * [Deployment Patterns for mTLS with RBAC](rbac/mtls-rbac-options.md) * [Client Flow for OAuth-OIDC using RBAC](rbac/client-flow-oauth-oidc-and-rbac.md) * [Migrate LDAP to OAuth for RBAC](rbac/migrate-ldap-to-oauth-for-rbac.md) * [Migrate LDAP to mTLS for RBAC](rbac/migrate-ldap-to-mtls.md) * [RBAC using REST API](rbac/rbac-config-using-rest-api.md) * [Use Centralized ACLs with MDS for Authorization](rbac/authorization-acl-with-mds.md) * [Request Forwarding with mTLS RBAC](rbac/request-forwarding-mtls-rbac.md) * [Deploy Secure ksqlDB with RBAC](rbac/ksql-rbac.md) * [Metadata API](rbac/mds-api.md) * [LDAP Group-Based Authorization](ldap/index.md) * [Configure LDAP Group-Based Authorization](ldap/configure.md) * [LDAP Configuration Reference](ldap/ldap-config-ref.md) * [Tutorial: Group-Based Authorization Using LDAP](ldap/quickstart.md) * [Configure Confluent Server Authorizer in Confluent Platform](../csa-introduction.md)

## Kafka 101

Kafka Streams is, by deliberate design, tightly integrated with Apache Kafka®: many capabilities of Kafka Streams such as its [stateful processing features](architecture.md#streams-architecture-state), its [fault tolerance](architecture.md#streams-architecture-fault-tolerance), and its [processing guarantees](#streams-concepts-processing-guarantees) are built on top of functionality provided by Apache Kafka®’s storage and messaging layer. It is therefore important to familiarize yourself with the key concepts of Kafka, notably the sections [Getting Started](/kafka/get-started.html) and [Design](/kafka/design/index.html). In particular you should understand: * **The who’s who:** Kafka distinguishes **producers**, **consumers**, and **brokers**. In short, producers publish data to Kafka brokers, and consumers read published data from Kafka brokers. Producers and consumers are totally decoupled, and both run outside the Kafka brokers in the perimeter of a Kafka cluster. A Kafka **cluster** consists of one or more brokers. An application that uses the Kafka Streams API acts as both a producer and a consumer. * **The data:** Data is stored in **topics**. The topic is the most important abstraction provided by Kafka: it is a category or feed name to which data is published by producers. Every topic in Kafka is split into one or more **partitions**. Kafka partitions data for storing, transporting, and replicating it. Kafka Streams partitions data for processing it. In both cases, this partitioning enables elasticity, scalability, high performance, and fault tolerance. * **Parallelism:** Partitions of Kafka topics, and especially their number for a given topic, are also the main factor that determines the parallelism of Kafka with regards to reading and writing data. Because of the tight integration with Kafka, the parallelism of an application that uses the Kafka Streams API depends primarily on Kafka’s parallelism.
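As a small illustration of the parallelism point above, the partition count you choose when creating an input topic effectively caps how many stream tasks can process that topic in parallel. The following sketch uses the standard `kafka-topics` tool with placeholder values (a local broker on `localhost:9092` and a topic named `orders`); adjust both for your environment.

```bash
# Create an input topic with 4 partitions; a Kafka Streams application reading
# this topic can run at most 4 stream tasks for it in parallel.
kafka-topics --bootstrap-server localhost:9092 \
  --create --topic orders \
  --partitions 4 \
  --replication-factor 1

# Confirm the partition count.
kafka-topics --bootstrap-server localhost:9092 --describe --topic orders
```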
## A Closer Look

Before you dive into the [Concepts](concepts.md#streams-concepts) and [Architecture](architecture.md#streams-architecture), or get your feet wet by walking through [your first Kafka Streams application](https://developer.confluent.io/tutorials/creating-first-apache-kafka-streams-application/confluent.html), let’s take a closer look. A key motivation of the Kafka Streams API is to bring stream processing out of the Big Data niche into the world of mainstream application development, and to radically improve the developer and operations experience by [making stream processing simple and easy](http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple). Using the Kafka Streams API you can implement standard Java applications to solve your stream processing needs – whether at small or at large scale – and then run these applications on client machines at the perimeter of your Kafka cluster. Your applications are fully elastic: you can run one or more instances of your application, and they will automatically discover each other and collaboratively process the data. Your applications are also fault-tolerant: if one of the instances dies, then the remaining instances will automatically take over its work – without any data loss! Deployment-wise, you are free to choose from any technology that can deploy Java applications, including but not limited to Puppet, Chef, Ansible, Docker, Mesos, YARN, Kubernetes, and so on. This lightweight and integrative approach of the Kafka Streams API – “Build applications, not infrastructure!” – is in stark contrast to other stream processing tools that require you to install and operate separate processing clusters and similar heavy-weight infrastructure that come with their own special set of rules on how to use and interact with them. The following list highlights [several key capabilities and aspects](architecture.md#streams-architecture) of the Kafka Streams API that make it a compelling choice for use cases such as microservices, event-driven systems, reactive applications, and continuous queries and transformations. **Powerful** : * Makes your applications highly scalable, elastic, distributed, fault-tolerant * Supports exactly-once processing semantics * Stateful and stateless processing * Event-time processing with windowing, joins, aggregations * Supports [Kafka Streams Interactive Queries for Confluent Platform](developer-guide/interactive-queries.md#streams-developer-guide-interactive-queries) to unify the worlds of streams and databases * Choose between a [declarative, functional API](developer-guide/dsl-api.md#streams-developer-guide-dsl) and a lower-level [imperative API](developer-guide/processor-api.md#streams-developer-guide-processor-api) for maximum control and flexibility **Lightweight** : * Low barrier to entry * Equally viable for small, medium, large, and very large use cases * Smooth path from local development to large-scale production * No processing cluster required * No external dependencies other than Kafka **Fully integrated** : * 100% compatible with Kafka 0.11.0 and 1.0.0 * Easy to integrate into existing applications and microservices * No artificial rules for packaging, deploying, and monitoring your applications * Runs everywhere: on-premises, public clouds, private clouds, containers, etc.
* Integrates with databases through continuous change data capture (CDC) performed by [Kafka Connect](../connect/index.md#kafka-connect) **Real-time** : * Millisecond processing latency * Record-at-a-time processing (no micro-batching) * Seamlessly handles out-of-order data * High throughput **Secure** : * Supports [encryption of data-in-transit](developer-guide/security.md#streams-developer-guide-security) * Supports [authentication and authorization](developer-guide/security.md#streams-developer-guide-security) In summary, the Kafka Streams API is a compelling choice for building mission-critical stream processing applications and microservices. Give it a try with this step-by-step tutorial to build your first [Kafka Streams application](https://developer.confluent.io/tutorials/creating-first-apache-kafka-streams-application/confluent.html)! The next sections, [Concepts](concepts.md#streams-concepts), [Architecture](architecture.md#streams-architecture), and the [Developer Guide](developer-guide/overview.md#streams-developer-guide) will help to get you started.

### Configure passwordless OAuth authentication

Starting with version 8.0, Confluent Ansible supports client assertion for Confluent Platform, which provides secure credential management with passwordless authentication. It uses asymmetric encryption-based authentication, extending Confluent Platform OAuth, and allows you to: * Avoid deploying usernames and passwords while securing Confluent Platform. * Streamline and automate periodic client credential rotation for client applications without manual intervention. In Confluent Ansible 8.0, OAuth client assertion is not supported for Confluent Control Center. To configure client assertion on Confluent Platform components: 1. Enable client assertion for Confluent Platform components using the following variables: Kafka broker (`kafka_broker_`) and KRaft controller (`kafka_controller_`) inherit the superuser properties (`oauth_superuser_`) if not set. ```yaml
oauth_superuser_oauth_client_assertion_enabled: true
kafka_broker_oauth_client_assertion_enabled: true
kafka_controller_oauth_client_assertion_enabled: true
schema_registry_oauth_client_assertion_enabled: true
kafka_connect_oauth_client_assertion_enabled: true
ksql_oauth_client_assertion_enable: true
kafka_rest_oauth_client_assertion_enable: true
kafka_connect_replicator_oauth_client_assertion_enable: true
kafka_connect_replicator_producer_oauth_client_assertion_enable: true
kafka_connect_replicator_erp_oauth_client_assertion_enable: true
kafka_connect_replicator_consumer_erp_oauth_client_assertion_enable: true
``` 2. Set other dependent variables listed below. Refer to the previous step for ``. ```yaml
_oauth_user:                                # client ID, currently in use
_oauth_client_assertion_issuer:
_oauth_client_assertion_sub:
_oauth_client_assertion_audience:
_oauth_client_assertion_private_key_file:
_oauth_client_assertion_template_file:      # optional
_client_assertion_private_key_passphrase:   # optional
_oauth_client_assertion_jti_include:        # optional
_oauth_client_assertion_nbf_include:        # optional
``` Example configurations: ```yaml
ksql_oauth_client_assertion_enabled: true
ksql_oauth_client_assertion_issuer: ksql
ksql_oauth_client_assertion_audience: https://oauth1:8443/realms/cp-ansible-realm
ksql_oauth_client_assertion_private_key_file: "my-tokenKeypair.pem"
``` Currently, there is no first-class support for the properties listed below, which are optional fields in OAuth and also in client assertion.
You can set them using custom properties, `_custom_properties`. ```yaml
kafka_broker_custom_properties:
  *.login.connect.timeout.ms
  *.login.read.timeout.ms
  *.login.retry.backoff.max.ms
  *.login.retry.backoff.ms
```

## Cluster upgrades and error handling

Confluent Cloud regularly updates clusters to perform [upgrades and maintenance](../release-notes/upgrade-policy.md#minor-ccloud-upgrade). During this process, Confluent performs [rolling restarts](../_glossary.md#term-rolling-restart) of all the brokers in a cluster. The Kafka protocol and architecture are designed for this type of highly-available, fault-tolerant operation. To ensure seamless client handling of cluster updates, you must configure your clients using [current client libraries](overview.md#client-support-matrix). Confluent recommends you use the strategies for error handling outlined below. During normal cluster operations that use a rolling restart, clients may encounter the following warning exceptions: ```none UNKNOWN_TOPIC_OR_PARTITION: "This server does not host this topic-partition." ``` ```none LEADER_NOT_AVAILABLE: "There is no leader for this topic-partition as we are in the middle of a leadership election." ``` ```none NOT_COORDINATOR: "This is not the correct coordinator." ``` ```none NOT_ENOUGH_REPLICAS: "Messages are rejected since there are fewer in-sync replicas than required." ``` ```none NOT_ENOUGH_REPLICAS_AFTER_APPEND: "Messages are written to the log, but to fewer in-sync replicas than required." ``` ```none NOT_LEADER_OR_FOLLOWER: "This server is not the leader for the given partition." ``` The following message is what a client would log at `WARN` level should it attempt to connect to a broker that has just been restarted (in the context of maintenance): ```none "Connection to node {} ({}) terminated during authentication. This may happen due to any of the following reasons: (1) Authentication failed due to invalid credentials with brokers older than 1.0.0, (2) Firewall blocking Kafka TLS traffic (eg it may only allow HTTPS traffic), (3) Transient network issue." ``` Configure clients with a sufficient number of retries or retry time to prevent these warning exceptions from getting logged as errors. * By default, Kafka producer clients retry for two minutes, print these warnings to logs, and recover without any intervention. * By default, Kafka consumer and admin clients retry for one minute. Timeout exceptions will occur if clients run out of memory buffer space while retrying or if clients run out of time while waiting for memory. In general, planning for volatility is a basic tenet of building cloud-native client applications. In addition to normal cluster operations, brokers may disappear for a variety of reasons, such as issues with the underlying infrastructure at the cloud-provider layer. For more information, see [Cloud-native applications](architecture.md#ccloud-architecture-cloud-native-apps).

## Flags

```none
--hosts strings                    REQUIRED: A comma-separated list of hosts.
--protocol string                  REQUIRED: Security protocol.
--cluster-name string              REQUIRED: Cluster name.
--kafka-cluster string             Kafka cluster ID.
--schema-registry-cluster string   Schema Registry cluster ID.
--ksql-cluster string              ksqlDB cluster ID.
--connect-cluster string           Kafka Connect cluster ID.
--cmf string                       Confluent Managed Flink (CMF) ID.
--flink-environment string         Flink environment ID.
--client-cert-path string          Path to client cert to be verified by MDS. Include for mTLS authentication.
--client-key-path string           Path to client private key, include for mTLS authentication.
--context string                   CLI context name.
```

### Kafka Broker

Create a new file and put the `KafkaServer` configuration into it. The `KafkaServer` section is for the authentication on brokers. For this example, create it at `/tmp/kafka_server_jaas.conf`. ```bash
KafkaServer {
   org.apache.kafka.common.security.plain.PlainLoginModule required
   username="admin"
   password="admin-secret"
   user_admin="admin-secret"
   user_confluent="confluent-secret"
   user_metricsreporter="metricsreporter-secret";
};

KafkaClient {
   org.apache.kafka.common.security.plain.PlainLoginModule required
   username="metricsreporter"
   password="metricsreporter-secret";
};
``` This configures several users on the server: - an `admin` user for internal interbroker traffic - a `confluent` user, for Confluent Control Center, Kafka Connect, and Schema Registry - a `metricsreporter` user for Metrics Reporter to publish Apache Kafka® metrics In this example, Metrics Reporter publishes metrics to the same cluster it is configured on, so we also need to include the corresponding `KafkaClient` client configuration in the same file. It is possible to pass the JAAS configuration file location as a JVM parameter to each client JVM: ```bash -Djava.security.auth.login.config=/tmp/kafka_server_jaas.conf ``` Next, secure the Kafka broker, the monitoring interceptor and the metrics reporter. There are [more options for security](/platform/current/security/overview.html#security), but this broker will be secured using `SASL_PLAINTEXT`.

### Schema Registry Configuration

If you followed the quick start, Connect relies on Schema Registry, so we first need to update Schema Registry to use SASL authentication. Edit the Schema Registry configuration (`CONFLUENT_HOME/etc/schema-registry/schema-registry.properties`) and add the following settings. ```bash
kafkastore.security.protocol=SASL_PLAINTEXT
kafkastore.sasl.mechanism=PLAIN
``` Start Schema Registry with the additional `SCHEMA_REGISTRY_OPTS` parameter pointing to the JAAS file [created earlier](#controlcenter-security-kafkaclient). ```bash
SCHEMA_REGISTRY_OPTS=-Djava.security.auth.login.config=/tmp/kafka_client_jaas.conf \
  confluent local services schema-registry start
```

### Transactional producer and exactly-once semantics

The JavaScript Client library supports idempotent producers, transactional producers, and exactly-once semantics (EOS). To use an idempotent producer: ```js
const producer = new Kafka().producer({
  'bootstrap.servers': '',
  'enable.idempotence': true,
});
``` More details about the guarantees provided by an idempotent producer can be found [here](https://github.com/confluentinc/librdkafka/blob/master/INTRODUCTION.md#idempotent-producer), as well as the limitations and other configuration changes that an idempotent producer brings. To use a transactional producer: ```js
const producer = new Kafka().producer({
  'bootstrap.servers': '',
  'transactional.id': 'my-transactional-id', // Must be unique for each producer instance.
});

await producer.connect();

// Start transaction.
const transaction = await producer.transaction();
await transaction.send({ topic: 'topic', messages: [{ value: 'message' }] });
// Commit transaction.
await transaction.commit();
``` Specifying a `transactional.id` makes the producer transactional. The `transactional.id` must be unique for each producer instance. The producer must be connected before starting a transaction with `transaction()`.
A transactional producer cannot be used as a non-transactional producer, and every message must be within a transaction. More details about the guarantees provided by a transactional producer can be found [here](https://github.com/confluentinc/librdkafka/blob/master/INTRODUCTION.md#transactional-producer). Using a transactional producer also allows for exactly-once semantics (EOS) in the specific case of consuming from a Kafka cluster, processing the message, and producing to another topic on the same cluster. ```js
consumer.run({
  eachMessage: async ({ topic, partition, message }) => {
    let transaction;
    try {
      transaction = await producer.transaction();
      await transaction.send({
        topic: 'produceTopic',
        messages: [
          { value: 'consumed a message: ' + message.value.toString() },
        ]
      });
      await transaction.sendOffsets({
        consumer,
        topics: [
          {
            topic,
            partitions: [
              { partition, offset: String(Number(message.offset) + 1) },
            ],
          }
        ],
      });
      // The transaction assures that the message sent and the offset committed
      // are transactional, only reflecting on the broker on commit.
      await transaction.commit();
    } catch (e) {
      console.error(e);
      if (transaction) {
        await transaction.abort();
      }
    }
  },
});
```

## Quick Start

This quick start uses the ActiveMQ Sink Connector to consume records from Kafka and send them to an ActiveMQ broker. 1. [Install ActiveMQ](https://activemq.apache.org/getting-started#installation-procedure-for-unix) 2. [Start ActiveMQ](https://activemq.apache.org/getting-started#starting-activemq) 3. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). ```bash
# run from your Confluent Platform installation directory
confluent connect plugin install confluentinc/kafka-connect-activemq-sink:latest
``` 4. Start Confluent Platform. ```bash confluent local start ``` 5. [Produce](https://docs.confluent.io/current/cli/command-reference/confluent-produce.html) test data to the `sink-messages` topic in Kafka. ```bash seq 10 | confluent local produce sink-messages ``` 6. Create an `activemq-sink.json` file with the following contents: ```json { "name": "AMQSinkConnector", "config": { "connector.class": "io.confluent.connect.jms.ActiveMqSinkConnector", "tasks.max": "1", "topics": "sink-messages", "activemq.url": "tcp://localhost:61616", "activemq.username": "connectuser", "activemq.password": "connectuser", "jms.destination.type": "queue", "jms.destination.name": "connector-quickstart", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "org.apache.kafka.connect.storage.StringConverter", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` 7. Load the ActiveMQ Sink Connector. ```bash confluent local load jms --config activemq-sink.json ```

#### IMPORTANT

Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) in production environments. 8. Confirm that the connector is in a `RUNNING` state (a REST API alternative is sketched after this quick start). ```bash confluent local status AMQSinkConnector ``` 9. Navigate to the [ActiveMQ Admin UI](http://localhost:8161/admin) or use the following ActiveMQ CLI command to confirm the messages were delivered to the `connector-quickstart` queue. ```bash ./bin/activemq consumer --destination connector-quickstart --messageCount 10 ``` For an example of how to get Kafka Connect connected to [Confluent Cloud](/cloud/current/index.html), see [Connect Self-Managed Kafka Connect to Confluent Cloud](/cloud/current/cp-component/connect-cloud-config.html#distributed-cluster).
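As an alternative to the CLI status check in the quick start above, the Kafka Connect REST API exposes the same information. This is a minimal sketch assuming a Connect worker listening on the default `localhost:8083` and the connector name `AMQSinkConnector` from the example configuration.

```bash
# Query connector and task state directly from the Connect worker's REST API.
curl -s http://localhost:8083/connectors/AMQSinkConnector/status

# List all connectors registered on this worker.
curl -s http://localhost:8083/connectors
```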
### Start the BigQuery Sink connector

To start the BigQuery Sink Connector, complete the following steps: 1. Create the file `register-kcbd-connect-bigquery.json` to store the connector configuration. **Connect Distributed REST quick start connector properties:** ```json { "name": "kcbq-connect1", "config": { "connector.class": "com.wepay.kafka.connect.bigquery.BigQuerySinkConnector", "tasks.max" : "1", "topics" : "kcbq-quickstart1", "sanitizeTopics" : "true", "autoCreateTables" : "true", "allowNewBigQueryFields" : "true", "allowBigQueryRequiredFieldRelaxation" : "true", "schemaRetriever" : "com.wepay.kafka.connect.bigquery.retrieve.IdentitySchemaRetriever", "project" : "confluent-243016", "defaultDataset" : "ConfluentDataSet", "keyfile" : " /Users/titomccutcheon/dev/confluent_fork/kafka-connect-bigquery/kcbq-connector/quickstart/properties/confluent-243016-384a24e2de1a.json", "transforms" : "RegexTransformation", "transforms.RegexTransformation.type" : "org.apache.kafka.connect.transforms.RegexRouter", "transforms.RegexTransformation.regex" : "(kcbq_)(.*)", "transforms.RegexTransformation.replacement" : "$2" } } ``` Note that the `project` key is the `id` value of the BigQuery project in Google Cloud. For `datasets`, the value `ConfluentDataSet` is the ID of the dataset entered by the user during Google Cloud dataset creation. `keyfile` is the service account key JSON file location. If you don’t want this connector to create a BigQuery table automatically, create a BigQuery table with `Partitioning: Partition by ingestion time` and a proper schema. Also, note that the properties prefixed with `transforms` are used to set up SMTs. The following is an example regex router SMT that strips `kcbq_` from the topic name. Replace it with a relevant regex to replace the topic of each sink record with the destination dataset and table name in the format `:` or only the destination table name in the format ``. 2. Start the connector. ```text curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" http://localhost:8083/connectors/ -d @register-kcbd-connect-bigquery.json ```

### Install and load the connector

1. Install the connector through the [Confluent Hub Client](/kafka-connectors/self-managed/confluent-hub/client.html). ```bash
# run from your CP installation directory
confluent connect plugin install confluentinc/kafka-connect-gcp-bigtable:latest
``` Note that by default, it will install the plugin into `share/confluent-hub-components` and add the directory to the plugin path. 2. Adding a new connector plugin requires restarting Connect. Use the Confluent CLI to restart Kafka Connect. ```bash confluent local services connect stop && confluent local services connect start ``` 3. Configure your connector by adding the file `etc/kafka-connect-gcp-bigtable/sink-quickstart-bigtable.properties`, with the following properties: ```none
name=BigTableSinkConnector
topics=stats
tasks.max=1
connector.class=io.confluent.connect.gcp.bigtable.BigtableSinkConnector
gcp.bigtable.credentials.path=$home/bigtable-test-credentials.json
gcp.bigtable.project.id=YOUR-PROJECT-ID
gcp.bigtable.instance.id=test-instance
auto.create.tables=true
auto.create.column.families=true
table.name.format=example_table
# The following define the Confluent license stored in Kafka, so we need the Kafka bootstrap addresses.
# `replication.factor` may not be larger than the number of Kafka brokers in the destination cluster,
# so here we set this to '1' for demonstration purposes.
# Always use at least '3' in production configurations.
confluent.license=
confluent.topic.bootstrap.servers=localhost:9092
confluent.topic.replication.factor=1
``` Ensure you replace `YOUR-PROJECT-ID` with the project ID you created in the prerequisite portion of this quick start. You should also replace `$home` with your home directory path, or any other path where the credentials file was saved. 4. Start the BigTable Sink connector by loading the connector’s configuration using the following command: ```bash confluent local load bigtable --config etc/kafka-connect-gcp-bigtable/sink-quickstart-bigtable.properties ``` Your output should resemble the following: ```json { "name": "bigtable", "config": { "topics": "stats", "tasks.max": "1", "connector.class": "io.confluent.connect.gcp.bigtable.BigtableSinkConnector", "gcp.bigtable.credentials.path": "$home/bigtable-test-credentials.json", "gcp.bigtable.instance.id": "test-instance", "gcp.bigtable.project.id": "YOUR-PROJECT-ID", "auto.create.tables": "true", "auto.create.column.families": "true", "table.name.format": "example_table", "confluent.license": "", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "name": "bigtable" }, "tasks": [ { "connector": "bigtable", "task": 0 } ], "type": "sink" } ``` 5. Check the status of the connector to confirm it’s in a `RUNNING` state. ```bash confluent local status bigtable ``` Your output should resemble the following: ```bash { "name": "bigtable", "connector": { "state": "RUNNING", "worker_id": "10.200.7.192:8083" }, "tasks": [ { "id": 0, "state": "RUNNING", "worker_id": "10.200.7.192:8083" } ], "type": "sink" } ```

## Quick Start

In the following example, the GCS Source connector reads all data listed under a specific GCS bucket and then loads them into a Kafka topic. It does not matter what file naming convention you use for writing data to the GCS bucket. 1. Upload the following data under a folder named `quickstart` within the targeted GCS bucket. In this example, JSON format is used, which supports line-delimited JSON, concatenated JSON, and a JSON array of records. ```json {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} {"f1": "value4"} {"f1": "value5"} {"f1": "value6"} {"f1": "value7"} {"f1": "value8"} {"f1": "value9"} ``` 2. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html) by running the following command from your Confluent Platform installation directory: ```bash confluent connect plugin install confluentinc/kafka-connect-gcs-source:latest ``` 3. Create a `quickstart-gcssource-generalized.properties` file with the following contents: ```json { "name": "quickstart-gcs-source", "config": { "connector.class": "io.confluent.connect.gcs.GcsSourceConnector", "tasks.max": "1", "value.converter": "org.apache.kafka.connect.json.JsonConverter", "mode": "GENERIC", "topics.dir": "quickstart", "topic.regex.list": "quick-start-topic:.*", "format.class": "io.confluent.connect.gcs.format.json.JsonFormat", "gcs.bucket.name": "", "gcs.credentials.path": "
", "value.converter.schemas.enable": "false" } } ``` 4. Load the Generalized GCS Source connector by running the following command: ```bash confluent local services connect connector load quickstart-gcs-source --config quickstart-gcssource-generalized.properties ``` 5. Verify the connector is in a `RUNNING` state: ```bash confluent local services connect connector status quickstart-gcs-source ``` 6. Verify the messages are being sent to Kafka: ```bash kafka-console-consumer \ --bootstrap-server localhost:9092 \ --topic quick-start-topic \ --from-beginning ``` 7. You should see output similar to the following: ```json {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} {"f1": "value4"} {"f1": "value5"} {"f1": "value6"} {"f1": "value7"} {"f1": "value8"} {"f1": "value9"} ``` ## ActveMQ Quick Start For an example of how to get Kafka Connect connected to [Confluent Cloud](/cloud/current/index.html), see [Connect Self-Managed Kafka Connect to Confluent Cloud](/cloud/current/cp-component/connect-cloud-config.html#distributed-cluster). This quick start uses the JMS Sink connector to consume records from Kafka and send them to an ActiveMQ broker. Prerequisites : - [Confluent Platform](/platform/2.1/installation/index.html) - [Confluent CLI](/confluent-cli/current/installing.html) (requires separate installation) 1. [Install ActiveMQ](https://activemq.apache.org/getting-started#installation-procedure-for-unix) 2. [Start ActiveMQ](https://activemq.apache.org/getting-started#starting-activemq) 3. Install the connector by using the following [CLI command](/confluent-cli/current/command-reference/connect/plugin/confluent_connect_plugin_install.html): ```bash # run from your Confluent Platform installation directory confluent connect plugin install confluentinc/kafka-connect-jms-sink:latest ``` 4. [Download the activemq-all JAR](https://repo1.maven.org/maven2/org/apache/activemq/activemq-all/5.15.4/activemq-all-5.15.4.jar) and copy it into the JMS Sink connector’s plugin folder. This needs to be done on every Connect worker node and you must restart the workers pick up the client JAR. 5. Start Confluent Platform using the [confluent local](/confluent-cli/current/command-reference/local/index.html) command. ```bash confluent local start ``` 6. [Produce](https://docs.confluent.io/current/cli/command-reference/confluent-produce.html) test data to the `jms-messages` topic in Kafka. ```bash seq 10 | confluent local produce jms-messages ``` 7. Create a `jms-sink.json` file with the following contents: ```json { "name": "JmsSinkConnector", "config": { "connector.class": "io.confluent.connect.jms.JmsSinkConnector", "tasks.max": "1", "topics": "jms-messages", "java.naming.factory.initial": "org.apache.activemq.jndi.ActiveMQInitialContextFactory", "java.naming.provider.url": "tcp://localhost:61616", "java.naming.security.principal": "connectuser", "java.naming.security.credentials": "connectpassword", "connection.factory.name": "connectionFactory", "jms.destination.type": "queue", "jms.destination.name": "connector-quickstart", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "org.apache.kafka.connect.storage.StringConverter", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` 8. Load the JMS Sink connector. ```bash confluent local load jms --config jms-sink.json ``` #### IMPORTANT Don’t use the [Confluent CLI](/confluent-cli/current/index.html) in production environments. 9. 
Confirm that the connector is in a `RUNNING` state. ```bash confluent local status jms ``` 10. Navigate to the [ActiveMQ Admin UI](http://localhost:8161/admin) to confirm the messages were delivered to the `connector-quickstart` queue.

## TIBCO EMS Quick Start

This quick start uses the JMS Sink connector to consume records from Kafka and send them to TIBCO Enterprise Message Service - Community Edition. 1. Download and unzip [TIBCO EMS Community Edition](https://www.tibco.com/resources/product-download/tibco-enterprise-message-service-community-edition-free-download-mac). 2. Run the `TIBCOUniversalInstaller-mac.command` and step through the TIBCO Universal Installer. 3. Start TIBCO EMS with default configurations. ```bash ~/TIBCO_HOME/ems/8.4/bin/tibemsd ``` 4. Install the connector through the [Confluent Hub Client](/kafka-connectors/self-managed/confluent-hub/client.html). ```bash
# run from your Confluent Platform installation directory
confluent connect plugin install confluentinc/kafka-connect-jms-sink:latest
``` 5. Copy `~/TIBCO_HOME/ems/8.4/lib/tibjms.jar` into the JMS Sink connector’s plugin folder. This needs to be done on every Connect worker node and the workers must be restarted to pick up the client jar. 6. Start Confluent Platform. ```bash confluent local start ``` 7. [Produce](https://docs.confluent.io/current/cli/command-reference/confluent-produce.html) test data to the `jms-messages` topic in Kafka. ```bash seq 10 | confluent local produce jms-messages ``` 8. Create a `jms-sink.json` file with the following contents: ```json { "name": "JmsSinkConnector", "config": { "connector.class": "io.confluent.connect.jms.JmsSinkConnector", "tasks.max": "1", "topics": "jms-messages", "java.naming.provider.url": "tibjmsnaming://localhost:7222", "java.naming.factory.initial": "com.tibco.tibjms.naming.TibjmsInitialContextFactory", "connection.factory.name": "QueueConnectionFactory", "java.naming.security.principal": "admin", "java.naming.security.credentials": "", "jms.destination.type": "queue", "jms.destination.name": "connector-quickstart", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "org.apache.kafka.connect.storage.StringConverter", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` 9. Load the JMS Sink connector. ```bash confluent local load jms --config jms-sink.json ```

#### IMPORTANT

Don’t use the [Confluent CLI](/confluent-cli/current/index.html) in production environments. 10. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status jms ``` 11. Validate that there are messages on the queue using the `tibemsadmin` tool. ```bash
~/TIBCO_HOME/ems/8.4/bin/tibemsadmin -server "tcp://localhost:7222" -user admin
# admin password is blank by default
tcp://localhost:7222> show queue connector-quickstart
```

### Connect to IBM MQ using LDAP

The [IBM MQ Source connector](https://docs.confluent.io/kafka-connect-ibmmq-source/current/) is available for download from Confluent Hub. If possible, you should use the IBM MQ Source connector instead of the general JMS connector. However, you may want to use the more general connector if you are required to connect to IBM MQ using LDAP, or any other JNDI mechanism. To get started, you must install the latest IBM MQ JMS client libraries into the same directory where this connector is installed.
For more details, see the [IBM MQ installation](https://www.ibm.com/docs/en/ibm-mq/8.0?topic=mq-installing-uninstalling) documentation. Next, create a connector configuration for your environment using the appropriate configuration properties. The following example shows a typical but incomplete configuration of the connector for use with [distributed mode](/platform/current/connect/concepts.html#distributed-workers). ```json { "name": "connector1", "config": { "connector.class": "io.confluent.connect.jms.JmsSourceConnector", "kafka.topic":"MyKafkaTopicName", "jms.destination.name":"MyQueueName", "jms.destination.type":"queue", "java.naming.factory.initial":"com.sun.jndi.ldap.LdapCtxFactory", "java.naming.provider.url":"ldap://", "java.naming.security.principal":"MyUserName", "java.naming.security.credentials":"MyPassword", "confluent.license":"", "confluent.topic.bootstrap.servers":"localhost:9092" } } ``` Note that any extra properties defined on the connector will be passed into the JNDI InitialContext. This makes it easy to pass any IBM MQ-specific settings used for connecting to the IBM MQ broker. Finally, deploy your connector by posting it to a Kafka Connect distributed worker. ## Quick Start In the following example, the Generalized S3 Source connector reads all data listed under a specific S3 bucket and then loads it into a Kafka topic. You may use any file naming convention when writing data to the S3 bucket. 1. Upload the following data under a folder named `quickstart` within the targeted S3 bucket. In this example, JSON format is used, which supports the following: line-delimited JSON, concatenated JSON, and a JSON array of records. ```json {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} {"f1": "value4"} {"f1": "value5"} {"f1": "value6"} {"f1": "value7"} {"f1": "value8"} {"f1": "value9"} ``` 2. Install the connector by running the following command from your Confluent Platform installation directory: ```bash confluent connect plugin install confluentinc/kafka-connect-s3-source:latest ``` 3. Create a `quickstart-s3source-generalized.properties` file with the following contents: ```properties name=quick-start-s3-source connector.class=io.confluent.connect.s3.source.S3SourceConnector tasks.max=1 value.converter=org.apache.kafka.connect.json.JsonConverter mode=GENERIC topics.dir=quickstart format.class=io.confluent.connect.s3.format.json.JsonFormat topic.regex.list=quick-start-topic:.* s3.bucket.name=healthcorporation value.converter.schemas.enable=false ``` #### NOTE For more information about accepted regular expressions, see [Google RE2 syntax](https://github.com/google/re2/wiki/Syntax/). 4. Load the Generalized S3 Source connector. ```bash confluent local services connect connector load quick-start-s3-source --config quickstart-s3source-generalized.properties ``` 5. Confirm the connector is in a `RUNNING` state: ```bash confluent local services connect connector status quick-start-s3-source ``` 6. Confirm that the messages are being sent to Kafka. ```bash kafka-console-consumer \ --bootstrap-server localhost:9092 \ --topic quick-start-topic \ --from-beginning ``` 7. 
The response should be 9 records as shown in the following example: ```bash {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} {"f1": "value4"} {"f1": "value5"} {"f1": "value6"} {"f1": "value7"} {"f1": "value8"} {"f1": "value9"} ``` ## Quick Start Prerequisites : - [Confluent Platform](/platform/current/installation/index.html) - [Confluent CLI](https://docs.confluent.io/confluent-cli/current/installing.html) (requires separate installation) 1. Install the connector: ```none confluent connect plugin install confluentinc/kafka-connect-syslog:latest ``` 2. Start Confluent Platform using the Confluent CLI [confluent local](https://docs.confluent.io/confluent-cli/current/command-reference/local/index.html) commands. ```bash confluent local services connect start ``` 3. Create a config file with the following contents: ```none name=syslog-tcp tasks.max=1 connector.class=io.confluent.connect.syslog.SyslogSourceConnector syslog.port=5454 syslog.listener=TCP confluent.license= confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 ``` 4. Load the Syslog Connector. ```bash confluent local load syslog-tcp --config path/to/config.properties ``` #### IMPORTANT Don’t use the [confluent local](https://docs.confluent.io/confluent-cli/current/command-reference/local/index.html) commands in production environments. Always run the Syslog connector in standalone mode, for example, with `bin/connect-standalone`. 5. Test with the sample syslog-formatted message sent using `netcat`: ```none echo "<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - Your refrigerator is running" | nc -v -w 0 localhost 5454 ``` 6. Confirm that the message is logged to Apache Kafka®: ```none kafka-avro-console-consumer \ --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 \ --topic syslog --from-beginning | jq '.' ``` ## Quick start This quick start uses the TIBCO Source connector to consume records from TIBCO Enterprise Message Service™ - Community Edition and sends them to Kafka. 1. Download TIBCO Enterprise Message Service™ - Community Edition ([Mac](https://www.tibco.com/resources/product-download/tibco-enterprise-message-service-community-edition-free-download-mac) or [Linux](https://www.tibco.com/resources/product-download/tibco-enterprise-message-service-community-edition-free-download-linux)) and run the appropriate installer. For more details, see the [TIBCO Enterprise Message Service™ Installation Guide](https://docs.tibco.com/pub/ems-zlinux/8.5.0/doc/pdf/TIB_ems_8.5_installation.pdf). Similar documentation is available for each version of TIBCO EMS. 2. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). ```bash # run from your Confluent Platform installation directory confluent connect plugin install confluentinc/kafka-connect-tibco-source:latest ``` 3. [Install the TIBCO JMS Client Library](#installing-tibco-client-lib). 4. Start Confluent Platform. ```bash confluent local start ``` 5. Create a `connector-quickstart` queue with the TIBCO Admin Tool. ```bash # connect to TIBCO with the Admin Tool (PASSWORD IS EMPTY) tibco/ems/8.4/bin/tibemsadmin -server "tcp://localhost:7222" -user admin > create queue connector-quickstart ``` 6. Compile the TIBCO Java samples so that they can be run in the following step. 
```bash # setup Java's classpath so that the Java compiler can find the imports of the samples cd tibco/ems/8.4/samples/java export TIBEMS_JAVA=tibco/ems/8.4/lib CLASSPATH=${TIBEMS_JAVA}/jms-2.0.jar:${CLASSPATH} CLASSPATH=.:${TIBEMS_JAVA}/tibjms.jar:${TIBEMS_JAVA}/tibjmsadmin.jar:${CLASSPATH} export CLASSPATH # compile the java classes (run from the tibco/ems/8.4/samples/java directory) javac *.java ``` 7. Produce a set of messages to the `connector-quickstart` queue. ```bash cd tibco/ems/8.4/samples/java # produce 5 test messages java tibjmsMsgProducer -user admin -queue connector-quickstart m1 m2 m3 m4 m5 tibjmsMsgProducer SAMPLE Server....................... localhost User......................... admin Destination.................. connector-quickstart Send Asynchronously.......... false Message Text................. m1 m2 m3 m4 m5 Publishing to destination 'connector-quickstart' Published message: m1 Published message: m2 Published message: m3 Published message: m4 Published message: m5 ``` 8. Create a `tibco-source.json` file with the following contents: ```json { "name": "TibcoSourceConnector", "config": { "connector.class": "io.confluent.connect.tibco.TibcoSourceConnector", "tasks.max": "1", "kafka.topic": "from-tibco-messages", "tibco.url": "tcp://localhost:7222", "tibco.username": "admin", "tibco.password": "", "jms.destination.type": "queue", "jms.destination.name": "connector-quickstart", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "org.apache.kafka.connect.storage.StringConverter", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` 9. Load the TIBCO Source connector. ```bash confluent local load tibco --config tibco-source.json ``` 10. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status TibcoSourceConnector ``` 11. Confirm the messages were delivered to the `from-tibco-messages` topic in Kafka. ```bash confluent local consume from-tibco-messages --from-beginning ``` # The output topic in Kafka topic=connect-test ``` If choosing to use this tutorial without Schema Registry, you must also specify the `key.converter` and `value.converter` properties to use `org.apache.kafka.connect.json.JsonConverter`. This will override the converters’ settings for this connector only. You are now ready to load the connector, but before you do that, update the file with some sample data. Note that the connector configuration specifies a relative path for the file, so you should create the file in the same directory that you will run the Kafka Connect worker from. ```bash for i in {1..3}; do echo "log line $i"; done > test.txt ``` Next, start an instance of the FileStreamSourceConnector using the configuration file you defined previously. You can easily do this from the command line using the following commands: ```bash confluent local load file-source { "name": "file-source", "config": { "connector.class": "FileStreamSource", "tasks.max": "1", "file": "test.txt", "topics": "connect-test", "name": "file-source" }, "tasks": [] } ``` Upon success it will print a snapshot of the connector’s configuration. To confirm which connectors are loaded any time, run: ```bash confluent local status [ "file-source" ] ``` You will get a list of all the loaded connectors in this worker. The same command supplied with the connector name will give you the status of this connector, including an indication of whether the connector has started successfully or has encountered a failure. 
For instance, running this command on the connector you just loaded would give you the following: ```bash confluent local status file-source { "name": "file-source", "connector": { "state": "RUNNING", "worker_id": "192.168.10.1:8083" }, "tasks": [ { "state": "RUNNING", "id": 0, "worker_id": "192.168.10.1:8083" } ] } ``` Soon after the connector starts, each of the three lines in our log file should be delivered to Kafka, having registered a schema with Schema Registry. One way to validate that the data is there is to use the console consumer in another console to inspect the contents of the topic: ```bash kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic connect-test --from-beginning "log line 1" "log line 2" "log line 3" ``` Note that `kafka-avro-console-consumer` is used because the data has been stored in Kafka using Avro format. This consumer uses the Avro converter that is bundled with Schema Registry in order to properly look up the schema for the Avro data. ### Examples The following example shows a line added that overrides the default worker `compression.type` property. After the connector configuration is updated, the [Replicator](../../multi-dc-deployments/replicator/index.md#replicator-detail) connector will use gzip compression. ```json { "name": "Replicator", "config": { "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector", "topic.whitelist": "_schemas", "topic.rename.format": "${topic}.replica", "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter", "value.converter": "io.confluent.connect.replicator.util.ByteArrayConverter", "src.kafka.bootstrap.servers": "srcKafka1:10091", "dest.kafka.bootstrap.servers": "destKafka1:11091", "tasks.max": "1", "producer.override.compression.type": "gzip", "confluent.topic.replication.factor": "1", "schema.subject.translator.class": "io.confluent.connect.replicator.schemas.DefaultSubjectTranslator", "schema.registry.topic": "_schemas", "schema.registry.url": "http://destSchemaregistry:8086" } } ``` The following example shows a line added that overrides the default worker `auto.offset.reset` property. After the connector configuration is updated, the [Elasticsearch](https://docs.confluent.io/kafka-connectors/elasticsearch/current/) connector will use `latest` instead of the default connect worker property value `earliest`. ```json { "name": "Elasticsearch", "config": { "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector", "topics": "orders", "consumer.override.auto.offset.reset": "latest", "tasks.max": 1, "connection.url": "http://elasticsearch:9200", "type.name": "kafkaconnect", "key.ignore": "true", "schema.ignore": "false", "transforms": "renameTopic", "transforms.renameTopic.type": "org.apache.kafka.connect.transforms.RegexRouter", "transforms.renameTopic.regex": "orders", "transforms.renameTopic.replacement": "orders-latest" } } ``` When the worker override configuration property is set to `connector.client.config.override.policy=Principal`, each of the connectors can use a different service principal.
The following example shows a sink connector service principal override when implementing [Role-Based Access Control (RBAC)](../rbac/connect-rbac-connectors.md#connect-rbac-connectors): ```none consumer.override.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \ username="" \ password="" \ metadataServerUrls=""; ``` ### Bouncy Castle FIPS provider To generate a PKCS#8-format encrypted keypair that works with the Bouncy Castle FIPS provider, run the following `openssl` commands. The first command generates an RSA private key in PKCS#8 format. The second command converts it to an encrypted PKCS#8 key and prompts for a password. ```shell $ openssl genpkey -algorithm RSA -out private_key.pem $ openssl pkcs8 -topk8 -in private_key.pem \ -out pvtkey-pkcs8-aes256.pem -v2 aes256 ``` To see a sample configuration for the Bouncy Castle FIPS provider, expand the following section:
Example configuration for Bouncy Castle FIPS provider:

      advertised.listeners=INTERNAL://REDACTED.us-west-2.compute.internal:9092,BROKER://REDACTED.us-west-2.compute.internal:9091,CUSTOM://REDACTED.us-west-2.compute.amazonaws.com:9093,TOKEN://REDACTED.us-west-2.compute.amazonaws.com:9094
      authorizer.class.name=io.confluent.kafka.security.authorizer.ConfluentServerAuthorizer
      broker.id=1
      confluent.ansible.managed=true
      confluent.authorizer.access.rule.providers=CONFLUENT
      confluent.balancer.topic.replication.factor=3
      confluent.basic.auth.credentials.source=USER_INFO
      confluent.basic.auth.user.info=schema-registry:password
      confluent.license.topic=_confluent-command
      confluent.license.topic.replication.factor=3
      confluent.metadata.server.advertised.listeners=https://REDACTED.us-west-2.compute.internal:8090
      confluent.metadata.server.authentication.method=BEARER
      confluent.metadata.server.listeners=https://0.0.0.0:8090
      confluent.metadata.server.sni.host.check.enabled=false
      confluent.metadata.server.ssl.key.password=REDACTED
      confluent.metadata.server.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore_BCFKS.bcfks
      confluent.metadata.server.ssl.keystore.password=REDACTED
      confluent.metadata.server.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore_BCFKS.bcfks
      confluent.metadata.server.ssl.truststore.password=REDACTED
      confluent.metadata.server.ssl.truststore.type=BCFKS
      confluent.metadata.server.ssl.keystore.type=BCFKS
      confluent.metadata.server.token.key.path=/var/ssl/private/encrypted_aes256_tokenKeypair.pem
      confluent.metadata.server.token.key.passphrase=REDACTED
      security.providers=io.confluent.kafka.security.fips.provider.BcFipsProviderCreator
      #confluent.metadata.server.security.providers=io.confluent.kafka.security.fips.provider.BcFipsProviderCreator
      confluent.metadata.server.token.max.lifetime.ms=3600000
      confluent.metadata.server.token.signature.algorithm=RS256
      confluent.metadata.topic.replication.factor=3
      confluent.metrics.reporter.bootstrap.servers=REDACTED.us-west-2.compute.internal:9091,REDACTED.us-west-2.compute.internal:9091,REDACTED.us-west-2.compute.internal:9091
      confluent.metrics.reporter.security.protocol=SSL
      confluent.metrics.reporter.ssl.key.password=REDACTED
      confluent.metrics.reporter.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore_BCFKS.bcfks
      confluent.metrics.reporter.ssl.keystore.password=REDACTED
      confluent.metrics.reporter.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore_BCFKS.bcfks
      confluent.metrics.reporter.ssl.truststore.password=REDACTED
      confluent.metrics.reporter.ssl.keystore.type=BCFKS
      confluent.metrics.reporter.ssl.truststore.type=BCFKS
      confluent.metrics.reporter.topic.replicas=3
      confluent.schema.registry.url=https://REDACTED.us-west-2.compute.internal:8081
      confluent.security.event.logger.exporter.kafka.topic.replicas=3
      confluent.ssl.key.password=REDACTED
      confluent.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore_BCFKS.bcfks
      confluent.ssl.keystore.password=REDACTED
      confluent.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore_BCFKS.bcfks
      confluent.ssl.truststore.password=REDACTED
      confluent.support.customer.id=anonymous
      confluent.support.metrics.enable=true
      group.initial.rebalance.delay.ms=3000
      inter.broker.listener.name=BROKER
      kafka.rest.bootstrap.servers=REDACTED.us-west-2.compute.internal:9092,REDACTED.us-west-2.compute.internal:9092,REDACTED.us-west-2.compute.internal:9092
      kafka.rest.client.security.protocol=SASL_SSL
      kafka.rest.client.ssl.key.password=REDACTED
      kafka.rest.client.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore_BCFKS.bcfks
      kafka.rest.client.ssl.keystore.password=REDACTED
      kafka.rest.client.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore_BCFKS.bcfks
      kafka.rest.client.ssl.truststore.password=REDACTED
      kafka.rest.client.ssl.keystore.type=BCFKS
      kafka.rest.client.ssl.truststore.type=BCFKS
      kafka.rest.confluent.metadata.basic.auth.user.info=pkcs8-7-5-x-74-test-cluster-main.kafka_erp:Confluent1!
      kafka.rest.confluent.metadata.bootstrap.server.urls=https://REDACTED.us-west-2.compute.internal:8090,https://REDACTED.us-west-2.compute.internal:8090,https://REDACTED.us-west-2.compute.internal:8090
      kafka.rest.confluent.metadata.http.auth.credentials.provider=BASIC
      kafka.rest.confluent.metadata.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore_BCFKS.bcfks
      kafka.rest.confluent.metadata.ssl.truststore.password=REDACTED
      kafka.rest.enable=true
      kafka.rest.kafka.rest.resource.extension.class=io.confluent.kafkarest.security.KafkaRestSecurityResourceExtension
      kafka.rest.public.key.path=/var/ssl/private/public.pem
      kafka.rest.rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler
      ldap.com.sun.jndi.ldap.read.timeout=3000
      ldap.group.member.attribute.pattern=uid=(.*),OU=rbac,DC=confluent,DC=io
      ldap.group.name.attribute=cn
      ldap.group.search.base=OU=rbac,DC=confluent,DC=io
      ldap.java.naming.factory.initial=com.sun.jndi.ldap.LdapCtxFactory
      ldap.java.naming.provider.url=ldap://ip-10-0-242-18.us-west-2.compute.internal:389
      ldap.java.naming.security.authentication=simple
      ldap.java.naming.security.credentials=Confluent1!
      ldap.java.naming.security.principal=uid=mds,OU=rbac,DC=confluent,DC=io
      ldap.user.memberof.attribute.pattern=cn=(.*),OU=rbac,DC=confluent,DC=io
      ldap.user.name.attribute=uid
      ldap.user.object.class=account
      ldap.user.search.base=OU=rbac,DC=confluent,DC=io
      listener.name.broker.ssl.client.auth=required
      listener.name.broker.ssl.key.password=REDACTED
      listener.name.broker.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore_BCFKS.bcfks
      listener.name.broker.ssl.keystore.password=REDACTED
      listener.name.broker.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore_BCFKS.bcfks
      listener.name.broker.ssl.truststore.password=REDACTED
      listener.name.custom.ssl.client.auth=required
      listener.name.custom.ssl.key.password=REDACTED
      listener.name.custom.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore_BCFKS.bcfks
      listener.name.custom.ssl.keystore.password=REDACTED
      listener.name.custom.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore_BCFKS.bcfks
      listener.name.custom.ssl.truststore.password=REDACTED
      listener.name.internal.oauthbearer.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required publicKeyPath="/var/ssl/private/public.pem";
      listener.name.internal.oauthbearer.sasl.login.callback.handler.class=io.confluent.kafka.server.plugins.auth.token.TokenBearerServerLoginCallbackHandler
      listener.name.internal.oauthbearer.sasl.server.callback.handler.class=io.confluent.kafka.server.plugins.auth.token.TokenBearerValidatorCallbackHandler
      listener.name.internal.principal.builder.class=io.confluent.kafka.security.authenticator.OAuthKafkaPrincipalBuilder
      listener.name.internal.sasl.enabled.mechanisms=OAUTHBEARER
      listener.name.internal.ssl.client.auth=required
      listener.name.internal.ssl.key.password=REDACTED
      listener.name.internal.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore.pk12
      listener.name.internal.ssl.keystore.password=REDACTED
      listener.name.internal.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore.pk12
      listener.name.internal.ssl.truststore.password=REDACTED
      listener.name.token.oauthbearer.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required publicKeyPath="/var/ssl/private/public.pem";
      listener.name.token.oauthbearer.sasl.login.callback.handler.class=io.confluent.kafka.server.plugins.auth.token.TokenBearerServerLoginCallbackHandler
      listener.name.token.oauthbearer.sasl.server.callback.handler.class=io.confluent.kafka.server.plugins.auth.token.TokenBearerValidatorCallbackHandler
      listener.name.token.principal.builder.class=io.confluent.kafka.security.authenticator.OAuthKafkaPrincipalBuilder
      listener.name.token.sasl.enabled.mechanisms=OAUTHBEARER
      listener.name.token.ssl.client.auth=required
      listener.name.token.ssl.key.password=REDACTED
      listener.name.token.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore_BCFKS.bcfks
      listener.name.token.ssl.keystore.password=REDACTED
      listener.name.token.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore_BCFKS.bcfks
      listener.name.token.ssl.truststore.password=REDACTED
      listener.security.protocol.map=INTERNAL:SASL_SSL,BROKER:SSL,CUSTOM:SSL,TOKEN:SASL_SSL
      listeners=INTERNAL://:9092,BROKER://:9091,CUSTOM://:9093,TOKEN://:9094
      log.dirs=/var/lib/kafka/data
      log.retention.check.interval.ms=300000
      log.retention.hours=168
      log.segment.bytes=1073741824
      metric.reporters=io.confluent.metrics.reporter.ConfluentMetricsReporter
      num.io.threads=16
      num.network.threads=8
      num.partitions=1
      num.recovery.threads.per.data.dir=2
      offsets.topic.replication.factor=3
      sasl.enabled.mechanisms=OAUTHBEARER
      socket.receive.buffer.bytes=102400
      socket.request.max.bytes=104857600
      socket.send.buffer.bytes=102400
      ssl.key.password=REDACTED
      ssl.keystore.location=/var/ssl/private/kafka_broker.keystore_BCFKS.bcfks
      ssl.keystore.password=REDACTED
      ssl.truststore.location=/var/ssl/private/kafka_broker.truststore_BCFKS.bcfks
      ssl.truststore.password=REDACTED
      super.users=User:mds;User:C=US,ST=Ca,L=PaloAlto,O=CONFLUENT,OU=TEST,CN=kafka_broker
      transaction.state.log.min.isr=2
      transaction.state.log.replication.factor=3
      zookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
      zookeeper.connect=REDACTED.us-west-2.compute.internal:2182,REDACTED.us-west-2.compute.internal:2182,REDACTED.us-west-2.compute.internal:2182
      zookeeper.connection.timeout.ms=18000
      zookeeper.ssl.client.enable=true
      zookeeper.ssl.keystore.location=/var/ssl/private/kafka_broker.keystore.pk12
      zookeeper.ssl.keystore.password=REDACTED
      zookeeper.ssl.truststore.location=/var/ssl/private/kafka_broker.truststore.pk12
      zookeeper.ssl.truststore.password=REDACTED
      ssl.keystore.type=BCFKS
      ssl.truststore.type=BCFKS
      listener.name.internal.ssl.keystore.type=PKCS12
      listener.name.internal.ssl.truststore.type=PKCS12
      listener.name.broker.ssl.keystore.type=BCFKS
      listener.name.broker.ssl.truststore.type=BCFKS
      listener.name.custom.ssl.keystore.type=BCFKS
      listener.name.custom.ssl.truststore.type=BCFKS
      listener.name.token.ssl.keystore.type=BCFKS
      listener.name.token.ssl.truststore.type=BCFKS
      zookeeper.ssl.keystore.type=PKCS12
      zookeeper.ssl.truststore.type=PKCS12
      confluent.ssl.keystore.type=BCFKS
      confluent.ssl.truststore.type=BCFKS
   
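The sample above references BCFKS keystores and an encrypted PKCS#8 token key pair like the one generated with the `openssl` commands shown earlier. As a rough sanity check (the keystore path and the location of the Bouncy Castle FIPS JAR below are assumptions about your layout, not part of the sample), you can list a BCFKS keystore with `keytool` by pointing it at the Bouncy Castle FIPS provider:

```bash
# Hypothetical check that a BCFKS keystore is readable with the BC FIPS provider;
# adjust the keystore path and the bc-fips JAR location to your environment.
keytool -list \
  -keystore /var/ssl/private/kafka_broker.keystore_BCFKS.bcfks \
  -storetype BCFKS \
  -providerclass org.bouncycastle.jcajce.provider.BouncyCastleFipsProvider \
  -providerpath /usr/share/java/bc-fips.jar
```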
# -d Enable debug logging from confluent_kafka import KafkaError, KafkaException, version from confluent_kafka import Producer, Consumer import json import logging import argparse import uuid import sys import re class CommandRecord (object): def __init__(self, stmt): self.stmt = stmt def __str__(self): return "({})".format(self.stmt) @classmethod def deserialize(cls, binstr): d = json.loads(binstr) return CommandRecord(d['statement']) class CommandConsumer(object): def __init__(self, ksqlServiceId, conf): self.consumer = Consumer(conf) self.topic = '_confluent-ksql-{}_command_topic'.format(ksqlServiceId) def consumer_run(self): max_offset = -1001 def latest_offsets(consumer, partitions): nonlocal max_offset for p in partitions: high_water = consumer.get_watermark_offsets(p)[1] if high_water >= max_offset: max_offset = high_water logging.debug("Max offset in command topic = %d", max_offset) self.consumer.subscribe([self.topic], on_assign=latest_offsets) self.msg_cnt = 0 self.msg_err_cnt = 0 stmts = {} try: while True: msg = self.consumer.poll(0.2) if msg is None: continue if msg.error() is not None: print("consumer: error: {}".format(msg.error())) self.consumer_err_cnt += 1 continue try: #print("Read msg with offset ", msg.offset()) self.msg_cnt += 1 record = CommandRecord.deserialize(msg.value()) #print(record) # match statements CREATE/DROP STREAM, CREATE/DROP TABLE match = re.search(r'(?:create|drop) (?:stream|table) ([a-zA-z0-9-]+?)(:?\(|AS|\s|;)', record.stmt, re.I) if match: name = match.group(1).upper() if name == "KSQL_PROCESSING_LOG": continue if name not in stmts: stmts[name] = [] stmts[name].append(record.stmt) # match statements TERMINATE query match2 = re.search(r'(?:terminate) (?:ctas|csas)_(.+?)_', record.stmt, re.I) if match2: name = match2.group(1).upper() stmts[name].append(record.stmt) # match statements INSERT INTO stream or table match3 = re.search(r'(?:insert into) ([a-zA-z0-9-]+?)(:?\(|\s|\()', record.stmt, re.I) if match3: name = match3.group(1).upper() stmts[name].append(record.stmt) #match statements CREATE TYPE match4 = re.search(r'(?:create|drop) type ([a-zA-z0-9-]+?)(:?AS|\s|;)', record.stmt, re.I) if match4: name = match4.group(1).upper() if name not in stmts: stmts[name] = [] stmts[name].append(record.stmt) if match is None and match2 is None and match3 is None and match4 is None: if 'UNRECOGNIZED' not in stmts: stmts['UNRECOGNIZED'] = [] stmts['UNRECOGNIZED'].append(record.stmt) # High watermark is +1 from last offset if msg.offset() >= max_offset-1: break except ValueError as ex: print("consumer: Failed to deserialize message in " "{} [{}] at offset {} (headers {}): {}".format( msg.topic(), msg.partition(), msg.offset(), msg.headers(), ex)) self.msg_err_cnt += 1 except KeyboardInterrupt: pass finally: self.consumer.close() logging.debug("Consumed {} messages, erroneous message = {}.".format(self.msg_cnt, self.msg_err_cnt)) outer_json = [] for key, value in stmts.items(): inner_json = {} inner_json['subject'] = key inner_json['statements'] = value outer_json.append(inner_json) print(json.dumps(outer_json )) if __name__ == '__main__': parser = argparse.ArgumentParser(description="Command topic consumer that dumps CREATE, DROP and TERMINATE queries to "+ "stdout. If no arguments are provided, default values are used. Default broker is " "'localhost:9092'. Default ksqlServiceId is 'default_'. You may optionally provide a configuration file with "+ "broker specific configuration parameters. Every run of this script will consume the topic from the beginning. 
") parser.add_argument('-f', dest='confFile', type=argparse.FileType('r'), help='Configuration file (configProp=value format)') parser.add_argument('-b', dest='brokers', type=str, default=None, help='Bootstrap servers') parser.add_argument('-k', dest='ksqlServiceId', type=str, default=None, help='KsqlDB service ID') parser.add_argument("-d", dest='debug', action="store_true", default=False, help="Enable debug logging") args = parser.parse_args() if args.debug: logging.basicConfig(stream=sys.stderr, level=logging.DEBUG) conf = dict() if args.confFile is not None: # Parse client configuration file for line in args.confFile: line = line.strip() if len(line) == 0 or line[0] == '#': continue i = line.find('=') if i <= 0: raise ValueError("Configuration lines must be `name=value..`, not {}".format(line)) name = line[:i] value = line[i+1:] conf[name] = value if args.brokers is not None: # Overwrite any brokers specified in configuration file with # brokers from -b command line argument conf['bootstrap.servers'] = args.brokers elif 'bootstrap.servers' not in conf: conf['bootstrap.servers'] = 'localhost:9092' if args.ksqlServiceId is None: args.ksqlServiceId = 'default_' conf['auto.offset.reset'] = 'earliest' conf['enable.auto.commit']= 'False' conf['client.id'] = 'commandClient' # Generate a unique group.id conf['group.id'] = 'commandTopicConsumer.py-{}'.format(uuid.uuid4()) c = CommandConsumer(args.ksqlServiceId, conf) c.consumer_run() ``` If you prefer to recover the schema manually, use the following steps. 1. Capture streams SQL: 2. Run `list streams extended;` to list all of the streams. 3. Grab the SQL statement that created each stream from the output, ignoring `KSQL_PROCESSING_LOG`. 4. Capture tables SQL: 5. Run `list tables extended;` to list all of the tables. 6. Grab the SQL statement that created each table from the output. 7. Capture custom types SQL: 8. Run `list types;` to list all of the custom types. 9. Convert the output into `CREATE TYPE AS ` syntax by grabbing the name from the first column and the schema from the second column of the output. 10. Order by dependency: you’ll now have the list of SQL statements to rebuild the schema, but they are not yet ordered in terms of dependencies. You will need to reorder the statements to ensure each statement comes after any other statements it depends on. 11. Update the script to take into account any changes in syntax or functionality between the old and new clusters. The release notes can help here. It can also be useful to have a test ksqlDB cluster, pointing to a different test Kafka cluster, where you can try running the script to get feedback on any errors. Note: you may want to temporarily add `PARTITIONS=1` to the `WITH` clause of any `CREATE TABLE` or `CREATE STREAM` command, so that the command will run without requiring you to first create the necessary topics in the test Kafka cluster. 12. Stop the old cluster: if you do not do so then both the old and new cluster will be publishing to sink topics, resulting in undefined behavior. 13. Build the schema in the new instance. Now you have the SQL file you can run this against the new cluster to build a copy of the schema. This is best achieved with the [RUN SCRIPT](../../developer-guide/ksqldb-reference/run-script.md#ksqldb-reference-run-script) command, which takes a SQL file as an input. 
### Initial Setup To get started with the `ksql-migrations` tool, use the `ksql-migrations new-project` command to set up the required directory structure and create a config file for using the migrations tool. ```none ksql-migrations new-project [--] ``` The two required arguments are the path that will be used as the root directory for your new migrations project, and your ksqlDB server URL. ```bash ksql-migrations new-project /my/migrations/project/path http://localhost:8088 ``` Your output should resemble: ```none Creating new migrations project at /my/migrations/project/path Creating directory: /my/migrations/project/path Creating directory: /my/migrations/project/path/migrations Creating file: /my/migrations/project/path/ksql-migrations.properties Writing to config file: ksql.server.url=http://localhost:8088 ... Migrations project directory created successfully Execution time: 0.0080 seconds ``` This command creates a config file, named `ksql-migrations.properties`, in the specified directory, and also creates an empty `/migrations` subdirectory. The config file is initialized with the ksqlDB server URL passed as part of the command. As a convenience, the config file is also initialized with default values for other [migrations tool configurations](#ksqldb-manage-metadata-schemas-config-reference) commented out. These additional, optional configurations include configs required to access secure ksqlDB servers, such as credentials for HTTP basic authentication or TLS keystores and truststores, as well as optional configurations specific to the migrations tool. See the [config reference](#ksqldb-manage-metadata-schemas-config-reference) for details on individual configs. See [here](#ksqldb-manage-metadata-schemas-connect-to-cloud) for the configs required to connect to a Confluent Cloud ksqlDB cluster. ## Step 1: Create a docker-compose file The minimum set of services for running ksqlDB comprises a Kafka broker and ksqlDB Server. The ksqlDB CLI is required for developing applications with SQL code. The following `docker-compose` file specifies the Docker images that you need for a minimal local environment: - confluentinc/cp-kafka - confluentinc/cp-ksqldb-server - confluentinc/cp-ksqldb-cli 1. Run the following command to create a file named `docker-compose.yml`. ```bash touch docker-compose.yml ``` 2. Copy the following YAML into docker-compose.yml and save the file. 
```yaml version: '2' services: broker: image: confluentinc/cp-kafka:8.1.0 hostname: broker container_name: broker ports: - "9092:9092" - "9101:9101" environment: KAFKA_NODE_ID: 1 KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 'CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT' KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092' KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0 KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1 KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1 KAFKA_JMX_PORT: 9101 KAFKA_JMX_HOSTNAME: localhost KAFKA_PROCESS_ROLES: 'broker,controller' KAFKA_CONTROLLER_QUORUM_VOTERS: '1@broker:29093' KAFKA_LISTENERS: 'PLAINTEXT://broker:29092,CONTROLLER://broker:29093,PLAINTEXT_HOST://0.0.0.0:9092' KAFKA_INTER_BROKER_LISTENER_NAME: 'PLAINTEXT' KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER' KAFKA_LOG_DIRS: '/tmp/kraft-combined-logs' # Replace CLUSTER_ID with a unique base64 UUID using "bin/kafka-storage.sh random-uuid" # See https://docs.confluent.io/kafka/operations-tools/kafka-tools.html#kafka-storage-sh CLUSTER_ID: 'MkU3OEVBNTcwNTJENDM2Qk' ksqldb-server: image: confluentinc/cp-ksqldb-server:8.1.0 hostname: ksqldb-server container_name: ksqldb-server depends_on: - broker ports: - "8088:8088" environment: KSQL_CONFIG_DIR: "/etc/ksql" KSQL_BOOTSTRAP_SERVERS: "broker:29092" KSQL_HOST_NAME: ksqldb-server KSQL_LISTENERS: "http://0.0.0.0:8088" KSQL_CACHE_MAX_BYTES_BUFFERING: 0 KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_KSQL_CONNECT_URL: "http://connect:8083" KSQL_KSQL_LOGGING_PROCESSING_TOPIC_REPLICATION_FACTOR: 1 KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: 'true' KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: 'true' ksqldb-cli: image: confluentinc/cp-ksqldb-cli:8.1.0 container_name: ksqldb-cli depends_on: - broker - ksqldb-server entrypoint: /bin/sh tty: true ``` #### IMPORTANT ksqlDB logs only error messages and doesn’t use the log level from the log4.properties file, which means that you can’t change the log level of the processing log. - For local deployments, edit the [log4j.properties](https://github.com/confluentinc/ksql/blob/master/config/log4j.properties) config file to assign Log4J properties. - For Docker deployments, set the corresponding environment variables. For more information, see [Configure ksqlDB with Docker](../operate-and-deploy/installation/install-ksqldb-with-docker.md#ksqldb-install-configure-with-docker) and [Configure Docker Logging](../../installation/docker/operations/logging.md#docker-operations-logging). All entries are written under the `processing` logger hierarchy. Restart the ksqlDB Server for your configuration changes to take effect. 
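If you are using the Docker-based setup from the previous step, restarting the server after a logging change is typically just a container restart; the service name below assumes the `ksqldb-server` service defined in the compose file above:

```bash
# Restart only the ksqlDB Server container so the new logging settings take effect.
docker-compose restart ksqldb-server
```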
The following example shows how to configure the processing log to emit all events at ERROR level or higher to an appender that writes to `stdout`: ```properties log4j.appender.stdout=org.apache.log4j.ConsoleAppender log4j.appender.stdout.layout=org.apache.log4j.PatternLayout log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c:%L)%n log4j.logger.processing=ERROR, stdout log4j.additivity.processing=false ``` If you’re using a Docker deployment, set the following environment variables in your docker-compose.yml: ```properties environment: # --- ksqlDB Server log config --- KSQL_LOG4J_ROOT_LOGLEVEL: "ERROR" KSQL_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR" # --- ksqlDB processing log config --- KSQL_LOG4J_PROCESSING_LOG_BROKERLIST: kafka:29092 KSQL_LOG4J_PROCESSING_LOG_TOPIC: KSQL_KSQL_LOGGING_PROCESSING_TOPIC_NAME: KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true" KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true" ``` For more information, see [Create a log4J configuration](https://developer.confluent.io/tutorials/handling-deserialization-errors/ksql.html#create-a-log4j-configuration) in the [How to handle deserialization errors](https://developer.confluent.io/tutorials/handling-deserialization-errors/ksql.html) tutorial. For the full Docker example configuration, see the [Multi-node ksqlDB and Kafka Connect clusters](https://github.com/confluentinc/demo-scene/blob/master/multi-cluster-connect-and-ksql/docker-compose.yml) demo. ### NONE | Feature | Supported | |---------------------------------------------------------------------------------------------|-------------| | As value format | No | | As key format | Yes | | Multi-Column Keys | N/A | | [Schema Registry required](../operate-and-deploy/installation/server-config/avro-schema.md) | No | | [Schema inference](/reference/server-configuration#ksqlpersistencedefaultformatkey) | No | | [Single field wrapping](#ksqldb-serialization-formats-single-field-unwrapping) | No | | [Single field unwrapping](#ksqldb-serialization-formats-single-field-unwrapping) | No | The `NONE` format is a special marker format that is used to indicate ksqlDB should not attempt to deserialize that part of the Kafka record. Its main use is as the `KEY_FORMAT` of key-less streams, especially where a default key format that supports schema inference has been set via [ksql.persistence.default.format.key](server-configuration.md#ksqldb-reference-server-configuration-persistence-default-format-key). If the key format was not overridden, the server would attempt to load the key schema from the Schema Registry. If the schema existed, the key columns would be inferred from the schema, which may not be the intent. If the schema did not exist, the statement would be rejected. In such situations, the key format can be set to `NONE`: ```sql CREATE STREAM KEYLESS_STREAM ( VAL STRING ) WITH ( KEY_FORMAT='NONE', VALUE_FORMAT='JSON', KAFKA_TOPIC='foo' ); ``` Any statement that sets the key format to `NONE` and has key columns defined will result in an error. If a `CREATE TABLE AS` or `CREATE STREAM AS` statement has a source with a key format of `NONE`, but the newly created table or stream has key columns, then you can explicitly define the key format in the `WITH` clause; otherwise, the default key format, as set in [ksql.persistence.default.format.key](server-configuration.md#ksqldb-reference-server-configuration-persistence-default-format-key), is used. 
Conversely, a `CREATE STREAM AS` statement that removes the key columns, i.e. via `PARTITION BY null` will automatically set the key format to `NONE`. ```sql -- keyless stream with NONE key format: CREATE STREAM KEYLESS_STREAM ( VAL STRING ) WITH ( KEY_FORMAT='NONE', VALUE_FORMAT='JSON', KAFKA_TOPIC='foo' ); -- Table created from stream with explicit key format declared in WITH clause: CREATE TABLE T WITH (KEY_FORMAT='KAFKA') AS SELECT VAL, COUNT(*) FROM KEYLESS_STREAM GROUP BY VAL; -- or, using the default key format set in the ksql.persistence.default.format.key config: CREATE TABLE T AS SELECT VAL, COUNT(*) FROM KEYLESS_STREAM GROUP BY VAL; ``` ### Start the stack Next, set up and launch the services in the stack. But before you bring it up, you need to make a few changes to the way that Postgres launches so that it works well with Debezium. Debezium has dedicated [documentation](https://debezium.io/documentation/reference/1.1/connectors/postgresql.html) on this if you’re interested, but this guide covers just the essentials. To simplify some of this, you launch a Postgres Docker container [extended by Debezium](https://hub.docker.com/r/debezium/postgres) to handle some of the customization. Also, you must create an additional configuration file at `postgres/custom-config.conf` with the following content: ```none listen_addresses = '*' wal_level = 'logical' max_wal_senders = 1 max_replication_slots = 1 ``` This sets up Postgres so that Debezium can watch for changes as they occur. With the Postgres configuration file in place, create a `docker-compose.yml` file that defines the services to launch. You may need to increase the amount of memory that you give to Docker when you launch it: ```yaml version: '2' services: mongo: image: mongo:4.2.5 hostname: mongo container_name: mongo ports: - "27017:27017" environment: MONGO_INITDB_ROOT_USERNAME: mongo-user MONGO_INITDB_ROOT_PASSWORD: mongo-pw MONGO_REPLICA_SET_NAME: my-replica-set command: --replSet my-replica-set --bind_ip_all postgres: image: debezium/postgres:12 hostname: postgres container_name: postgres ports: - "5432:5432" environment: POSTGRES_USER: postgres-user POSTGRES_PASSWORD: postgres-pw POSTGRES_DB: customers volumes: - ./postgres/custom-config.conf:/etc/postgresql/postgresql.conf command: postgres -c config_file=/etc/postgresql/postgresql.conf elastic: image: elasticsearch:7.6.2 hostname: elastic container_name: elastic ports: - "9200:9200" - "9300:9300" environment: discovery.type: single-node broker: image: confluentinc/cp-kafka:8.1.0 hostname: broker container_name: broker ports: - "29092:29092" environment: KAFKA_BROKER_ID: 1 KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092 KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0 KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1 KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1 schema-registry: image: confluentinc/cp-schema-registry:8.1.0 hostname: schema-registry container_name: schema-registry depends_on: - broker ports: - "8081:8081" environment: SCHEMA_REGISTRY_HOST_NAME: schema-registry SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: "PLAINTEXT://broker:9092" ksqldb-server: image: confluentinc/cp-ksqldb-server:8.1.0 hostname: ksqldb-server container_name: ksqldb-server depends_on: - broker - schema-registry ports: - "8088:8088" volumes: - "./confluent-hub-components/:/usr/share/kafka/plugins/" environment: KSQL_LISTENERS: 
"http://0.0.0.0:8088" KSQL_BOOTSTRAP_SERVERS: "broker:9092" KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true" KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true" KSQL_CONNECT_GROUP_ID: "ksql-connect-cluster" KSQL_CONNECT_BOOTSTRAP_SERVERS: "broker:9092" KSQL_CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.storage.StringConverter" KSQL_CONNECT_VALUE_CONVERTER: "io.confluent.connect.avro.AvroConverter" KSQL_CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_CONNECT_CONFIG_STORAGE_TOPIC: "_ksql-connect-configs" KSQL_CONNECT_OFFSET_STORAGE_TOPIC: "_ksql-connect-offsets" KSQL_CONNECT_STATUS_STORAGE_TOPIC: "_ksql-connect-statuses" KSQL_CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1 KSQL_CONNECT_PLUGIN_PATH: "/usr/share/kafka/plugins" ksqldb-cli: image: confluentinc/cp-ksqldb-cli:8.1.0 container_name: ksqldb-cli depends_on: - broker - ksqldb-server entrypoint: /bin/sh tty: true ``` There are a couple things to notice here. The Postgres image mounts the custom configuration file that you wrote. Postgres adds these configuration settings into its system-wide configuration. The environment variables you gave it also set up a blank database called `customers`, along with a user named `postgres-user` that can access it. The compose file also sets up MongoDB as a replica set named `my-replica-set`. Debezium requires that MongoDB runs in this configuration to pick up changes from its oplog (see Debezium’s [documentation](https://debezium.io/documentation/reference/1.1/connectors/mongodb.html) on MongoDB). In this case, you’re just running a single-node replica set. Finally, note that the ksqlDB server image mounts the `confluent-hub-components` directory, too. The jar files that you downloaded need to be on the classpath of ksqlDB when the server starts up. Bring up the entire stack by running: ```bash docker-compose up ``` ### Create the transactions stream Connect to ksqlDB’s server by using its interactive CLI. Run the following command from your host: ```bash docker exec -it ksqldb-cli ksql http://ksqldb-server:8088 ``` Before you issue more commands, tell ksqlDB to start all queries from earliest point in each topic: ```sql SET 'auto.offset.reset' = 'earliest'; ``` We want to model a stream of credit card transactions from which we’ll look for anomalous activity. To do that, create a ksqlDB stream to represent the transactions. Each transaction has a few key pieces of information, like the card number, amount, and email address that it’s associated with. Because the specified topic (`transactions`) does not exist yet, ksqlDB creates it on your behalf. ```sql CREATE STREAM transactions ( tx_id VARCHAR KEY, email_address VARCHAR, card_number VARCHAR, timestamp VARCHAR, amount DECIMAL(12, 2) ) WITH ( kafka_topic = 'transactions', partitions = 8, value_format = 'avro', timestamp = 'timestamp', timestamp_format = 'yyyy-MM-dd''T''HH:mm:ss' ); ``` Notice that this stream is configured with a custom `timestamp` to signal that [event-time](../concepts/time-and-windows-in-ksqldb-queries.md#ksqldb-time-and-windows-event-time) should be used instead of [processing-time](../concepts/time-and-windows-in-ksqldb-queries.md#ksqldb-time-and-windows-processing-time). 
What this means is that when ksqlDB does time-related operations over the stream, it uses the `timestamp` column to measure time, not the current time of the operating system. This makes it possible to handle out-of-order events. The stream is also configured to use the `Avro` format for the value part of the underlying Kafka records that it generates. Because ksqlDB has been configured with Schema Registry (as part of the Docker Compose file), the schemas of each stream and table are centrally tracked. We’ll make use of this in our microservice later. ### Create topics and mirror data to on-premises 1. In Confluent Cloud, use the unified Confluent CLI to create a topic with one partition called `cloud-topic`. ```bash confluent kafka topic create cloud-topic --partitions 1 ``` 2. In another command window on Confluent Cloud, start a producer to send some data into `cloud-topic`. ```bash confluent kafka topic produce cloud-topic --cluster $CC_CLUSTER_ID ``` - Verify that the producer has started. Your output will resemble the following to show that the producer is ready. ```bash $ confluent kafka topic produce cloud-topic --cluster lkc-1vgo6 Starting Kafka Producer. Use Ctrl-C or Ctrl-D to exit. ``` - Type some entries of your choice into the producer window, hitting return after each entry to send. ```bash Riesling Pinot Blanc Verdejo ``` 3. Mirror the `cloud-topic` on Confluent Platform, using the command `kafka-mirrors --create --mirror-topic `. The following command establishes a mirror of the original `cloud-topic`, using the cluster link `from-cloud-link`. ```bash kafka-mirrors --create --mirror-topic cloud-topic --link from-cloud-link --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config ``` You should get this verification that the mirror topic was created. ```bash Created topic cloud-topic. ``` 4. On Confluent Platform, check the mirror topic status by running `kafka-mirrors --describe` on the `from-cloud-link`. ```bash kafka-mirrors --describe --link from-cloud-link --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config ``` Your output will show the status of any mirror topics on the specified link. ```bash Topic: cloud-topic LinkName: from-cloud-link LinkId: b1a56076-4d6f-45e0-9013-ff305abd0e54 MirrorTopic: cloud-topic State: ACTIVE StateTime: 2021-10-07 16:36:20 Partition: 0 State: ACTIVE DestLogEndOffset: 2 LastFetchSourceHighWatermark: 2 Lag: 0 TimeSinceLastFetchMs: 384566 ``` 5. Consume the data from the on-prem mirror topic. ```bash kafka-console-consumer --topic cloud-topic --from-beginning --bootstrap-server localhost:9092 --consumer.config $CONFLUENT_CONFIG/CP-command.config ``` Your output should match the entries you typed into the Confluent Cloud producer in step 8. ![image](multi-dc-deployments/cluster-linking/images/cluster-link-hybrid-produce-consume.png) 6. View the configuration of your cluster link: ```bash kafka-configs --describe --cluster-link from-cloud-link --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config ``` The output for this command is a list of configurations, partially shown in the following example. 
```bash Dynamic configs for cluster-link from-cloud-link are: metadata.max.age.ms=300000 sensitive=false synonyms={} reconnect.backoff.max.ms=1000 sensitive=false synonyms={} auto.create.mirror.topics.filters= sensitive=false synonyms={} ssl.engine.factory.class=null sensitive=false synonyms={} sasl.kerberos.ticket.renew.window.factor=0.8 sensitive=false synonyms={} reconnect.backoff.ms=50 sensitive=false synonyms={} consumer.offset.sync.ms=30000 sensitive=false synonyms={} ... link.mode=DESTINATION sensitive=false synonyms={} security.protocol=SASL_SSL sensitive=false synonyms={} acl.sync.ms=5000 sensitive=false synonyms={} ssl.keymanager.algorithm=SunX509 sensitive=false synonyms={} sasl.login.callback.handler.class=null sensitive=false synonyms={} replica.fetch.max.bytes=5242880 sensitive=false synonyms={} availability.check.consecutive.failure.threshold=5 sensitive=false synonyms={} sasl.login.refresh.window.jitter=0.05 sensitive=false synonyms={} ``` ## About prerequisites and command examples - These instructions assume you have a local installation of [Confluent Platform 7.0.0 or later](https://www.confluent.io/download/#confluent-platform), the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/installing.html), and Java 8, 11, or 17 (recommended). For details on Java requirements, see [Java](../../installation/system-requirements.md#sys-req-java) in System Requirements for Confluent Platform. If you are new to Confluent Platform, you may want to work through the [Quick Start for Confluent Platform](../../get-started/platform-quickstart.md#quickstart) first, and then return to this tutorial. - The examples assume that your properties files are in the default locations on your Confluent Platform installation, except as otherwise noted. This should make it easier to copy/paste example commands directly into your terminal in most cases. - With Confluent Platform is installed, Confluent CLI commands themselves can be run from any directory (`kafka-topics`, `kafka-console-producer`, `kafka-console-consumer`), but for commands that access properties files in `$CONFLUENT_HOME` (`kafka-server-start`), the examples show running these from within that directory. A reference for these open source utilities is provided in [Kafka Command-Line Interface (CLI) Tools](/kafka/operations-tools/kafka-tools.html). A reference for Confluent premium command line tools and utilities is provided in [CLI Tools for Confluent Platform](/platform/current/installation/cli-reference.html). - Confluent CLI commands can specify the bootstrap server at the beginning or end of the command: `kafka-topics --list --bootstrap-server localhost:9092` is the same as `kafka-topics --bootstrap-server localhost:9092 --list`. In these tutorials, the target bootstrap server is specified at the end of commands. The rest of the tutorial refers to `$CONFLUENT_HOME` to indicate your Confluent Platform install directory. Set this as an environment variable, for example: ```bash export CONFLUENT_HOME=$HOME/confluent-8.1.0 PATH=$CONFLUENT_HOME/bin:$PATH ``` ### Create consumer, producer, and replicator configuration files The Replicator executable script expects three configuration files: - Configuration for the origin cluster - Configuration for the destination cluster - Replicator configuration Create the following files in `$CONFLUENT_HOME/my-examples/`: 1. Configure the origin cluster in a new file named `consumer.properties`. ```none cp etc/kafka/consumer.properties my-examples/. 
``` Edit the file and make sure it contains the addresses of brokers from the **origin** cluster. The default broker list will match the origin cluster you started earlier. ```bash # Origin cluster connection configuration bootstrap.servers=localhost:9082 ``` 2. Configure the destination cluster in a new file named `producer.properties`. ```none cp etc/kafka/producer.properties my-examples/. ``` Edit the file and make sure it contains the addresses of brokers from the **destination** cluster. The default broker list will match the destination cluster you started earlier. ```bash # Destination cluster connection configuration bootstrap.servers=localhost:9092 ``` 3. Define the Replicator configuration in a new file named `replication.properties` for the Connect worker. This quick start shows a configuration for `topic.rename.format` but any of the [Replicator Configuration Reference for Confluent Platform](configuration_options.md#replicator-config-options) that are not connection related can be supplied in this file. ```bash # Replication configuration topic.rename.format=${topic}.replica replication.factor=1 config.storage.replication.factor=1 offset.storage.replication.factor=1 status.storage.replication.factor=1 confluent.topic.replication.factor=1 ``` #### Example Consumer Code By default, each record is deserialized into an Avro `GenericRecord`, but in this tutorial the record should be deserialized using the application’s code-generated `Payment` class. Therefore, configure the deserializer to use Avro `SpecificRecord`, i.e., `SPECIFIC_AVRO_READER_CONFIG` should be set to `true`. For example: ```java ... import io.confluent.kafka.serializers.KafkaAvroDeserializer; ... props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class); props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class); props.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, true); ... KafkaConsumer consumer = new KafkaConsumer<>(props)); consumer.subscribe(Collections.singletonList(TOPIC)); while (true) { ConsumerRecords records = consumer.poll(100); for (ConsumerRecord record : records) { String key = record.key(); Payment value = record.value(); } } ... ``` Because the `pom.xml` includes `avro-maven-plugin`, the `Payment` class is automatically generated during compile. In this example, the connection information to the Kafka brokers and Schema Registry is provided by the configuration file that is passed into the code, but if you want to specify the connection information directly in the client application, see [this java template](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs/java_producer_consumer.delta). For a full Java consumer example, refer to [the consumer example](https://github.com/confluentinc/examples/tree/latest/clients/avro/src/main/java/io/confluent/examples/clients/basicavro/ConsumerExample.java). ### Authorizing Access to the Schemas Topic If you enable [Kafka authorization](../../security/authorization/acls/overview.md#kafka-authorization), you must grant the Schema Registry service principal the ability to perform the following [operations on the specified resources](../../security/authorization/acls/overview.md#acl-format-operations-resources): - `Read` and `Write` access to the internal **\_schemas** topic. This ensures that only authorized users can make changes to the topic. 
- `DescribeConfigs` on the schemas topic to verify that the topic exists - `describe topic` on the schemas topic, giving the Schema Registry service principal the ability to list the schemas topic - `DescribeConfigs` on the internal consumer offsets topic - Access to the Schema Registry cluster (`group`) - `Create` permissions on the Kafka cluster ```bash export KAFKA_OPTS="-Djava.security.auth.login.config=" bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --producer --consumer --topic _schemas --group schema-registry bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --operation DescribeConfigs --topic _schemas bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --operation Describe --topic _schemas bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --operation Read --topic _schemas bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --operation Write --topic _schemas bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --operation Describe --topic __consumer_offsets bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --operation Create --cluster kafka-cluster ``` If you are using the [Schema Registry ACL Authorizer for Confluent Platform](../../confluent-security-plugins/schema-registry/authorization/sracl_authorizer.md#confluentsecurityplugins-sracl-authorizer), you also need permissions to `Read`, `Write`, and `DescribeConfigs` on the internal **\_schemas_acl** topic: ```bash bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --producer --consumer --topic _schemas_acl --group schema-registry bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --operation Read --topic _schemas_acl bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --operation Write --topic _schemas_acl bin/kafka-acls --bootstrap-server localhost:9092 --command-config adminclient-configs.conf --add \ --allow-principal 'User:' --allow-host '*' \ --operation DescribeConfigs --topic _schemas_acl ``` ### Grant roles for the Schema Registry service principal In these steps, you use the Confluent CLI to log on to MDS and create the Schema Registry service principal . After you have these roles set up, you can use the Confluent CLI to manage Schema Registry users. For this example, assume the commands use the MDS server credentials, URLs, and property values you set up on your local Schema Registry properties file. (Optionally, you can use a [registered cluster name](#sr-use-registred-cluster-name) in your role bindings.) 1. Log on to MDS. ```bash confluent login --url ://: ``` 2. 
As a prerequisite to granting additional access, grant permission to create the topic `_schema_encoders`, which serves as the `metadata.encoder.topic` as described in [Schema Registry Configuration Reference for Confluent Platform](../installation/config.md#schemaregistry-config).

   ```bash
   confluent iam rbac role-binding create \
     --principal User: \
     --role ResourceOwner \
     --resource Topic:<_schema_encoders> \
     --kafka-cluster
   ```

   For example:

   ```bash
   confluent iam rbac role-binding create \
     --principal User:jack-sr \
     --role ResourceOwner \
     --resource Topic:_schema_encoders \
     --kafka-cluster my-kafka-cluster-ID
   ```

3. Grant the user the role `SecurityAdmin` on the Schema Registry cluster.

   ```bash
   confluent iam rbac role-binding create \
     --role SecurityAdmin \
     --principal User: \
     --kafka-cluster \
     --schema-registry-cluster
   ```

4. Use the command `confluent iam rbac role-binding list` to view the role you just created.

   ```bash
   confluent iam rbac role-binding list \
     --principal User: \
     --kafka-cluster \
     --schema-registry-cluster
   ```

   For example, here is a listing for a user “jack-sr” granted the `SecurityAdmin` role on “schema-registry-cool-cluster”, connecting to MDS through a Kafka cluster `my-kafka-cluster-ID`:

   ```bash
   confluent iam rbac role-binding list \
     --principal User:jack-sr \
     --kafka-cluster my-kafka-cluster-ID \
     --schema-registry-cluster schema-registry-cool-cluster

        Role       | ResourceType | Name | PatternType
   +---------------+--------------+------+-------------+
     SecurityAdmin | Cluster      |      |
   ```

5. Grant the user the role `ResourceOwner` on the group that Schema Registry nodes use to coordinate across the cluster.

   ```bash
   confluent iam rbac role-binding create \
     --principal User: \
     --role ResourceOwner \
     --resource Group: \
     --kafka-cluster
   ```

   For example:

   ```bash
   confluent iam rbac role-binding create \
     --principal User:jack-sr \
     --role ResourceOwner \
     --resource Group:schema-registry-cool-cluster \
     --kafka-cluster my-kafka-cluster-ID
   ```

6. Grant the user the role `ResourceOwner` on the Kafka topic that Schema Registry uses to store its schemas.

   ```bash
   confluent iam rbac role-binding create \
     --principal User: \
     --role ResourceOwner \
     --resource Topic: \
     --kafka-cluster
   ```

   For example:

   ```bash
   confluent iam rbac role-binding create \
     --principal User:jack-sr \
     --role ResourceOwner \
     --resource Topic:_jax-schemas-topic \
     --kafka-cluster my-kafka-cluster-ID
   ```

7. Use the command `confluent iam rbac role-binding list` to view the role you just created.

   ```bash
   confluent iam rbac role-binding list \
     --principal User:jack-sr \
     --role ResourceOwner \
     --kafka-cluster my-kafka-cluster-ID
   ```

   For example:

   ```bash
   confluent iam rbac role-binding list \
     --principal User:jack-sr \
     --role ResourceOwner \
     --kafka-cluster my-kafka-cluster-ID

        Role       | ResourceType |             Name             | PatternType
   +---------------+--------------+------------------------------+-------------+
     ResourceOwner | Topic        | _jax-schemas-topic           | LITERAL
     ResourceOwner | Topic        | __schema_encoders            | LITERAL
     ResourceOwner | Group        | schema-registry-cool-cluster | LITERAL
     ResourceOwner | Topic        | _schemas                     | LITERAL
     ResourceOwner | Group        | schema-registry              | LITERAL
   ```

### Client authentication and authorization

Configure license client authentication: When using principal propagation and the following security types, you must configure client authentication for the license topic.
For more information, see the following documentation:

- [SASL OAUTHBEARER (RBAC) client authentication](../../security/authentication/sasl/oauthbearer/configure-clients.md#security-sasl-rbac-oauthbearer-clientconfig)
- [SASL PLAIN client authentication](../../security/authentication/sasl/plain/overview.md#sasl-plain-clients)
- [SASL SCRAM client authentication](../../security/authentication/sasl/scram/overview.md#sasl-scram-clients)
- [mTLS client authentication](../../security/authentication/mutual-tls/overview.md#authentication-ssl-clients)

Configure license client authorization: When using principal propagation and RBAC or ACLs, you must configure client authorization for the license topic.

#### NOTE

The `_confluent-command` internal topic is available as the preferred alternative to the `_confluent-license` topic for components such as Schema Registry, REST Proxy, and Confluent Server (which were previously using `_confluent-license`). Both topics will be supported going forward. Here are some guidelines:

- New deployments (Confluent Platform 6.2.1 and later) will default to using `_confluent-command`, as shown below.
- Existing clusters will continue using the `_confluent-license` topic unless manually changed.
- Newly created clusters on Confluent Platform 6.2.1 and later will default to creating the `_confluent-command` topic, and only existing clusters that already have a `_confluent-license` topic will continue to use it.

- **RBAC authorization**: Run this command to add `ResourceOwner` for the component user on the Confluent license topic resource (default name is `_confluent-command`).

  ```none
  confluent iam rbac role-binding create \
    --role ResourceOwner \
    --principal User: \
    --resource Topic:_confluent-command \
    --kafka-cluster
  ```

- **ACL authorization**: Run this command to configure Kafka authorization, where the bootstrap server, client configuration, and service account ID are specified. This grants create, read, and write on the `_confluent-command` topic.
```none kafka-acls --bootstrap-server --command-config \ --add --allow-principal User: --operation Create --operation Read --operation Write \ --topic _confluent-command ``` #### Schema Registry - Additional RBAC configurations required for [schema-registry.properties](https://github.com/confluentinc/examples/tree/latest/security/rbac/delta_configs/schema-registry.properties.delta) ```none kafkastore.bootstrap.servers=localhost:9092 kafkastore.security.protocol=SASL_PLAINTEXT kafkastore.sasl.mechanism=OAUTHBEARER kafkastore.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler kafkastore.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required username="sr" password="sr1" metadataServerUrls="http://localhost:8090"; # Schema Registry group id, which is the cluster id schema.registry.group.id=schema-registry-demo # These properties install the Schema Registry security plugin, and configure it to use RBAC for authorization and OAuth for authentication resource.extension.class=io.confluent.kafka.schemaregistry.security.SchemaRegistrySecurityResourceExtension confluent.schema.registry.authorizer.class=io.confluent.kafka.schemaregistry.security.authorizer.rbac.RbacAuthorizer rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler # The location of a running metadata service; used to verify that requests are authorized by the users that make them confluent.metadata.bootstrap.server.urls=http://localhost:8090 # Credentials to use when communicating with the MDS; these should usually match the ones used for communicating with Kafka confluent.metadata.basic.auth.user.info=sr:sr1 confluent.metadata.http.auth.credentials.provider=BASIC # The path to public keys that should be used to verify json web tokens during authentication public.key.path=/tmp/tokenPublicKey.pem # This enables anonymous access with a principal of User:ANONYMOUS confluent.schema.registry.anonymous.principal=true authentication.skip.paths=/* ``` - Role bindings: ```bash # Schema Registry Admin confluent iam rbac role-binding create --principal User:$USER_ADMIN_SCHEMA_REGISTRY --role ResourceOwner --resource Topic:_schemas --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_SCHEMA_REGISTRY --role SecurityAdmin --kafka-cluster $KAFKA_CLUSTER_ID --schema-registry-cluster $SCHEMA_REGISTRY_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_SCHEMA_REGISTRY --role ResourceOwner --resource Group:$SCHEMA_REGISTRY_CLUSTER_ID --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_SCHEMA_REGISTRY --role DeveloperRead --resource Topic:$LICENSE_TOPIC --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_ADMIN_SCHEMA_REGISTRY --role DeveloperWrite --resource Topic:$LICENSE_TOPIC --kafka-cluster $KAFKA_CLUSTER_ID # Client connecting to Schema Registry confluent iam rbac role-binding create --principal User:$USER_CLIENT_A --role ResourceOwner --resource Subject:$SUBJECT --kafka-cluster $KAFKA_CLUSTER_ID --schema-registry-cluster $SCHEMA_REGISTRY_CLUSTER_ID ``` ### How the audit log migration tool works The audit log migration tool performs the following tasks: - Sets the output bootstrap servers to the value specified (when specified). Note that output bootstrap servers are empty by default. 
- Combines the input audit log destination topics. For topics that appear in more than one Kafka cluster configuration, the migration tool uses the maximum [retention time](audit-logs-concepts.md#audit-log-retention) specified in the configuration. - Sets the default audit log topic as `confluent-audit-log-events`. If necessary, the migration tool will add this topic to the set of destination topics (in which case, it specifies a retention period of 7776000000 milliseconds). - Combines the set of all [excluded principals](audit-logs-concepts.md#audit-logs-excluded-principals). - Replaces the `/kafka=*/` part of each [Confluent Resource Name (CRN)](audit-logs-concepts.md#confluent-resource-name) pattern using the cluster ID of the contributing Kafka cluster. For example, a route in the configuration from `cluster1` with a CRN like `crn:///kafka=*/topic=accounting-*` will be transformed to `crn:///kafka=cluster1/topic=accounting-*`. For routes that have a CRN that uses something other than `/kafka=*/`, the migration tool will not replace the Kafka cluster ID. For example, if a route specifies `kafka=pkc-123` and the cluster ID is `pkc-abc` then the tool will leave it untouched and return the warning: ```none Mismatched Kafka Cluster Warning: Routes from one Kafka cluster ID on a completely different cluster ID are unexpected, but not necessarily wrong. For example, this message might be returned if you attempt to reuse the same routing configuration on multiple clusters. ``` - For any incoming audit log router configurations that have default topics other than `confluent-audit-log-events`, the script will add extra routes for the following CRN patterns (if they do not already exist): | Topic Route | Event Category Type | |--------------------------------------------------------------------------|-----------------------| | `crn:///kafka=` | AUTHORIZE, MANAGEMENT | | `crn:///kafka=/topic=*` | AUTHORIZE, MANAGEMENT | | `crn:///kafka=/control-center-broker-metrics=*` | AUTHORIZE | | `crn:///kafka=/control-center-alerts=*` | AUTHORIZE | | `crn:///kafka=/delegation-token=*` | AUTHORIZE | | `crn:///kafka=/control-center-broker-metrics=*` | AUTHORIZE | | `crn:///kafka=/control-center-alerts=*` | AUTHORIZE | | `crn:///kafka=/cluster-registry=*` | AUTHORIZE | | `crn:///kafka=/security-metadata=*` | AUTHORIZE | | `crn:///kafka=/all=*` | AUTHORIZE | | `crn:///kafka=/connect=` | AUTHORIZE | | `crn:///kafka=/connect=/connector=*` | AUTHORIZE | | `crn:///kafka=/connect=/secret=*` | AUTHORIZE | | `crn:///kafka=/connect=/all=*` | AUTHORIZE | | `crn:///kafka=/schema-registry=` | AUTHORIZE | | `crn:///kafka=/schema-registry=/subject=*` | AUTHORIZE | | `crn:///kafka=/schema-registry=/all=*` | AUTHORIZE | | `crn:///kafka=/ksql=` | AUTHORIZE | | `crn:///kafka=/ksql=/ksql-cluster=*` | AUTHORIZE | | `crn:///kafka=/ksql=/all=*` | AUTHORIZE | #### NOTE If you do not want the routes listed above added in your newly-migrated audit log configuration, then edit your input `server.properties` files to only use `confluent-audit-log-events` in the `default_topics` before migrating. ### View audit logs on the fly During initial setup or troubleshooting, you can quickly and iteratively examine recent audit log entries using simple command line tools. This can aid in audit log troubleshooting, and also when troubleshooting role bindings for RBAC. First pipe your audit log topics into a local file. Grep works faster on your local file system. 
If you have the `kafka-console-consumer` installed locally and can directly consume from the audit log destination Kafka cluster, your command should look similar to the following: ```none ./kafka-console-consumer --bootstrap-server auditlog.example.com:9092 --consumer.config ~/auditlog-consumer.properties --whitelist '^confluent-audit-log-events.*' > /tmp/streaming.audit.log ``` If you don’t have direct access and must instead connect using a “jump box” (a machine or server on a network that you use to access and manage devices in a separate security zone), use a command similar to the following: ```none ssh -tt -i ~/.ssh/theuser.key theuser@jumpbox './kafka-console-consumer --bootstrap-server auditlog.example.com:9092 --consumer.config ~/auditlog-consumer.properties --whitelist '"'"'^confluent-audit-log-events.*'"'"' ' > /tmp/streaming.audit.log ``` Regardless of which method you use, at this point you can open another terminal and locally run `tail -f /tmp/streaming.audit.log` to view audit log messages on the fly. After you’ve gotten the audit logs, you can use [grep](https://man7.org/linux/man-pages/man1/grep.1.html) and [jq](https://stedolan.github.io/jq/) (or another utility) to examine them. For example: ```none tail -f /tmp/streaming.audit.log | grep 'connect' | jq -c '[.time, .data.authenticationInfo.principal, .data.authorizationInfo.operation, .data.resourceName]' ``` ### Configure Replicator configuration connection 1. Define the `kafka_connect_replicator` group and `hosts` to deploy to. For example: ```yaml kafka_connect_replicator: hosts: ip-172-31-34-246.us-east-2.compute.internal: ``` 2. Define the listener for Replicator configuration cluster. The following is an example of a listener with Kerberos authentication and TLS enabled: ```yaml kafka_connect_replicator_listener: ssl_enabled: true sasl_protocol: kerberos ``` 3. Define the basic configuration for Replicator connection: ```yaml kafka_connect_replicator_white_list: kafka_connect_replicator_bootstrap_servers: ``` 4. Define security configuration for the Replicator connection: ```yaml kafka_connect_replicator_kerberos_principal: kafka_connect_replicator_kerberos_keytab_path: kafka_connect_replicator_ssl_ca_cert_path: kafka_connect_replicator_ssl_cert_path: kafka_connect_replicator_ssl_key_path: kafka_connect_replicator_ssl_key_password: ``` 5. For RBAC-enabled deployment, define the additional security configuration. Specify either the Kafka cluster id (`kafka_connect_replicator_kafka_cluster_id`) or the cluster name (`kafka_connect_replicator_kafka_cluster_name`). ```yaml kafka_connect_replicator_rbac_enabled: true kafka_connect_replicator_erp_tls_enabled: kafka_connect_replicator_erp_host: kafka_connect_replicator_erp_admin_user: kafka_connect_replicator_erp_admin_password: kafka_connect_replicator_kafka_cluster_id: kafka_connect_replicator_kafka_cluster_name: kafka_connect_replicator_erp_pem_file: ``` 6. Set the `CLASSPATH` to the replicator installation directory in `kafka_connect_service_environment_overrides`: ```yaml kafka_connect_service_environment_overrides: CLASSPATH: /* ``` For more information about setting required Confluent Platform environment variables using Ansible, see [Set environment variables](ansible-configure.md#ansible-override-env-varabiles). ### Configure producer connection 1. Define the listener configuration for the producer connection to the destination cluster. 
The following is an example with TLS and Kerberos for authentication enabled: ```yaml kafka_connect_replicator_producer_listener: ssl_enabled: true sasl_protocol: kerberos ``` 2. Define the basic producer configuration: ```yaml kafka_connect_replicator_producer_bootstrap_servers: ``` 3. Define the security configuration for the producer connection: ```yaml kafka_connect_replicator_producer_kerberos_principal: kafka_connect_replicator_producer_kerberos_keytab_path: kafka_connect_replicator_producer_ssl_ca_cert_path: kafka_connect_replicator_producer_ssl_cert_path: kafka_connect_replicator_producer_ssl_key_path: kafka_connect_replicator_producer_ssl_key_password: ``` 4. Define custom properties for each client connection: ```yaml kafka_connect_replicator_producer_custom_properties: ``` 5. For RBAC-enabled deployment, define the additional producer custom properties. `kafka_connect_replicator_producer` configs default to match `kafka_connect_replicator` configs. The following are required only if you are producing to a different cluster than where you are storing your configs. Specify either the Kafka cluster id (`kafka_connect_replicator_producer_kafka_cluster_id`) or the cluster name (`kafka_connect_replicator_producer_kafka_cluster_name`). ```yaml kafka_connect_replicator_producer_rbac_enabled: true kafka_connect_replicator_producer_erp_tls_enabled: kafka_connect_replicator_producer_erp_host: kafka_connect_replicator_producer_erp_admin_user: kafka_connect_replicator_producer_erp_admin_password: kafka_connect_replicator_producer_kafka_cluster_id: kafka_connect_replicator_producer_kafka_cluster_name: kafka_connect_replicator_producer_erp_pem_file: ``` ### Run Replicator Docker Container with Kubernetes 1. Delete existing secret (if it exists). ```bash kubectl delete secret replicator-secret-props ``` 2. Regenerate configs, if changed. (See the [Quick Start](replicator-cloud-quickstart.md#cloud-replicator-quickstart).) 3. Upload the new secret. ```bash kubectl create secret generic replicator-secret-props --from-file=/tmp/replicator/ ``` 4. Reload pods. ```bash kubectl apply -f container/replicator-deployment.yaml ``` Here is an example `replicator-deployment.yaml`. ```yaml apiVersion: extensions/v1beta1 kind: Deployment metadata: name: repl-exec-connect-cluster spec: replicas: 1 template: metadata: labels: app: replicator-app spec: containers: - name: confluent-replicator image: confluentinc/cp-enterprise-replicator-executable env: - name: CLUSTER_ID value: "replicator-k8s" - name: CLUSTER_THREADS value: "1" - name: CONNECT_GROUP_ID value: "containerized-repl" # Note: This is to avoid _overlay errors_ . You could use /etc/replicator/ here instead. - name: REPLICATION_CONFIG value: "/etc/replicator-config/replication.properties" - name: PRODUCER_CONFIG value: "/etc/replicator-config/producer.properties" - name: CONSUMER_CONFIG value: "/etc/replicator-config/consumer.properties" volumeMounts: - name: replicator-properties mountPath: /etc/replicator-config/ volumes: - name: replicator-properties secret: secretName: "replicator-secret-props" defaultMode: 0666 ``` 5. Verify status. ```bash kubectl get pods kubectl logs -f ``` ### Describe a custom connector Use the following command to get connector details. 
Command syntax: ```bash confluent connect cluster describe [flags] ``` For example: ```bash confluent connect cluster describe clcc-wzxp69 --cluster lkc-abcd123 ``` Example output: ```bash Connector Details +--------+---------------------+ | ID | clcc-wzxp69 | | Name | my-custom-connector | | Status | RUNNING | | Type | source | +--------+---------------------+ Task Level Details Task ID | State ----------+---------- 0 | RUNNING Configuration Details Config | Value ---------------------------------+---------------------------------------------------------- cloud.environment | prod cloud.provider | aws confluent.custom.plugin.id | custom-plugin-epp0ye connector.class | io.confluent.kafka.connect.datagen.DatagenConnector iterations | 10000000 kafka.api.key | **************** kafka.api.secret | **************** kafka.auth.mode | KAFKA_API_KEY kafka.endpoint | SASL_SSL://pkc-abcd5.us-west-2.aws.confluent.cloud:9092 kafka.region | us-west-2 kafka.topic | pageviews key.converter | org.apache.kafka.connect.storage.StringConverter max.interval | 100 name | custom-datagen_0 quickstart | pageviews tasks.max | 1 value.converter | org.apache.kafka.connect.json.JsonConverter value.converter.schemas.enable | false ``` #### Step 3: Create the connector configuration file Create a JSON file that contains the connector configuration properties. The following example shows required and optional connector properties: ```none { "connector.class": "AlloyDbSink", "name": "AlloyDbSinkConnector_0", "input.data.format": "AVRO", "kafka.auth.mode": "KAFKA_API_KEY", "kafka.api.key": "****************", "kafka.api.secret": "****************************************************************", "connection.host": "34.27.121.137", "connection.port": "5432", "connection.user": "postgres", "connection.password": "**************", "db.name": "postgres", "topics": "postgresql_ratings", "insert.mode": "UPSERT", "db.timezone": "UTC", "auto.create": "true", "auto.evolve": "true", "pk.mode": "record_value", "pk.fields": "user_id", "tasks.max": "1" } ``` Note the following property definitions. See the [AlloyDB Sink configuration properties](#cc-alloydb-sink-config-properties) for additional property values and definitions. * `"connector.class"`: Identifies the connector plugin name. * `"name"`: Sets a name for your new connector. * `"kafka.auth.mode"`: Identifies the connector authentication mode you want to use. There are two options: `SERVICE_ACCOUNT` or `KAFKA_API_KEY` (the default). To use an API key and secret, specify the configuration properties `kafka.api.key` and `kafka.api.secret`, as shown in the example configuration (above). To use a [service account](service-account.md#s3-cloud-service-account), specify the **Resource ID** in the property `kafka.service.account.id=`. To list the available service account resource IDs, use the following command: ```bash confluent iam service-account list ``` For example: ```bash confluent iam service-account list Id | Resource ID | Name | Description +---------+-------------+-------------------+------------------- 123456 | sa-l1r23m | sa-1 | Service account 1 789101 | sa-l4d56p | sa-2 | Service account 2 ``` * `"connection.host"`: The hostname or the IP address of the VM running the AlloyDB Auth Proxy. * `"connection.port"`: The AlloyDB database connection port. Defaults to `5432`. * `"connection.user"`: The AlloyDB database user name. * `"connection.password"`: The AlloyDB database password. * `"db.name"`: The AlloyDB database name. 
* `"input.data.format"`: Sets the input Kafka record value format (data coming from the Kafka topic). Valid entries are **AVRO**, **JSON_SR** (JSON Schema), or **PROTOBUF**. You must have Confluent Cloud Schema Registry configured if using a schema-based message format.
* `"input.key.format"`: Sets the input record key format (data coming from the Kafka topic). Valid entries are **AVRO**, **JSON_SR** (JSON Schema), **PROTOBUF**, or **STRING**. You must have Confluent Cloud Schema Registry configured if using a schema-based message format.
* `"delete.on.null"`: Whether to treat null record values as deletes. Requires `pk.mode` to be `record_key`. Defaults to `false`.
* `"topics"`: Identifies the topic name or a comma-separated list of topic names.
* `"insert.mode"`: Enter one of the following modes:
  - `INSERT`: Use the standard `INSERT` row function. An error occurs if the row already exists in the table.
  - `UPSERT`: This mode is similar to `INSERT`. However, if the row already exists, the `UPSERT` function overwrites column values with the new values provided.
* `"db.timezone"`: Name of the time zone the connector uses when inserting time-based values. Defaults to UTC.
* `"auto.create"` (tables) and `"auto.evolve"` (columns): (Optional) Sets whether to automatically create tables or columns if they are missing relative to the input record schema. If not entered in the configuration, both default to `false`. When `auto.create` is set to `true`, the connector creates a table name using `${topic}` (that is, the Kafka topic name). For more information, see [Table names and Kafka topic names](#cc-alloydb-sink-truncation-behavior) and the [AlloyDB Sink configuration properties](#cc-alloydb-sink-config-properties).
* `"pk.mode"`: Supported modes are listed below:
  - `kafka`: Kafka coordinates are used as the primary key. Must be used with the `"pk.fields"` property.
  - `none`: No primary keys used.
  - `record_key`: Fields from the record key are used. May be a primitive or a struct.
  - `record_value`: Fields from the Kafka record value are used. Must be a struct type.
* `"pk.fields"`: A list of comma-separated primary key field names. The runtime interpretation of this property depends on the `pk.mode` selected. Options are listed below:
  - `kafka`: Must be three values representing the Kafka coordinates. If left empty, the coordinates default to `__connect_topic,__connect_partition,__connect_offset`.
  - `none`: PK fields not used.
  - `record_key`: If left empty, all fields from the key struct are used. Otherwise, used to extract the desired fields. A single field name must be configured for a primitive key.
  - `record_value`: Used to extract fields from the record value. If left empty, all fields from the value struct are used.
* `"tasks.max"`: Maximum number of tasks the connector can run. See Confluent Cloud [connector limitations](limits.md#cc-alloydb-sink-limits) for additional task information.

**Single Message Transforms**: See the [Single Message Transforms (SMT)](single-message-transforms.md#cc-single-message-transforms) documentation for details about adding SMTs using the CLI.

See [Configuration Properties](#cc-alloydb-sink-config-properties) for all property values and definitions.

#### Step 3: Create the connector configuration file

Create a JSON file that contains the connector configuration properties. The following shows an example configuration.
For two additional examples, see [Configuration JSON Examples](#cc-amazon-lambda-sink-config-examples). ```none { "connector.class": "LambdaSink", "name": "LambdaSinkConnector_0", "topics": "topic_aws_lambda_1", "input.data.format": "JSON", "kafka.auth.mode": "KAFKA_API_KEY", "kafka.api.key": "****************", "kafka.api.secret": "*************************************************", "aws.access.key.id": "****************", "aws.secret.access.key": "********************************************", "aws.lambda.configuration.mode": "single", "aws.lambda.function.name": "lambda-test", "aws.lambda.invocation.type": "sync", "behavior.on.error": "fail", "tasks.max": "1" } ``` Note the following required property definitions: * `"connector.class"`: Identifies the connector plugin name. * `"name"`: Sets a name for your new connector. * `"topics"`: Identifies the topic name or a comma-separated list of topic names. * `"kafka.auth.mode"`: Identifies the connector authentication mode you want to use. There are two options: `SERVICE_ACCOUNT` or `KAFKA_API_KEY` (the default). To use an API key and secret, specify the configuration properties `kafka.api.key` and `kafka.api.secret`, as shown in the example configuration (above). To use a [service account](service-account.md#s3-cloud-service-account), specify the **Resource ID** in the property `kafka.service.account.id=`. To list the available service account resource IDs, use the following command: ```bash confluent iam service-account list ``` For example: ```bash confluent iam service-account list Id | Resource ID | Name | Description +---------+-------------+-------------------+------------------- 123456 | sa-l1r23m | sa-1 | Service account 1 789101 | sa-l4d56p | sa-2 | Service account 2 ``` * `"input.data.format"`: Sets the input Kafka record value format (data coming from the Kafka topic). Valid entries are **AVRO**, **JSON_SR** (JSON Schema), **PROTOBUF**, **JSON** (Schemaless), or **BYTES**. You must have Confluent Cloud Schema Registry configured if using a schema-based message format. #### NOTE If no schema is defined, values are encoded as plain strings. For example, `"name": "Kimberley Human"` is encoded as `name=Kimberley Human`. * `"aws.access.key.id"` and `"aws.secret.access.key"`: Enter the AWS Access Key ID and Secret. For information about how to set these up, see [Access Keys](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys). * `"aws.lambda.configuration.mode"`: The mode in which to run the connector. Options are `multiple` to invoke multiple AWS Lambda functions or `single` (the default) to invoke a single function. One connector instance can support a maximum of 10 functions. * `"aws.lambda.function.name"`: The AWS Lambda function to invoke for `single` configuration mode. * `"aws.lambda.topic2function.map"`: A map of Kafka topics to AWS Lambda functions for `multiple` configuration mode. Enter the map as comma- separated tuples. For example: `;,;,...`. You can map a maximum of three functions to a single topic. #### NOTE The following steps show basic ACL entries for source connector service accounts. Make sure to review [Debezium [Legacy] Source Connectors](#cloud-service-account-debezium-acls) and [JDBC-based Source Connectors and the MongoDB Atlas Source Connector](#cloud-service-account-jdbc-mongo-acls) for additional ACL entries that may be required for certain connectors. 1. 
Create a service account named `myserviceaccount`:

   ```none
   confluent iam service-account create myserviceaccount --description "test service account"
   ```

2. Find the service account ID for `myserviceaccount`:

   ```none
   confluent iam service-account list
   ```

3. Set a DESCRIBE ACL to the cluster.

   ```none
   confluent kafka acl create --allow --service-account "" --operations describe --cluster-scope
   ```

4. Set a WRITE ACL to `passengers`:

   ```none
   confluent kafka acl create --allow --service-account "" --operations write --topic "passengers"
   ```

5. Create a Kafka API key and secret for ``:

   ```none
   confluent api-key create --resource "lkc-abcd123" --service-account ""
   ```

6. Save the API key and secret. The connector configuration must include either an API key and secret or a service account ID.

For additional service account information, see [Service Accounts on Confluent Cloud](../security/authenticate/workload-identities/service-accounts/overview.md#service-accounts).

## Connect a Java application to Confluent Cloud

To configure Java clients for Kafka to connect to a Kafka cluster in Confluent Cloud:

1. Add the Kafka client dependency to your project. For Maven:

   ```xml
   <dependency>
     <groupId>org.apache.kafka</groupId>
     <artifactId>kafka-clients</artifactId>
     <version>${kafka.clients.version}</version>
   </dependency>
   ```

   For Gradle:

   ```groovy
   implementation "org.apache.kafka:kafka-clients:${kafkaClientsVersion}"
   ```

   Use a current, supported version per your build’s BOM or dependency management policy.

2. Configure your Java application with the connection properties. You can obtain these from the Confluent Cloud Console by selecting your cluster and clicking **Clients**.

3. Use the configuration in your producer or consumer code:

   ```java
   Properties props = new Properties();
   props.put("bootstrap.servers", "your-bootstrap-servers");
   props.put("security.protocol", "SASL_SSL");
   props.put("sasl.mechanism", "PLAIN");
   props.put("sasl.jaas.config",
       "org.apache.kafka.common.security.plain.PlainLoginModule required username='your-api-key' password='your-api-secret';");

   // Create producer or consumer
   KafkaProducer<String, String> producer = new KafkaProducer<>(props);
   ```

4. See the [Java client examples](https://github.com/confluentinc/examples/tree/latest/clients/cloud/java) for complete working examples.

5. Integrate with your environment.

## Connect a C/C++ application to Confluent Cloud

To configure a C/C++ application using the [librdkafka client](https://github.com/edenhill/librdkafka) to connect to a Kafka cluster in Confluent Cloud:

1. **Prerequisite**: Ensure you have installed the `librdkafka` library on your system or included it in your project’s build process.

2. In your application code, create a configuration object and set the properties for connecting to Confluent Cloud.

   ```c
   #include <librdkafka/rdkafka.h>
   // ...

   rd_kafka_conf_t *conf;
   char errstr[512];

   conf = rd_kafka_conf_new();

   // Confluent Cloud bootstrap servers
   if (rd_kafka_conf_set(conf, "bootstrap.servers", "", errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK) {
       fprintf(stderr, "%s\n", errstr);
       // Handle error
   }

   // Security configuration
   rd_kafka_conf_set(conf, "security.protocol", "SASL_SSL", errstr, sizeof(errstr));
   rd_kafka_conf_set(conf, "sasl.mechanisms", "PLAIN", errstr, sizeof(errstr));
   rd_kafka_conf_set(conf, "sasl.username", "", errstr, sizeof(errstr));
   rd_kafka_conf_set(conf, "sasl.password", "", errstr, sizeof(errstr));

   // See the Client Prerequisites section for details on ssl.ca.location.
   // For librdkafka v2.11 or later, it is typically not required.

   // ... create producer or consumer instance with this conf ...
   ```

3.
For complete, working projects, refer to the official [librdkafka examples directory](https://github.com/edenhill/librdkafka/tree/master/examples). ### Implement a permanent and in-line UDF ```java package io.confluent.flink.examples.table; import io.confluent.flink.plugin.ConfluentSettings; import org.apache.flink.table.api.EnvironmentSettings; import org.apache.flink.table.api.TableEnvironment; import org.apache.flink.table.functions.ScalarFunction; import org.apache.flink.table.functions.TableFunction; import java.util.List; import static org.apache.flink.table.api.Expressions.$; import static org.apache.flink.table.api.Expressions.array; import static org.apache.flink.table.api.Expressions.call; import static org.apache.flink.table.api.Expressions.row; /** * A table program example showing how to use User-Defined Functions * (UDFs) in the Flink Table API. * *
The Flink Table API simplifies the process of creating and managing UDFs.
 *
 * <ul>
 *   <li>It helps creating a JAR file containing all required dependencies for a given UDF.
 *   <li>Uploads the JAR to Confluent artifact API.
 *   <li>Creates SQL functions for given artifacts.
 * </ul>
 *
*/ public class Example_09_Functions { // Fill this with an environment you have write access to static final String TARGET_CATALOG = ""; // Fill this with a Kafka cluster you have write access to static final String TARGET_DATABASE = ""; // All logic is defined in a main() method. It can run both in an IDE or CI/CD system. public static void main(String[] args) { // Setup connection properties to Confluent Cloud EnvironmentSettings settings = ConfluentSettings.fromResource("/cloud.properties"); // Initialize the session context to get started TableEnvironment env = TableEnvironment.create(settings); // Set default catalog and database env.useCatalog(TARGET_CATALOG); env.useDatabase(TARGET_DATABASE); System.out.println("Registering a scalar function..."); // The Table API underneath creates a temporary JAR file containing all transitive classes // required to run the function, uploads it to Confluent Cloud, and registers the function // using the previously uploaded artifact. env.createFunction("CustomTax", CustomTax.class, true); // As of now, Scalar and Table functions are supported. System.out.println("Registering a table function..."); env.createFunction("Explode", Explode.class, true); // Once registered, the functions can be used in Table API and SQL queries. System.out.println("Executing registered UDFs..."); env.fromValues(row("Apple", "USA", 2), row("Apple", "EU", 3)) .select( $("f0").as("product"), $("f1").as("location"), $("f2").times(call("CustomTax", $("f1"))).as("tax")) .execute() .print(); env.fromValues( row(1L, "Ann", array("Apples", "Bananas")), row(2L, "Peter", array("Apples", "Pears"))) .joinLateral(call("Explode", $("f2")).as("fruit")) .select($("f0").as("id"), $("f1").as("name"), $("fruit")) .execute() .print(); // Instead of registering functions permanently, you can embed UDFs directly into queries // without registering them first. This will upload all the functions of the query as a // single artifact to Confluent Cloud. Moreover, the functions lifecycle will be bound to // the lifecycle of the query. System.out.println("Executing inline UDFs..."); env.fromValues(row("Apple", "USA", 2), row("Apple", "EU", 3)) .select( $("f0").as("product"), $("f1").as("location"), $("f2").times(call(CustomTax.class, $("f1"))).as("tax")) .execute() .print(); env.fromValues( row(1L, "Ann", array("Apples", "Bananas")), row(2L, "Peter", array("Apples", "Pears"))) .joinLateral(call(Explode.class, $("f2")).as("fruit")) .select($("f0").as("id"), $("f1").as("name"), $("fruit")) .execute() .print(); } /** A scalar function that calculates a custom tax based on the provided location. */ public static class CustomTax extends ScalarFunction { public int eval(String location) { if (location.equals("USA")) { return 10; } if (location.equals("EU")) { return 5; } return 0; } } /** A table function that explodes an array of string into multiple rows. */ public static class Explode extends TableFunction { public void eval(List arr) { for (String i : arr) { collect(i); } } } } ``` # Carry-over Offsets in Confluent Cloud for Apache Flink Confluent Cloud for Apache Flink® supports carry-over offsets, which means that you can use the topic offsets from one statement to start a new statement. Carry-over offsets provide a streamlined way to update Flink statements without data loss. This feature eliminates the manual complexity of copying offsets between statements and reduces the need to monitor statement status when deploying CI/CD pipelines. Automatic orchestration handles the upgrade process. 
The system automatically waits for the old statement to stop before starting the new one, providing a seamless transition of processing between statements. Carry-over offsets are available only when replacing an existing statement.

This feature enables you to [evolve statements](../concepts/schema-statement-evolution.md#flink-sql-schema-and-statement-evolution) with exactly-once semantics across the update when the statement is “stateless”, as determined by the system. At a high level, “stateless” applies to statements that can process each event independently and in any order. For other scenarios, such as aggregates, lag, windows, pattern matching, or use of an upsert sink, this feature can’t be used, because the update may cause inconsistent results.

To use carry-over offsets, add the `sql.tables.initial-offset-from` property to the statement configuration when you create your new statement.

In the Confluent Cloud Console and the Flink SQL shell, you can set the property by using the [SET](../reference/statements/set.md#flink-sql-set-statement) statement, for example:

```sql
SET 'sql.tables.initial-offset-from' = ''
```

The `` is the name of the statement that you want to use as the reference for the carry-over offsets.

If you’re using the [Statements API](/cloud/current/api.html#tag/Statements-(sqlv1)/operation/createSqlv1Statement) or the Confluent Terraform provider, you can set the property by using the [properties field](https://registry.terraform.io/providers/confluentinc/confluent/latest/docs/resources/confluent_flink_statement), for example:

```json
{
  "properties": {
    "sql.tables.initial-offset-from": ""
  }
}
```

## num.partitions

You can change the number of partitions for an existing topic (`num.partitions`) for all cluster types on a per-topic basis. You can only increase (not decrease) the `num.partitions` value after you create a topic, and you must make the increase using the `kafka-topics` script or the API. Limits vary based on Kafka cluster type. For more information, see [Kafka Cluster Types in Confluent Cloud](../clusters/cluster-types.md#cloud-cluster-types).

- Default: 6
- Editable: Yes
- Kafka REST API and Terraform Provider Support: Yes

To change the number of partitions, you can use the `kafka-topics` script that is part of the [Kafka command line tools](https://www.confluent.io/blog/using-apache-kafka-command-line-tools-confluent-cloud/) (installed with Confluent Platform), with the following command.

```none
bin/kafka-topics --bootstrap-server --command-config --alter --topic --partitions ``
```

Alternatively, you can use the [Kafka REST APIs](https://docs.confluent.io/cloud/current/api.html#tag/Topic-(v3)/operation/updatePartitionCountKafkaTopic) to change the number of partitions for an existing topic (`num.partitions`). You need the REST endpoint and the cluster ID for your cluster to make Kafka REST calls. To find this information with Cloud Console, see [Find the REST endpoint address and cluster ID](../clusters/broker-config.md#cluster-settings-console). For more on how to use the REST APIs, see [Kafka REST API Quick Start for Confluent Cloud](../kafka-rest/krest-qs.md#cloud-rest-api-quickstart). You can also use the Terraform Provider for Confluent to edit this topic setting.
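For example, a hypothetical `kafka-topics --alter` invocation that raises a Confluent Cloud topic to 12 partitions might look like the following; the bootstrap endpoint, the `client.properties` file containing your API credentials, and the topic name are placeholders, not values from this document:

```bash
# Hypothetical values; substitute your own bootstrap endpoint, client properties file, and topic name.
kafka-topics --bootstrap-server pkc-12345.us-west-2.aws.confluent.cloud:9092 \
  --command-config client.properties \
  --alter --topic orders --partitions 12
```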
For more details, sign in to the [Confluent Support Portal](https://support.confluent.io/) and search for “How to increase the partition count for a Confluent Cloud hosted topic.”

## Flags

```none
--bootstrap string                    Kafka cluster endpoint (Confluent Cloud); or comma-separated list of broker hosts (Confluent Platform), each formatted as "host" or "host:port".
--key-schema string                   The ID or filepath of the message key schema.
--schema string                       The ID or filepath of the message value schema.
--key-format string                   Format of message key as "string", "avro", "double", "integer", "jsonschema", or "protobuf". Note that schema references are not supported for Avro. (default "string")
--value-format string                 Format message value as "string", "avro", "double", "integer", "jsonschema", or "protobuf". Note that schema references are not supported for Avro. (default "string")
--references string                   The path to the message value schema references file.
--parse-key                           Parse key from the message.
--delimiter string                    The delimiter separating each key and value. (default ":")
--config strings                      A comma-separated list of configuration overrides ("key=value") for the producer client. For a full list, see https://docs.confluent.io/platform/current/clients/librdkafka/html/md_CONFIGURATION.html
--config-file string                  The path to the configuration file for the producer client, in JSON or Avro format.
--schema-registry-endpoint string     Endpoint for Schema Registry cluster.
--headers strings                     A comma-separated list of headers formatted as "key:value".
--key-references string               The path to the message key schema references file.
--api-key string                      API key.
--api-secret string                   API secret.
--schema-registry-api-key string      Schema registry API key.
--schema-registry-api-secret string   Schema registry API secret.
--cluster string                      Kafka cluster ID.
--context string                      CLI context name.
--environment string                  Environment ID.
--certificate-authority-path string   File or directory path to one or more Certificate Authority certificates for verifying the broker's key with SSL.
--username string                     SASL_SSL username for use with PLAIN mechanism.
--password string                     SASL_SSL password for use with PLAIN mechanism.
--cert-location string                Path to client's public key (PEM) used for SSL authentication.
--key-location string                 Path to client's private key (PEM) used for SSL authentication.
--key-password string                 Private key passphrase for SSL authentication.
--protocol string                     Specify the broker communication protocol as "PLAINTEXT", "SASL_SSL", or "SSL". (default "SSL")
--sasl-mechanism string               SASL_SSL mechanism used for authentication. (default "PLAIN")
--client-cert-path string             File or directory path to client certificate to authenticate the Schema Registry client.
--client-key-path string              File or directory path to client key to authenticate the Schema Registry client.
```
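These flags belong to a producer-oriented CLI command; assuming they correspond to `confluent kafka topic produce` (an assumption based on the flag set above, with a hypothetical topic name `orders`), a minimal invocation that reads keyed records from stdin might look like this:

```bash
# Assumes the flags above are for "confluent kafka topic produce"; "orders" is a hypothetical topic name.
confluent kafka topic produce orders --parse-key --delimiter ":"
# Then type records such as:
#   key1:{"item":"book","qty":2}
```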
## High-availability setup

Use these steps to configure Control Center for an Active/Active high-availability deployment.

Considerations:

- You must manually duplicate alerts in one of your Control Center instances.
- For a Confluent Ansible example of a Control Center Active/Active high-availability setup, see: [GitHub repo](https://github.com/confluentinc/cp-ansible/tree/d31730fa1b14db2833c40ad7308e89de9f96b734/docs/sample_inventories/c3-next-gen-active-active-setup)
- For a Confluent for Kubernetes (CFK) example of a Control Center Active/Active high-availability setup, see: [GitHub repo](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/control-center-next-gen/plain-active-active-setup)

To configure Control Center Active/Active high availability, use the following steps:

1. Configure two instances of Control Center for your Kafka cluster.

2. For every Kafka broker and KRaft controller, you must add and configure two HttpExporters. Consider the following example HttpExporter configurations:

```none
confluent.telemetry.exporter._c3-1.client.base.url=http://{C3-1-internal-dns-hostname}:9090/api/v1/otlp
confluent.telemetry.exporter._c3-2.client.base.url=http://{C3-2-internal-dns-hostname}:9090/api/v1/otlp
```

- Replace `{C3-1-internal-dns-hostname}` and `{C3-2-internal-dns-hostname}` with the base URLs for the corresponding Prometheus instances in your cluster.

3. For every Kafka broker and KRaft controller, add the following configurations:

```none
#common configs
confluent.telemetry.metrics.collector.interval.ms=60000
confluent.telemetry.remoteconfig._confluent.enabled=false
confluent.consumer.lag.emitter.enabled=true
metric.reporters=io.confluent.telemetry.reporter.TelemetryReporter

# instance 1 configs
confluent.telemetry.exporter._c3.type=http
confluent.telemetry.exporter._c3.enabled=true
confluent.telemetry.exporter._c3.client.base.url=http://{C3-1-internal-dns-hostname}:9090/api/v1/otlp
confluent.telemetry.exporter._c3.client.compression=gzip
confluent.telemetry.exporter._c3.api.key=dummy
confluent.telemetry.exporter._c3.api.secret=dummy
confluent.telemetry.exporter._c3.buffer.pending.batches.max=80
confluent.telemetry.exporter._c3.buffer.batch.items.max=4000
confluent.telemetry.exporter._c3.buffer.inflight.submissions.max=10
confluent.telemetry.exporter._c3.metrics.include=io.confluent.kafka.server.request.(?!.*delta).*|io.confluent.kafka.server.server.broker.state|io.confluent.kafka.server.replica.manager.leader.count|io.confluent.kafka.server.request.queue.size|io.confluent.kafka.server.broker.topic.failed.produce.requests.rate.1.min|io.confluent.kafka.server.tier.archiver.total.lag|io.confluent.kafka.server.request.total.time.ms.p99|io.confluent.kafka.server.broker.topic.failed.fetch.requests.rate.1.min|io.confluent.kafka.server.broker.topic.total.fetch.requests.rate.1.min|io.confluent.kafka.server.partition.caught.up.replicas.count|io.confluent.kafka.server.partition.observer.replicas.count|io.confluent.kafka.server.tier.tasks.num.partitions.in.error|io.confluent.kafka.server.broker.topic.bytes.out.rate.1.min|io.confluent.kafka.server.request.total.time.ms.p95|io.confluent.kafka.server.controller.active.controller.count|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.total|io.confluent.kafka.server.request.total.time.ms.p999|io.confluent.kafka.server.controller.active.broker.count|io.confluent.kafka.server.request.handler.pool.request.handler.avg.idle.percent.rate.1.min|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.rate.1.min|io.confluent.kafka.server.controller.unclean.leader.elections.rate.1.min|io.confluent.kafka.server.replica.manager.partition.count|io.confluent.kafka.server.controller.unclean.leader.elections.total|io.confluent.kafka.server.partition.replicas.count|io.confluent.kafka.server.broker.topic.total.produce.requests.rate.1.min|io.confluent.kafka.server.controller.offline.partitions.count|io.confluent.kafka.server.socket.server.network.processor.avg.idle.percent|io.confluent.kafka.server.partition.under.replicated|io.confluent.kafka.server.log.log.start.offset|io.confluent.kafka.server.log.tier.size|io.confluent.kafka.server.log.size|io.confluent.kafka.server.tier.fetcher.bytes.fetched.total|io.confluent.kafka.server.request.total.time.ms.p50|io.confluent.kafka.server.tenant.consumer.lag.offsets|io.confluent.kafka.server.session.expire.listener.zookeeper.expires.rate.1.min|io.confluent.kafka.server.log.log.end.offset|io.confluent.kafka.server.broker.topic.bytes.in.rate.1.min|io.confluent.kafka.server.partition.under.min.isr|io.confluent.kafka.server.partition.in.sync.replicas.count|io.confluent.telemetry.http.exporter.batches.dropped|io.confluent.telemetry.http.exporter.items.total|io.confluent.telemetry.http.exporter.items.succeeded|io.confluent.telemetry.http.exporter.send.time.total.millis|io.confluent.kafka.server.controller.leader.election.rate.(?!.*delta).*|io.confluent.telemetry.http.exporter.batches.failed # instance 2 configs confluent.telemetry.exporter._c3-2.type=http confluent.telemetry.exporter._c3-2.enabled=true confluent.telemetry.exporter._c3-2.client.compression=gzip confluent.telemetry.exporter._c3-2.api.key=dummy confluent.telemetry.exporter._c3-2.api.secret=dummy confluent.telemetry.exporter._c3-2.buffer.pending.batches.max=80 confluent.telemetry.exporter._c3-2.buffer.batch.items.max=4000 confluent.telemetry.exporter._c3-2.buffer.inflight.submissions.max=10 confluent.telemetry.exporter._c3-2.client.base.url=http://{C3-2-internal-dns-hostname}:9090/api/v1/otlp 
confluent.telemetry.exporter._c3-2.metrics.include=io.confluent.kafka.server.request.(?!.*delta).*|io.confluent.kafka.server.server.broker.state|io.confluent.kafka.server.replica.manager.leader.count|io.confluent.kafka.server.request.queue.size|io.confluent.kafka.server.broker.topic.failed.produce.requests.rate.1.min|io.confluent.kafka.server.tier.archiver.total.lag|io.confluent.kafka.server.request.total.time.ms.p99|io.confluent.kafka.server.broker.topic.failed.fetch.requests.rate.1.min|io.confluent.kafka.server.broker.topic.total.fetch.requests.rate.1.min|io.confluent.kafka.server.partition.caught.up.replicas.count|io.confluent.kafka.server.partition.observer.replicas.count|io.confluent.kafka.server.tier.tasks.num.partitions.in.error|io.confluent.kafka.server.broker.topic.bytes.out.rate.1.min|io.confluent.kafka.server.request.total.time.ms.p95|io.confluent.kafka.server.controller.active.controller.count|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.total|io.confluent.kafka.server.request.total.time.ms.p999|io.confluent.kafka.server.controller.active.broker.count|io.confluent.kafka.server.request.handler.pool.request.handler.avg.idle.percent.rate.1.min|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.rate.1.min|io.confluent.kafka.server.controller.unclean.leader.elections.rate.1.min|io.confluent.kafka.server.replica.manager.partition.count|io.confluent.kafka.server.controller.unclean.leader.elections.total|io.confluent.kafka.server.partition.replicas.count|io.confluent.kafka.server.broker.topic.total.produce.requests.rate.1.min|io.confluent.kafka.server.controller.offline.partitions.count|io.confluent.kafka.server.socket.server.network.processor.avg.idle.percent|io.confluent.kafka.server.partition.under.replicated|io.confluent.kafka.server.log.log.start.offset|io.confluent.kafka.server.log.tier.size|io.confluent.kafka.server.log.size|io.confluent.kafka.server.tier.fetcher.bytes.fetched.total|io.confluent.kafka.server.request.total.time.ms.p50|io.confluent.kafka.server.tenant.consumer.lag.offsets|io.confluent.kafka.server.session.expire.listener.zookeeper.expires.rate.1.min|io.confluent.kafka.server.log.log.end.offset|io.confluent.kafka.server.broker.topic.bytes.in.rate.1.min|io.confluent.kafka.server.partition.under.min.isr|io.confluent.kafka.server.partition.in.sync.replicas.count|io.confluent.telemetry.http.exporter.batches.dropped|io.confluent.telemetry.http.exporter.items.total|io.confluent.telemetry.http.exporter.items.succeeded|io.confluent.telemetry.http.exporter.send.time.total.millis|io.confluent.kafka.server.controller.leader.election.rate.(?!.*delta).*|io.confluent.telemetry.http.exporter.batches.failed ``` # Configure SASL for Control Center on Confluent Platform Many of the concepts applied here [come from the Kafka Security documentation](/platform/current/security/authentication/overview.html#kafka-sasl-auth). Reading through and understanding that documentation will be useful in configuring Control Center for SASL. The following assumes that this is for a development setup only and generically followed the [Quick Start for Confluent Platform](/platform/current/get-started/platform-quickstart.html#quickstart). While the specifics are for development purposes only, securing a production cluster follows the same concepts. ### Admin client The JavaScript Client library includes an admin client to interact with the Kafka cluster. The admin client provides several methods to manage topics, groups, and other Kafka entities. 
```js // An admin client can be created from configuration. const admin = new Kafka().admin({ 'bootstrap.servers': '', }); // Or from a producer or consumer instance. const depAdmin = producer.dependentAdmin(); await admin.connect(); await depAdmin.connect(); ``` A complete list of methods available on the admin client can be found in the [JavaScript Client API reference documentation](/platform/current/clients/confluent-kafka-javascript/docs/index.html). #### Standard API The Standard API is more performant, particularly when handling high volumes of messages. However, it requires more manual setup to use. The following example illustrates its use: ```js const producer = new Kafka.Producer({ 'bootstrap.servers': 'localhost:9092', 'dr_cb': true }); // Connect to the broker manually producer.connect(); // Wait for the ready event before proceeding producer.on('ready', () => { try { producer.produce( // Topic to send the message to 'topic', // optionally we can manually specify a partition for the message // this defaults to -1 - which will use librdkafka's default partitioner (consistent random for keyed messages, random for unkeyed messages) null, // Message to send. Must be a buffer Buffer.from('Awesome message'), // for keyed messages, we also specify the key - note that this field is optional 'Stormwind', // you can send a timestamp here. If your broker version supports it, // it will get added. Otherwise, we default to 0 Date.now(), // you can send an opaque token here, which gets passed along // to your delivery reports ); } catch (err) { console.error('A problem occurred when sending our message'); console.error(err); } }); // Any errors we encounter, including connection errors producer.on('event.error', (err) => { console.error('Error from producer'); console.error(err); }) // We must either call .poll() manually after sending messages // or set the producer to poll on an interval (.setPollInterval). // Without this, we do not get delivery events and the queue // will eventually fill up. producer.setPollInterval(100); // You can also set up the producer to poll in the background thread which is // spawned by the C code. It is more efficient for high-throughput producers. // Calling this clears any interval set in setPollInterval. producer.setPollInBackground(true); ``` To see the configuration options available to you, see the [librdkafka Configuration options](https://github.com/confluentinc/librdkafka/blob/v2.3.0/CONFIGURATION.md). #### IMPORTANT Append EntityPath= at the end of the `azure.servicebus.connection.string` ```json { "name" : "ServiceBusSourceConnector", "config" : { "connector.class" : "io.confluent.connect.azure.servicebus.ServiceBusSourceConnector", "tasks.max" : "1", "kafka.topic" : "servicebus-topic", "azure.servicebus.sas.keyname":"sas-keyname", "azure.servicebus.sas.key":"sas-key", "azure.servicebus.namespace":"namespace", "azure.servicebus.entity.name":"queue-name", "azure.servicebus.subscription" : "", "azure.servicebus.max.message.count" : "10", "azure.servicebus.max.waiting.time.seconds" : "30", "confluent.license":"", "confluent.topic.bootstrap.servers":"localhost:9092", "confluent.topic.replication.factor":"1" } } ``` Use `curl` to post the configuration to one of the Kafka Connect Workers. Change `http://localhost:8083/` the endpoint of one of your Kafka Connect worker(s). 
```bash curl -s -X POST -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors ``` Use the following command to update the configuration of an existing connector. ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors/ServiceBusSourceConnector/config ``` To publish messages to the Service Bus queue, follow [Send and receive messages](https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-quickstart-cli#send-and-receive-messages). ```bash java -jar ./target/queuesgettingstarted-1.0.0-jar-with-dependencies.jar -c "Endpoint=sb://.servicebus.windows.net/;SharedAccessKeyName=;SharedAccessKey=;" ``` To consume records written by the connector to the configured Kafka topic, run the following command: ```bash kafka-avro-console-consumer \ --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 \ --topic servicebus-topic \ --from-beginning ``` ### Sink Connector Configuration Start the services using the Confluent CLI: ```bash confluent local start ``` Create a configuration file named `datadog-metrics-sink-config.json` with the following contents: ```json { "name": "datadog-metrics-sink", "config": { "topics": "datadog-metrics-topic", "connector.class": "io.confluent.connect.datadog.metrics.DatadogMetricsSinkConnector", "tasks.max": "1", "key.converter": "io.confluent.connect.string.StringConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.json.JsonConverter", "value.converter.schema.registry.url": "http://localhost:8081", "datadog.api.key": "< your-api-key >", "datadog.domain": "COM", "behavior.on.error": "fail", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` Run this command to start the Datadog Metrics sink connector. ```bash confluent local load datadog-metrics-sink --config datadog-metrics-sink-config.json ``` To check that the connector started successfully, view the Connect worker’s log by running: ```bash confluent local services connect log ``` Produce test data to the `datadog-metrics-topic` topic in Kafka using the `kafka-avro-console-producer` command. ```bash kafka-avro-console-producer \ --broker-list localhost:9092 --topic datadog-metrics-topic \ --property value.schema='{"name": "metric","type": "record","fields": [{"name": "name","type": "string"},{"name": "type","type": "string"},{"name": "timestamp","type": "long"}, {"name": "dimensions", "type": {"name": "dimensions", "type": "record", "fields": [{"name": "host", "type":"string"}, {"name":"interval", "type":"int"}, {"name": "tag1", "type":"string"}]}},{"name": "values","type": {"name": "values","type": "record","fields": [{"name":"doubleValue", "type": "double"}]}}]}' ``` ## REST-based example Use this setting with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `config.json`, configure all of the required values, and post the configuration to one of the distributed Connect workers. For more information about the Kafka Connect REST API, see [this documentation](/platform/current/connect/references/restapi.html).
```json { "name" : "FirebaseSinkConnector", "config" : { "topics":"artists,songs", "connector.class" : "io.confluent.connect.firebase.FirebaseSinkConnector", "tasks.max" : "1", "gcp.firebase.credentials.path" : "credential path", "gcp.firebase.database.reference": "database url", "insert.mode" : "set/update/push", "key.converter" : "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url":"http://localhost:8081", "value.converter" : "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url":"http://localhost:8081", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "confluent.license": " Omit to enable trial mode " } } ``` ### REST-based example Use this setting with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `config.json`, configure all of the required values, and use the following command to post the configuration to one of the distributed Connect workers. Check here for more information about the Kafka Connect [REST API](/platform/current/connect/references/restapi.html). ```json { "name" : "MyGithubConnector", "config" : { "connector.class" : "io.confluent.connect.github.GithubSourceConnector", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "tasks.max" : "1", "github.service.url":"https://api.github.com", "github.access.token":"< Github-Access-Token >", "github.repositories":"apache/kafka", "github.resources":"stargazers", "github.since":"2019-01-01", "topic.name.pattern":"github-${resourceName}", "key.converter":"io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url":"http://localhost:8081", "value.converter":"io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url":"http://localhost:8081" } } ``` ### REST-based example This configuration is used typically along with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following json to `omnisci-sink-connector.json`, configure all of the required values, and use the command below to post the configuration to one the distributed connect worker(s). Check here for more information about the Kafka Connect [REST API](/platform/current/connect/references/restapi.html) ```bash { "name" : "OmnisciSinkConnector", "config" : { "connector.class" : "io.confluent.connect.omnisci.OmnisciSinkConnector", "tasks.max" : "1", "topics": "orders", "connection.database": "omnisci", "connection.port": "6274", "connection.host": "localhost", "connection.user": "admin", "connection.password": "HyperInteractive", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "auto.create": "true" } } ``` Use curl to post the configuration to one of the Kafka Connect workers. Change http://localhost:8083/ the endpoint of one of your Kafka Connect worker(s). Run the connector with this configuration. ```bash curl -X POST -d @omnisci-sink-connector.json http://localhost:8083/connectors -H "Content-Type: application/json" ``` Next, create a record in the `orders` topic ```bash bin/kafka-avro-console-producer \ --broker-list localhost:9092 --topic orders \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"id","type":"int"},{"name":"product", "type": "string"}, {"name":"quantity", "type": "int"}, {"name":"price", "type": "float"}]}' ``` The console producer is waiting for input. 
Copy and paste the following record into the terminal: ```bash {"id": 999, "product": "foo", "quantity": 100, "price": 50} ``` To verify the data in HEAVY-AI, log in to the Docker container using the following command: ```bash docker exec -it bash ``` Once you are inside the Docker container, launch omnisql: ```bash bin/omnisql ``` When prompted for a password, enter `HyperInteractive`. Finally, run the following SQL query to verify the records: ```bash omnisql> select * from orders; foo|50.0|100|999 ``` #### Avro converter example 1. Run the demo app with the `basic-auth` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=basic-auth ``` 2. Create a `http-sink.properties` file with the following contents: ```text name=AvroHttpSink topics=avro-topic tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=io.confluent.connect.avro.AvroConverter value.converter=io.confluent.connect.avro.AvroConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # http sink connector configs http.api.url=http://localhost:8080/api/messages auth.type=BASIC connection.user=admin connection.password=password ``` Note that you should publish Avro messages to the `avro-topic` instead of to the string messages shown in the [Quick start](#http-connector-quickstart). 3. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). #### Header forwarding example 1. Run the demo app with the `basic-auth` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=basic-auth ``` 2. Create a `http-sink.properties` file with the following contents: ```text name=HttpSinkBasicAuth topics=http-messages tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # connect reporter required bootstrap server reporter.bootstrap.servers=localhost:9092 reporter.result.topic.name=success-responses reporter.result.topic.replication.factor=1 reporter.error.topic.name=error-responses reporter.error.topic.replication.factor=1 # http sink connector configs http.api.url=http://localhost:8080/api/messages auth.type=BASIC connection.user=admin connection.password=password headers=Forward-Me:header_value|Another-Header:another_value ``` 3. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). ### Key and topic substitution example 1. Run the demo app with the `simple-auth` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=simple-auth ``` 2. 
Create a `http-sink.properties` file with the following contents: ```text name=KeyTopicSubstitution topics=key-val-messages tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # connect reporter required bootstrap server reporter.bootstrap.servers=localhost:9092 reporter.result.topic.name=success-responses reporter.result.topic.replication.factor=1 reporter.error.topic.name=error-responses reporter.error.topic.replication.factor=1 # http sink connector configs auth.type=NONE confluent.topic.bootstrap.servers=localhost:9092 reporter.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 http.api.url=http://localhost:8080/api/messages/${topic}/${key} ``` 3. Produce a set of messages with keys and values. ```bash confluent local produce key-val-messages --property parse.key=true --property key.separator=, > 1,value > 2,another-value ``` 4. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). You can run `curl localhost:8080/api/messages | jq` to see that the message keys and topics were saved. ### Delete behavior on null values example 1. Run the demo app with the `basic-auth` Spring profile. ```bash mvn spring-boot:run -Dspring.profiles.active=basic-auth ``` 2. Create a `http-sink.properties` file with the following contents: ```text name=DeleteNullHttpSink topics=http-messages tasks.max=1 connector.class=io.confluent.connect.http.HttpSinkConnector # key/val converters key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter # licensing for local single-node Kafka cluster confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # connect reporter required bootstrap server reporter.bootstrap.servers=localhost:9092 reporter.result.topic.name=success-responses reporter.result.topic.replication.factor=1 reporter.error.topic.name=error-responses reporter.error.topic.replication.factor=1 # http sink connector configs http.api.url=http://localhost:8080/api/messages auth.type=BASIC connection.user=admin connection.password=password behavior.on.null.values=delete ``` 3. Publish messages to the topic that have keys and values. ```bash confluent local produce http-messages --property parse.key=true --property key.separator=, > 1,message-value > 2,another-message ``` 4. Run and validate the connector as described in the [Quick start](#http-connector-quickstart). You can check for messages in the demo API with the following command: `curl http://localhost:8080/api/messages -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' | jq` 5. Publish messages to the topic that have keys with null values (tombstones). Note that this cannot be done with `confluent local produce`, but there is an API in the demo app to send tombstones. ```text curl -X POST \ 'localhost:8080/api/tombstone?topic=http-messages&key=1' \ -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' ``` 6. Validate that the demo app deleted the messages.
```text curl http://localhost:8080/api/messages \ -H 'Authorization: Basic YWRtaW46cGFzc3dvcmQ=' | jq ``` ## Low importance `redo.log.corruption.topic` : The name of the Kafka topic where the connector records events that describe errors and corrupted portions of the Oracle redo log (missed data). This can optionally use the [template variables](overview.md#connect-oracle-cdc-source-variables) `${connectorName}`, `${databaseName}`, and `${schemaName}`. A blank topic name (the default) designates that this information is not written to Kafka. * Type: string * Default: blank * Importance: low `redo.log.consumer.fetch.min.bytes` : The minimum amount of data the server should return for a fetch request. If insufficient data is available the request waits for the minimum bytes of data to accumulate before answering the request. The default setting of `1` byte means that fetch requests are answered as soon as a single byte of data is available (the fetch request times out waiting for data to arrive). Setting this to something greater than `1` causes the server to wait for a larger amount of data to accumulate, which can improve server throughput at the cost of additional latency. * Type: int * Default: `1` * Importance: low `redo.log.consumer.max.partition.fetch.bytes` : The maximum amount of per-partition data the server will return. Records are fetched in batches by the consumer. If the first record batch (in the first fetched non-empty partition) is larger than this limit, the batch will still be returned to ensure that the consumer can make progress. (This is not an absolute maximum.) The maximum record batch size accepted by the broker is defined using `message.max.bytes` in the Kafka broker configuration or `max.message.bytes` in the topic configuration. See `fetch.max.bytes` for limiting the consumer request size. * Type: int * Default: `1048576` * Importance: low `redo.log.consumer.fetch.max.bytes` : The maximum amount of data the server should return for a fetch request. Records are fetched in batches by the consumer. If the first record batch (in the first non-empty partition of the fetch) is larger than this value, the record batch will still be returned to ensure that the consumer can make progress. (This is not an absolute maximum.) The maximum record batch size accepted by the broker is defined using `message.max.bytes` in the Kafka broker configuration or `max.message.bytes` in the topic configuration. Note that the consumer performs multiple fetches in parallel. * Type: long * Default: `52428800` * Importance: low `redo.log.consumer.max.poll.records` : The maximum number of records returned in a single call to poll(). * Type: int * Default: `500` * Importance: low `redo.log.consumer.request.timeout.ms` : Controls the maximum amount of time the client waits for the request response. If the response is not received before the timeout elapses, the client resends the request (if necessary) or fails the request if retries are exhausted. * Type: int * Default: `30000` * Importance: low `redo.log.consumer.receive.buffer.bytes` : The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is `-1`, the Operating System default buffer size is used. * Type: int * Default: `65536` * Importance: low `redo.log.consumer.send.buffer.bytes` : The size of the TCP send buffer (SO_SNDBUF) to use when sending data. 
If the value is `-1`, the OS default will be used. * Type: int * Default: `131072` * Importance: low `behavior.on.dictionary.mismatch` : Specifies the desired behavior when the connector is not able to parse the value of a column due to a dictionary mismatch caused by a DDL statement. This can happen if the `online` dictionary mode is specified but the connector is streaming historical data recorded before DDL changes occurred. The default option `fail` will cause the connector task to fail. The `log` option will log the unparsable statement and skip the problematic record without failing the connector task. * Type: string * Default: `fail` * Importance: low `behavior.on.unparsable.statement` : Specifies the desired behavior when the connector encounters a SQL statement that could not be parsed. The default option `fail` will cause the connector task to fail. The `log` option will log the unparsable statement and skip the problematic record without failing the connector task. * Type: string * Default: `fail` * Importance: low `oracle.dictionary.mode` : The dictionary handling mode used by the connector. See [Address DDL Changes in Oracle Database for Confluent Platform](ddl-changes.md#connect-oracle-ddl-changes) for more information. * Type: string * Default: `auto` * Valid Values: `auto`, `online`, or `redo_log` - `auto`: The connector uses the dictionary from the online catalog until a DDL statement to evolve the table schema is encountered, at which point the connector starts using the dictionary from archived redo logs. Once the DDL statement has been processed, the connector reverts to using the online catalog. Use this mode if DDL statements are expected. - `online`: The connector always uses the online dictionary catalog. Use `online` mode if no DDL statements are expected, as DDL statements aren’t supported in `online` mode. - `redo_log`: The connector always uses the dictionary catalog from archived redo logs. Use this mode if you cannot access the online redo log. * Importance: low `log.mining.archive.destination.name` : The name of the archive log destination to use when mining archived redo logs. You can configure the connector to use a specific destination using this property (for example, LOG_ARCHIVE_DEST_2). This is only applicable for Oracle database versions 19c and later. * Type: string * Default: “” * Importance: low `record.buffer.mode` : Where to buffer records that are part of the transaction but may not yet be committed. #### IMPORTANT Database record buffering mode is not supported in Oracle Database version 19c and later. Use connector record buffering mode instead. * Type: string * Default: `connector` * Valid Values: `connector`, or `database` - `connector`: buffer uncommitted transactions in connector memory. - `database`: buffer the records in the database. This option uses the `COMMITTED_DATA_ONLY` flag to LogMiner so that the connector only receives committed records. Transactions that are rolled back or in-progress are filtered out, as are internal redo records. Use this option if the worker where the redo log processing task (task 0) is running is memory constrained and you would rather do buffering in the database. Note, though, that this option increases the database memory usage to stage all redo records within a single transaction in memory until LogMiner finds the commit record for that transaction. Therefore, it is possible to exhaust memory. * Importance: low `max.batch.timeout.ms` : The maximum time to wait for a record before returning an empty batch.
Must be at least 1000 milliseconds (one second). The default is 60000 milliseconds (one minute). #### IMPORTANT - Not in active support. Use poll.linger.ms instead. * Type: long * Default: `60000` * Importance: low `max.buffer.size` : The maximum number of records from all snapshot threads and from the redo log that can be buffered into batches. The default `0` means a buffer size will be computed from the maximum batch size (`max.batch.size`) and number of threads (`snapshot.threads.per.task`). * Type: int * Default: `0` * Importance: low `lob.topic.name.template` : The template that defines the name of the Kafka topic where the connector writes LOB objects. The value can be a constant if the connector writes all LOB objects from all captured tables to one topic. Or, the value can include any supported [template variables](overview.md#connect-oracle-cdc-source-variables) (for example, `${columnName}`, `${databaseName}`, `${schemaName}`, `${tableName}`, and `${connectorName}`). The default is empty, which ignores all LOB type columns if any exist on captured tables. Special-meaning characters `\`, `$`, `{`, and `}` must be escaped with `\` when not intended to be part of a template variable. Any character that is not a valid character for topic name is replaced by an underscore in the topic name. * Type: string * Default: “” * Valid Values: [Template variables](overview.md#connect-oracle-cdc-source-variables) which resolve to a valid topic name or a blank string. Valid topic names consist of `1` to `249` alphanumeric, `+`, `.`, `_`, `\`, and `-` characters. * Importance: low `enable.large.lob.object.support` : If `true`, the connector will support large LOB objects that are split across multiple redo log records. The connector will emit commit messages to the redo log topic and use these commit messages to track when a large LOB object can be emitted to the LOB topic. * Type: boolean * Default: false * Importance: low `log.sensitive.data` : If `true`, logs sensitive data (such as customer records, SQL queries or exception traces containing sensitive data). Set this to true only in exceptional scenarios where logging sensitive data is acceptable and is necessary for troubleshooting. * Type: boolean * Default: false * Importance: low `numeric.mapping` : Map NUMERIC values by precision and optionally scale to primitive or decimal types. * Use `none` if all NUMERIC columns are to be represented by Connect’s DECIMAL logical type. * Use `best_fit_or_decimal` if NUMERIC columns should be cast to Connect’s primitive type based upon the column’s precision and scale. If the precision and scale exceed the bounds for any primitive type, Connect’s DECIMAL logical type will be used instead, and the values will be represented in binary form within the change events. * Use `best_fit_or_double` if NUMERIC columns should be cast to Connect’s primitive type based upon the column’s precision and scale. If the precision and scale exceed the bounds for any primitive type, Connect’s FLOAT64 type will be used instead. * Use `best_fit_or_string` if NUMERIC columns should be cast to Connect’s primitive type based upon the column’s precision and scale. If the precision and scale exceed the bounds for any primitive type, Connect’s STRING type will be used instead. * Use `precision_only` to map NUMERIC columns based only on the column’s precision assuming that column’s scale is 0. The `none` option is the default, but may lead to serialization issues since Connect’s DECIMAL type is mapped to its binary representation. 
One of the `best_fit_or` options will often be preferred. For backwards compatibility reasons, the `best_fit` option is also available. It behaves the same as `best_fit_or_decimal`. Updating this property requires deleting the table topic and the registered schemas if you are using a non-JSON `value.converter`. * Type: string * Default: `none` * Importance: low `numeric.default.scale` : The default scale to use for numeric types when the scale cannot be determined. * Type: int * Default: 127 * Importance: low `oracle.date.mapping` : Map Oracle DATE values to Connect types. * Use `date` if all the `DATE` columns are to be represented by Connect’s Date logical type. * Use `timestamp` if the `DATE` columns should be cast to Connect’s Timestamp. The `date` option is the default value for backward compatibility. Despite the name similarity, the Oracle `DATE` type has different semantics than Connect Date. `timestamp` will often be preferred for semantic similarity. * Type: string * Default: `date` * Importance: low `emit.tombstone.on.delete` : If true, delete operations emit a tombstone record with null value. * Type: boolean * Default: false * Importance: low `oracle.fan.events.enable` : Whether the connection should allow using Oracle RAC Fast Application Notification (FAN) events. This is disabled by default, meaning FAN events will not be used even if they are supported by the database. You should only enable this feature when using an Oracle RAC setup with FAN events. Enabling the feature may cause connection issues when the database is not set up to use FAN events. * Type: boolean * Default: false * Importance: low `table.task.reconfig.checking.interval.ms` : The interval for the background monitoring thread to examine changes to tables and reconfigure table placement if necessary. The default is 300000 milliseconds (5 minutes). * Type: long * Default: `300000` * Importance: low `table.rps.logging.interval.ms` : The interval for the background thread to log current requests per second (RPS) for each table. * Type: long * Default: `60000` * Importance: low `log.mining.end.scn.deviation.ms` : Calculates the end SCN of log mining sessions as the approximate SCN that corresponds to the point in time that is `log.mining.end.scn.deviation.ms` milliseconds before the current SCN obtained from the database. The default value is set to 3 seconds on RAC environments, and is not set otherwise. This configuration is applicable only for Oracle database versions 19c and later. Setting this configuration to a lower value on a RAC environment introduces the potential for data loss at high load. A higher value increases the end-to-end latency for change events. * Type: long * Default: `0` for single node and `3000` for RAC environments * Importance: low `output.before.state.field` : The name of the field in the change record written to Kafka that contains the before state of changed database rows for an update operation. A blank value signals that this field should not be included in the change records. For more details, see [Before state for update operation](overview.md#before-state-for-update-operation). * Type: string * Default: “” * Importance: low `output.table.name.field` : The name of the field in the change record written to Kafka that contains the fully-qualified name of the affected Oracle table. A blank value signals that this field should not be included in the change records.
Use unescaped `.` characters to designate nested fields within structs, or prefix with `header:` to write the fully-qualified name of the affected Oracle table as a header with the given name. * Type: string * Default: `table` * Importance: low `output.scn.field` : The name of the field in the change record written to Kafka that contains the Oracle System Change Number (SCN) where this change was made. A blank value indicates the field should not be included in the change records. Use unescaped `.` characters to designate nested fields within structs, or prefix with `header:` to write the SCN as a header with the given name. * Type: string * Default: `scn` * Importance: low `output.commit.scn.field` : The name of the field in the change record written to Kafka that contains the Oracle System Change Number (SCN) when this transaction was committed. An empty value indicates that the field should not be included in the change records. * Type: string * Default: “” * Importance: low `output.op.type.field` : The name of the field in the change record written to Kafka that contains the operation type for this change event. A blank value indicates the field should not be included in the change records. Use unescaped `.` characters to designate nested fields within structs, or prefix with `header:` to write the operation type as a header with the given name. * Type: string * Default: `op_type` * Importance: low `output.op.ts.field` : The name of the field in the change record written to Kafka that contains the operation timestamp for the change event. A blank value indicates the field should not be included in the change records. Use unescaped `.` characters to designate nested fields within structs, or prefix with `header:` to write the operation timestamp as a header with the given name. * Type: string * Default: `op_ts` * Importance: low `output.current.ts.field` : The name of the field in the change record written to Kafka that contains the current timestamp of the Kafka Connect worker when this change event was processed. A blank value indicates the field should not be included in the change records. Use unescaped `.` characters to designate nested fields within structs, or prefix with `header:` to write the current timestamp as a header with the given name. * Type: string * Default: `current_ts` * Importance: low `output.row.id.field` : The name of the field in the change record written to Kafka that contains the row ID of the changed row. A blank value indicates the field should not be included in the change records. Use unescaped `.` characters to designate nested fields within structs, or prefix with `header:` to write the row ID as a header with the given name. * Type: string * Default: `row_id` * Importance: low `output.username.field` : The name of the field in the change record written to Kafka that contains the name of the Oracle user that executed the transaction. A blank value indicates the field should not be included in the change records. Use unescaped `.` characters to designate nested fields within structs, or prefix with `header:` to write the username as a header with the given name. * Type: string * Default: `username` * Importance: low `output.redo.field` : The name of the field in the change record written to Kafka that contains the original redo data manipulation language (DML) statement from which this change record was created. A blank value indicates the field should not be included in the change records.
Use unescaped `.` characters to designate nested fields within structs, or prefix with `header:` to write the redo statement as a header with the given name. * Type: string * Default: “” * Importance: low `output.undo.field` : The name of the field in the change record written to Kafka that contains the original undo DML statement that effectively undoes this change and represents the “before” state of the row. A blank value indicates the field should not be included in the change records. Use unescaped `.` characters to designate nested fields within structs, or prefix with `header:` to write the undo statement as a header with the given name. * Type: string * Default: “” * Importance: low `output.op.type.read.value` : The value of the operation type for a read (snapshot) change event. By default this is `R` (read). * Type: string * Default: `R` * Importance: low `output.op.type.insert.value` : The value of the operation type for an insert change event. By default this is `I` (insert). * Type: string * Default: `I` * Importance: low `output.op.type.update.value` : The value of the operation type for an update change event. By default this is `U` (update). * Type: string * Default: `U` * Importance: low `output.op.type.delete.value` : The value of the operation type for a delete change event. By default this is `D` (delete). * Type: string * Default: `D` * Importance: low `output.op.type.truncate.value` : The value of the operation type for a truncate change event. By default this is `T` (truncate). * Type: string * Default: `T` * Importance: low `redo.log.startup.polling.limit.ms` : The amount of time to wait for the redo log to be present on connector startup. This is only relevant when the connector is configured to capture change events. When this wait time expires, the connector moves to a failed state. * Type: long * Default: 300000 * Importance: low `snapshot.by.table.partitions` : Whether the connector should perform snapshots on each table partition if the table is defined to use partitions. This is `false` by default, meaning that one snapshot is performed on each table in its entirety. * Type: boolean * Default: false * Importance: low `oracle.validation.result.fetch.size` : The fetch size to be used while querying the database for validations. This is used to query the list of tables and for supplemental logging level validation. * Type: int * Default: `5000` * Importance: low `redo.log.row.poll.fields.include` : A comma-separated list of fields from the V$LOGMNR_CONTENTS view to include in the redo log events. * Type: list * Importance: low `redo.log.row.poll.fields.exclude` : A comma-separated list of fields from the V$LOGMNR_CONTENTS view to exclude from the redo log events. * Type: list * Importance: low `redo.log.row.poll.username.include` : A comma-separated list of database usernames. When this property is set, the connector captures changes only from the specified set of database users. You cannot set this property along with the `redo.log.row.poll.username.exclude` property. * Type: list * Importance: low `redo.log.row.poll.username.exclude` : A comma-separated list of database usernames. When this property is set, the connector does not capture changes from the specified set of database users. You cannot set this property along with the `redo.log.row.poll.username.include` property. * Type: list * Importance: low `db.timezone` : Default timezone to assume when parsing Oracle `DATE` and `TIMESTAMP` types for which timezone information is not available.
For example, if `db.timezone=UTC`, the data in both `DATE` and `TIMESTAMP` will be parsed as if in UTC timezone. The value has to be a valid java.util.TimeZone ID. * Type: string * Default: UTC * Importance: low `db.timezone.date` : The default timezone to assume when parsing Oracle `DATE` type for which timezone information is not available. If `db.timezone.date` is set, the value of `db.timezone` for `DATE` type will be overwritten with the value in `db.timezone.date`. For example, if `db.timezone=UTC` and `db.timezone.date=America/Los_Angeles`, the data `TIMESTAMP` will be parsed as if it is in UTC timezone, and the data in `DATE` will be parsed as if in America/Los_Angeles timezone. The value has to be a valid `java.util.TimeZone` ID. * Type: string * Importance: low `oracle.supplemental.log.level` : Database supplemental logging level for connector operation. If set to `full`, the connector validates the supplemental logging level on the database is FULL and then captures snapshots and CDC events for the specified tables whenever `table.topic.name.template` is not set to `""`. When the level is set to `msl`, the connector doesn’t capture the CDC change events, rather it only captures snapshots if `table.topic.name.template` is not set to `""`. Note that this setting is ignored if the `table.topic.name.template` is set to `""` as the connector will only capture redo logs. This setting defaults to `full` supplemental logging level mode. * Type: string * Default: full * Valid Values: [msl, full] * Importance: low `ldap.url` : The connection URL of LDAP server if using OID based LDAP. * Type: string * Importance: low `ldap.security.principal` : The login principal or user if using SIMPLE Authentication for LDAP. * Type: string * Importance: low `ldap.security.credentials` : The login password for principal if using SIMPLE Authentication for LDAP. * Type: string * Importance: low `oracle.ssl.truststore.file` : If using SSL for encryption and server authentication, set this to the location of the trust store containing server certificates that should be trusted. * Type: string * Default: “” * Importance: low `oracle.ssl.truststore.password` : If using SSL for encryption and server authentication, the password of the trust store containing server certificates that should be trusted. * Type: string * Default: “” * Importance: low `oracle.kerberos.cache.file` : If using Kerberos 5 authentication, set this to the location of the Kerberos 5 ticket cache file on all the Connect workers. * Type: string * Default: “” * Importance: low `log.sensitive.data` : If set to `true`, the connector logs sensitive data–such as customer records, SQL queries, and exception traces containing sensitive data. Confluent recommends you set this parameter to `true` only in cases where logging sensitive data is acceptable and necessary for troubleshooting. * Type: boolean * Default: false * Importance: low `retry.error.codes` : A comma-separated list of Oracle error codes (for example, `12505, 12528,...`) that the connector retries up to the time defined by the `max.retry.time.ms` parameter. By default, the connector retries in case of a recoverable or transient SQL exception and on certain Oracle error codes. * Type: list * Importance: low `enable.metrics.collection` : If set to `true`, the connector records metrics that can be used to gain insight into the connector and troubleshoot issues. These metrics can be accessed using Java Management Extensions (JMX). 
* Type: boolean * Default: false * Importance: low #### NOTE Note the following: * `salesforce.consumer.key` and `salesforce.consumer.secret` are required properties used for OAuth2 secure authentication by Salesforce.com. Additional information and tutorials are available at [salesforce.com](https://developer.salesforce.com/docs/atlas.en-us.api_streaming.meta/api_streaming/code_sample_auth_oauth.htm). * Change the `confluent.topic.bootstrap.servers` property to include your broker address(es) and change the `confluent.topic.replication.factor` to `3` for staging or production use. * Set the following connector configuration properties to enable OAuth JWT bearer token support: - `salesforce.username` - `salesforce.consumer.key` - `salesforce.jwt.keystore.path` - `salesforce.jwt.keystore.password` Run the connector with this configuration. ```bash confluent local load SalesforceCdcSourceConnector --config salesforce-cdc-source.properties ``` Confirm that the connector is in a `RUNNING` state. ```bash confluent local status SalesforceCdcSourceConnector ``` ## Quick Start In this quick start, the Salesforce Bulk API Source connector is used to import data from Salesforce to Kafka. Use the following steps to get started: 1. Create a Salesforce developer account using this [link](https://developer.salesforce.com/signup) if you don’t already have one. 2. Add records to the objects by clicking App Launcher and selecting the required Salesforce object. 3. Install the connector by running the following command from your Confluent Platform installation directory: ```bash confluent connect plugin install confluentinc/kafka-connect-salesforce-bulk-api:latest ``` Note that by default, the command installs the plugin into the `share/confluent-hub-components` directory and adds the directory to the plugin path. For the plugin path change to take effect, you must restart the Connect worker. 4. Start the services using the Confluent CLI. ```bash confluent local start ``` Every service starts in order, printing a message with its status. Note also that the `SalesforceBulkApiSourceConnector` supports a single task only. ```bash Starting Zookeeper Zookeeper is [UP] Starting Kafka Kafka is [UP] Starting Schema Registry Schema Registry is [UP] Starting Kafka REST Kafka REST is [UP] Starting Connect Connect is [UP] Starting KSQL Server KSQL Server is [UP] Starting Control Center Control Center is [UP] ``` ## Quick start The quick start guide uses the ServiceNow Source connector to consume records from a ServiceNow table and send them to Kafka. This guide assumes a multi-tenant environment is used. For local testing, refer to [Running Connect in standalone mode](/kafka-connectors/self-managed/userguide.html#configuring-and-running-workers). 1. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). ```bash # run from your confluent platform installation directory confluent-hub install confluentinc/kafka-connect-servicenow:latest ``` 2. Start the Confluent Platform. ```bash confluent local start ``` 3. Check the status of all services. ```bash confluent local services status ``` 4.
Create a `servicenow-source.json` file with the following contents: ```bash // substitute <> with your config { "name": "ServiceNowSourceConnector", "config": { "connector.class": "io.confluent.connect.servicenow.ServiceNowSourceConnector", "kafka.topic": "topic-servicenow", "servicenow.url": "https://.service-now.com/", "tasks.max": "1", "servicenow.table": "", "servicenow.user": "", "servicenow.password": "", "servicenow.since": "2019-01-01", "key.converter": "org.apache.kafka.connect.json.JsonConverter", "value.converter": "org.apache.kafka.connect.json.JsonConverter", "confluent.topic.bootstrap.servers": ":9092", "confluent.license": "", // leave it empty for evaluation license "poll.interval.s": "10", "confluent.topic.replication.factor": "1" } } ``` 5. Load the ServiceNow Source connector by posting configuration to Connect REST server. ```bash confluent local load servicenow --config servicenow-source.json ``` 6. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status ServiceNowSourceConnector ``` 7. Create one record to ServiceNow. ```bash curl -X POST \ https://.service-now.com/api/now/table/ \ -H 'Accept: application/json' \ -H 'Authorization: Basic ' -H 'Content-Type: application/json' \ -H 'cache-control: no-cache' \ -d '{"short_description": "This is test"}' ``` 8. Confirm the messages were delivered to the `topic-servicenow` topic in Kafka. ```bash confluent local consume topic-servicenow --from-beginning ``` ## Create Kafka topic You can create a topic using a KafkaTopic CR in an on-prem or Confluent Cloud Kafka cluster: ```yaml kind: KafkaTopic metadata: name: --- [1] namespace: --- [2] spec: replicas: partitionCount: kafkaClusterRef: --- [3] kafkaRestClassRef: --- [4] kafkaRest: endpoint: --- [5] kafkaClusterID: --- [6] authentication: type: --- [7] basic: --- [8] bearer: --- [9] oauth: --- [10] configs: --- [11] ``` * [1] The topic name. If both `metadata.name` and `spec.name` are specified, `spec.name` is used. * [2] The namespace for the topic. * Use `kafkaClusterRef` ([3]), `kafkaRestClassRef` ([4]), or `kafkaRest.endpoint` ([5]) to explicitly specify the Confluent REST Class. The order of precedence is [4], [5], and [3]. If none of the above is set, it performs an auto discovery of the Kafka in the same namespace. * [3] Name of the Kafka cluster. * [4] Name of the KafkaRestClass CR. * [5] Confluent REST Class endpoint. See [Manage Confluent Admin REST Class for Confluent Platform Using Confluent for Kubernetes](co-manage-rest-api.md#co-manage-rest-api) for more information. * [6] ID of the Kafka cluster. Required when creating a topic in Confluent Cloud. * [7] If authentication is required for the Confluent Admin REST Class, specify the authentication type. `basic`, `bearer`, `mtls`, and `oauth` are supported. If you specified the Confluent Admin REST Class using `kafkaRestClassRef`, you do not have to set the authentication in `kafkaRest`. Otherwise specify the authentication in `kafkaRest`. * [8] For information about the basic settings, see [Basic authentication](co-authenticate-cp.md#co-authenticate-cp-basic). * [9] For information about the bearer settings, see [Bearer authentication](co-authenticate-kafka.md#co-authenticate-mds-bearer), * [10] For information about the OAuth settings, see [OAuth/OIDC authentication](co-authenticate-cp.md#co-authenticate-cp-oauth). * [11] Specify additional topic configuration settings in key and value pairs, for example, `cleanup.policy: "compact"`. 
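As a concrete illustration of the KafkaTopic CR template above, the following sketch creates a compacted topic against an on-prem Kafka cluster. It is only an example under stated assumptions: the topic name `orders`, the namespace `confluent`, and the Kafka cluster CR name `kafka` are placeholders, and the `platform.confluent.io/v1beta1` API version is the one generally used by CFK custom resources.

```bash
# Hypothetical KafkaTopic CR; the names and namespace below are placeholders,
# not values required by CFK.
kubectl apply -f - <<'EOF'
apiVersion: platform.confluent.io/v1beta1
kind: KafkaTopic
metadata:
  name: orders                    # [1] topic name
  namespace: confluent            # [2] namespace
spec:
  replicas: 3
  partitionCount: 6
  kafkaClusterRef:
    name: kafka                   # [3] name of the Kafka cluster CR
  configs:
    cleanup.policy: "compact"     # [11] additional topic configuration
EOF

# Confirm the topic CR was created and reconciled.
kubectl get kafkatopic orders --namespace confluent
```

Because only `kafkaClusterRef` is set here, CFK resolves the target cluster by name rather than through a KafkaRestClass or an explicit REST endpoint.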
For the list of available topics configuration parameters, see [Kafka Topics Configurations](https://docs.confluent.io/platform/current/installation/configuration/topic-configs.html). ## Configure external access to other Confluent Platform components using node ports To configure other Confluent components with node ports: 1. Set the following in the component CRs and apply the configuration using the `kubectl apply -f` command: ```yaml spec: externalAccess: type: nodePort nodePort: nodePortOffset: --- [1] host: --- [2] sessionAffinity: --- [3] sessionAffinityConfig: --- [4] clientIP: timeoutSeconds: --- [5] configOverrides: server: - advertised.listeners= --- [6] ``` The access endpoint of each Confluent Platform component will be: `:` * [1] Required. The value should be in the range between 30000 and 32767, inclusive. If you change this value on a running cluster, you must roll the cluster. * [2] Required. Specify the FQDN that will be used to configure all advertised listeners. If you change this value on a running cluster, you must roll the cluster. * [3] Required for consumer REST Proxy to enable client IP-based session affinity. For REST Proxy to be used for Kafka consumers, set to `ClientIP`. See [Kubernetes Service](https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies) for more information about session affinity. * [4] Contains the configurations of session affinity if set `sessionAffinity: ClientIP` in [3]. * [5] Specifies the seconds of `ClientIP` type session sticky time. The value must be bigger than `0` and less than or equal to `86400` (1 day). Default value is `10800` (3 hours). * [6] Set to the external DNS name used for node port. This configuration is used to generate absolute URLs in V3 responses. The HTTP and HTTPS protocols are supported. 2. Create firewall rules to allow connections at the NodePort range that you plan to use. For the steps to create firewall rules, see [Using Google Cloud firewall rules](https://cloud.google.com/vpc/docs/using-firewalls). 3. Verify the NodePort services are correctly created by listing the services in the namespace using the following command: ```bash kubectl get services -n | grep NodePort ``` For a tutorial scenario on configuring external access using NodePort, see the [quickstart tutorial for using node port](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/networking/external-access-nodeport-deploy). ## Configure external access to other Confluent Platform components using routes The external clients can connect to other Confluent Platform components using routes. The access endpoint of each Confluent Platform component is: `.:443` For example, in the `example.com` domain with TLS enabled, you access the Confluent Platform components at the following endpoints: * `https://connect.example.com:443` * `https://replicator.example.com:443` * `https://schemaregistry.example.com:443` * `https://ksqldb.example.com:443` * `https://controlcenter.example.com:443` **To allow external access to Confluent components using routes:** 1. Enable TLS for the component as described [Configure Network Encryption for Confluent Platform Using Confluent for Kubernetes](co-network-encryption.md#co-network-encryption). 2. Set the following in the component custom resource (CR) and apply the configuration: ```yaml spec: externalAccess: type: route route: domain: --- [1] prefix: --- [2] wildcardPolicy: --- [3] annotations: --- [4] ``` * [1] Required. 
Set `domain` to the domain name of your Kubernetes cluster. If you change this value on a running cluster, you must roll the cluster. * [2] Optional. Set `prefix` to change the default route prefixes. The default is the component name, such as `controlcenter`, `connector`, `replicator`, `schemaregistry`, `ksql`. The value is used for the DNS entry. The component DNS name becomes `.`. If not set, the default DNS name is `.`, for example, `controlcenter.example.com`. You may want to change the default prefixes for each component to avoid DNS conflicts when running multiple Kafka clusters. If you change this value on a running cluster, you must roll the cluster. * [3] Optional. It defaults to `None` if not configured. Allowed values are `Subdomain` and `None`. * [4] Required for REST Proxy to be used for Kafka consumers. Otherwise, optional. OpenShift routes support cookie-based sticky sessions by default. To use the client IP-based session affinity that REST Proxy requires: * Disable cookies by setting the annotation `haproxy.router.openshift.io/disable_cookies: true` * Enable source IP-based load balancing by setting the annotation `haproxy.router.openshift.io/balance: source` 3. Apply the configuration: ```bash oc apply -f ``` 4. Add a DNS entry for each Confluent Platform component that you added a route to. Once the routes are created, you add a DNS entry associated with component routes to your DNS table (or whatever method you use to get DNS entries recognized by your provider environment). You need the following to derive Confluent Platform component DNS entries: * The domain name of your OpenShift cluster as set in Step #1. * External IP of the OpenShift router load balancer * The component `prefix` if set in Step #1 above. Otherwise, the default component name. A DNS name is made up of the `prefix` and the `domain` name. For example, `controlcenter.example.com`. To add DNS entries for Confluent components: 1. Get the IP address of the OpenShift router load balancer. The HAProxy load balancer serves as the router for route services, and generally, HAProxy runs in the `openshift-ingress` namespace. ```bash oc get svc --namespace openshift-ingress ``` An example output: ```text NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE router-default LoadBalancer 172.30.84.52 20.189.181.8 80:31294/TCP,443:32145/TCP 42h router-internal-default ClusterIP 172.30.184.233 80/TCP,443/TCP,1936/TCP 42h ``` 2. Get the component DNS names: ```bash oc get routes | awk '{print $2}' ``` 3. Use the external IP to point to all the DNS names for the Confluent components in your DNS service provider. The example below shows the DNS table entry, using: * Domain: `example.com` * Default component prefixes * Load balancer router external IP: `20.189.181.8` ```text 20.189.181.8 connect.example.com controlcenter.example.com ksqldb.example.com schemaregistry.example.com ``` 5. [Validate the connections](#co-routes-validate). ## Set up a schema exporter When the Schema Registry is running, set up a schema exporter on the Confluent Platform Schema Registry to sync the Confluent Platform Schema Registry to the Confluent Cloud Schema Registry. For details about the SchemaExporter CR, see [Create schema exporter](co-link-schemas.md#co-create-schema-exporter). 1. Set up a SchemaExporter CR and apply it on the Confluent Platform Schema Registry.
For more information about the SchemaExporter CR properties, see [Create schema exporter](co-link-schemas.md#co-create-schema-exporter). ```yaml kind: SchemaExporter metadata: name: --- [1] spec: sourceCluster: schemaRegistryClusterRef: name: --- [2] namespace: --- [3] destinationCluster: schemaRegistryRest: endpoint: --- [4] authentication: type: --- [5] basic: secretRef: --- [6] subjects: --- [7] contextName: --- [8] ``` * [1] Required. The name of the schema exporter. This name must match the `exporterName` value in the SchemaImporter CR (`SchemaImporter.spec.exporterName`) you set in [Set up a schema importer](#co-setup-schema-importer). * [2] The name of the source Schema Registry cluster. * [3] The namespace of the source Schema Registry cluster. * [4] The endpoint of the Confluent Cloud Schema Registry. * [5] The authentication type of the Confluent Cloud Schema Registry. The supported type is `basic`. * [6] The API key of the Confluent Cloud Schema Registry. * [7] The subjects to export. The default value of subjects is `"*"`, which denotes all subjects in the default context. To export all subjects from Confluent Platform to Confluent Cloud, you can set the subjects to `:*:` and contextType as `NONE`. You can choose to create multiple exporters and importers if using Scenario 2 listed in [Use Schema Registry in a hybrid setup](http://docs.confluent.io/platform/current/usm/usm-schema.html#configuration-scenarios). * [8] The context name of the Schema Registry cluster. Use an empty string for the default context. This value must match `SchemaRegistry.spec.unifiedStreamManager.context` for the Unified Stream Manager Schema Registry to work as expected. An example SchemaExporter CR snippet: ```yaml apiVersion: platform.confluent.io/v1beta1 kind: SchemaExporter metadata: name: schema-exporter namespace: my-namespace-1 spec: sourceCluster: schemaRegistryClusterRef: name: schemaregistry namespace: my-namespace-2 destinationCluster: schemaRegistryRest: endpoint: https:// authentication: type: basic basic: secretRef: cc-sr-credential subjects: [":*:"] contextName: my-context ``` 2. Wait until the exporter enters the RUNNING state, which confirms that schemas are synced; in-flight changes are handled automatically. 3. Ensure that the exporter destination URL and any forwarding settings match those used by the forwarder/Unified Stream Manager configuration. ## Upgrade CFK 1. Review [Upgrade considerations and troubleshooting](co-upgrade-overview.md#co-upgrade-considerations-cfk) and address any required steps. 2. If you are upgrading from CFK 2.x to 3.x to deploy and manage Confluent Platform 7.x: * For Log4J, set the annotation on the components that should use Log4J: ```bash kubectl annotate \ platform.confluent.io/use-log4j1=true \ --namespace ``` The `platform.confluent.io/use-log4j1=true` annotation is required to use Confluent Platform 7.x with CFK 3.0+. * To use the JAAS class path compatible with Confluent Platform 7.x in basic authentication, set the `platform.confluent.io/use-old-jetty9=true` annotation on your Confluent Platform component CRs, such as Control Center, Control Center (Legacy), Schema Registry, Connect, ksqlDB, and REST Proxy: ```bash kubectl annotate \ platform.confluent.io/use-old-jetty9=true \ --namespace ``` If you do not set the annotation properly, you will not be able to log into the Confluent Platform 7.x components using basic authentication, and you will get a login prompt loop when you try to log in to Control Center.
For more information, see [Issue: JAAS class path discrepancy between CFK 3.0 and Confluent Platform 7.x](co-troubleshooting.md#co-jaas-class-change). 3. Disable resource reconciliation. To prevent Confluent Platform components from rolling restarts, temporarily disable resource reconciliation of the components in each namespace where you have deployed Confluent Platform, specifying the CR kinds and CR names: ```bash kubectl annotate connect connect \ platform.confluent.io/block-reconcile=true \ --namespace ``` ```bash kubectl annotate controlcenter controlcenter \ platform.confluent.io/block-reconcile=true \ --namespace ``` ```bash kubectl annotate kafkarestproxy kafkarestproxy \ platform.confluent.io/block-reconcile=true \ --namespace ``` ```bash kubectl annotate kafka kafka \ platform.confluent.io/block-reconcile=true \ --namespace ``` ```bash kubectl annotate ksqldb ksqldb \ platform.confluent.io/block-reconcile=true \ --namespace ``` ```bash kubectl annotate schemaregistry schemaregistry \ platform.confluent.io/block-reconcile=true \ --namespace ``` For KRaft-based Confluent Platform: ```bash kubectl annotate kraftcontroller kraftcontroller \ platform.confluent.io/block-reconcile=true \ --namespace ``` For ZooKeeper-based Confluent Platform: ```bash kubectl annotate zookeeper zookeeper \ platform.confluent.io/block-reconcile=true \ --namespace ``` 4. Add the CFK Helm repo: ```bash helm repo add confluentinc https://packages.confluent.io/helm ``` ```bash helm repo update ``` 5. Get the CFK chart. * From the Helm repo: * To get the latest CFK chart: ```bash helm pull confluentinc/confluent-for-kubernetes --untar ``` * To get a specific version of the CFK chart, get the image tag of the CFK version from [Confluent for Kubernetes image tags](co-plan.md#co-operator-image-tags), and specify the version tag with the `--version` flag: ```bash helm pull confluentinc/confluent-for-kubernetes --version --untar ``` * From a download bundle as specified in [Deploy CFK using the download bundle](co-deploy-cfk.md#co-download-bundle). 6. **IMPORTANT.** Upgrade Confluent Platform custom resource definitions (CRDs). This step is required because Helm does not support upgrading or deleting CRDs using Helm. For more information, see the [Helm documentation](https://helm.sh/docs/chart_best_practices/custom_resource_definitions/#some-caveats-and-explanations). ```bash kubectl apply -f confluent-for-kubernetes/crds/ ``` 1. If the above `kubectl apply` command returns an error similar to the below: ```text The CustomResourceDefinition "kafkas.platform.confluent.io" is invalid: metadata.annotations: Too long: must have at most 262144 bytes make: *** [install-crds] Error 1 ``` Run the following commands: ```bash kubectl apply --server-side=true -f ``` 2. If running `kubectl apply` with the `--server-side=true` flag returns an error similar to the below: ```text Apply failed with 1 conflict: conflict with "helm" using apiextensions.k8s.io/v1: .spec.versions Please review the fields above--they currently have other managers. ``` Run `kubectl apply` with an additional flag, `--force-conflicts`: ```bash kubectl apply --server-side=true --force-conflicts -f ``` 7. Upgrade CFK to 3.1.0. * If you [deployed customized CFK using the values file](co-deploy-cfk.md#co-values-file): For debugging and validating upgrades, when you have a custom values file, it is recommended that you run the `helm upgrade` command with the `--dry-run` flag to preview. 
     The command with the `--dry-run` flag will not actually apply the changes to the cluster, but will print the Kubernetes manifests that would be applied.

     ```bash
     helm upgrade --install confluent-operator \
       confluentinc/confluent-for-kubernetes \
       --values <path-to-values-file> \
       --namespace <namespace> \
       --dry-run
     ```

     After reviewing the Kubernetes manifests, run the following command to upgrade CFK:

     ```bash
     helm upgrade --install confluent-operator \
       confluentinc/confluent-for-kubernetes \
       --values <path-to-values-file> \
       --namespace <namespace>
     ```

   * If you deployed CFK without customizing the values file, run the following command to upgrade CFK:

     ```bash
     helm upgrade --install confluent-operator \
       confluentinc/confluent-for-kubernetes \
       --namespace <namespace>
     ```

   * If you deployed CFK from a download bundle, upgrade CFK as specified in [Deploy CFK using the download bundle](co-deploy-cfk.md#co-download-bundle).

   Note that when using the CFK global license (`globalLicense: true` in the component CRs), you need to specify the license key in the `helm upgrade` command using the `--set licenseKey=` flag. For details, see [Update CFK global license](co-license.md#co-licence-global-level).

   ```bash
   helm upgrade --install confluent-operator \
     confluentinc/confluent-for-kubernetes \
     --values values.yaml \
     --set licenseKey=<license-key>
   ```

8. Alternatively, upgrade CFK to a specific version, such as a hotfix or a patch version.

   * If you [deployed CFK using the values file](co-deploy-cfk.md#co-values-file), in your `values.yaml`, update the CFK `image.tag` to the image tag of the CFK version specified in [Confluent for Kubernetes image tags](co-plan.md#co-operator-image-tags):

     ```yaml
     image:
       tag: ""
     ```

     And run the following command to upgrade CFK:

     ```bash
     helm upgrade --install confluent-operator \
       confluentinc/confluent-for-kubernetes \
       --values <path-to-values-file> \
       --namespace <namespace>
     ```

   * If you did not use a customized `values.yaml` for CFK deployment, run the following command to upgrade CFK to a specific version, using the image tag of the CFK version specified in [Confluent for Kubernetes image tags](co-plan.md#co-operator-image-tags):

     ```bash
     helm upgrade --install confluent-operator \
       confluentinc/confluent-for-kubernetes \
       --version <cfk-version> \
       --namespace <namespace>
     ```

9. Enable resource reconciliation for each Confluent Platform component for which you disabled reconciliation earlier in this procedure:

   ```bash
   kubectl annotate <CR-kind> <CR-name> \
     platform.confluent.io/block-reconcile- \
     --namespace <namespace>
   ```

10. [Upgrade the CFK init container](#co-upgrade-init-container).
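After the upgrade, you can optionally spot-check the result. The following is a minimal sketch, not part of the official procedure; it assumes the Helm release name `confluent-operator`, the namespace `confluent`, an operator pod label of `app=confluent-operator`, and a Kafka CR named `kafka`. Adjust these names to match your deployment.

```bash
# Report the chart and app versions that Helm now tracks for the CFK release.
helm list --namespace confluent --filter confluent-operator

# Show the image (including tag) that the operator pod is actually running.
kubectl get pods --namespace confluent \
  --selector app=confluent-operator \
  --output jsonpath='{.items[*].spec.containers[*].image}{"\n"}'

# Confirm the block-reconcile annotation was removed from a component CR,
# for example the Kafka CR named "kafka".
kubectl get kafka kafka --namespace confluent \
  --output jsonpath='{.metadata.annotations}{"\n"}'
```

If the reported chart version or operator image tag is not what you expect, repeat the `helm upgrade` step above.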
# Manage Kafka Clusters Using Confluent Platform * [Overview](overview.md) * [Related content](overview.md#related-content) * [Cluster Metadata Management](../kafka-metadata/index.md) * [Overview](../kafka-metadata/overview.md) * [KRaft Overview](../kafka-metadata/kraft.md) * [The controller quorum](../kafka-metadata/kraft.md#the-controller-quorum) * [Scaling Kafka with KRaft](../kafka-metadata/kraft.md#scaling-ak-with-kraft) * [Configure KRaft](../kafka-metadata/config-kraft.md) * [Hardware and JVM requirements](../kafka-metadata/config-kraft.md#hardware-and-jvm-requirements) * [Configuration options](../kafka-metadata/config-kraft.md#configuration-options) * [Settings for other Kafka and Confluent Platform components](../kafka-metadata/config-kraft.md#settings-for-other-ak-and-cp-components) * [Generate and format IDs](../kafka-metadata/config-kraft.md#generate-and-format-ids) * [Tools for debugging KRaft mode](../kafka-metadata/config-kraft.md#tools-for-debugging-kraft-mode) * [Monitor KRaft](../kafka-metadata/config-kraft.md#monitor-kraft) * [Related content](../kafka-metadata/config-kraft.md#related-content) * [Find ZooKeeper Resources](../kafka-metadata/zk-production.md) * [Related content](../kafka-metadata/zk-production.md#related-content) * [Manage Self-Balancing Clusters](sbc/overview.md) * [Overview](sbc/index.md) * [How Self-Balancing simplifies Kafka operations](sbc/index.md#how-sbc-simplifies-ak-operations) * [Self-Balancing vs. Auto Data Balancer](sbc/index.md#sbc-vs-adb) * [How it works](sbc/index.md#how-it-works) * [Configuration and monitoring](sbc/index.md#configuration-and-monitoring) * [Replica placement and rack configurations](sbc/index.md#replica-placement-and-rack-configurations) * [Security considerations](sbc/index.md#security-considerations) * [Troubleshooting](sbc/index.md#troubleshooting) * [Related content](sbc/index.md#related-content) * [Tutorial: Adding and Remove Brokers](sbc/sbc-tutorial.md) * [Configuring and starting controllers and brokers in KRaft mode](sbc/sbc-tutorial.md#configuring-and-starting-controllers-and-brokers-in-kraft-mode) * [Prerequisites](sbc/sbc-tutorial.md#prerequisites) * [Environment variables](sbc/sbc-tutorial.md#environment-variables) * [Configure Kafka brokers](sbc/sbc-tutorial.md#configure-ak-brokers) * [Start Confluent Platform, create topics, and generate test data](sbc/sbc-tutorial.md#start-cp-create-topics-and-generate-test-data) * [(Optional) Install and configure Confluent Control Center](sbc/sbc-tutorial.md#optional-install-and-configure-c3) * [Use the command line to test rebalancing](sbc/sbc-tutorial.md#use-the-command-line-to-test-rebalancing) * [Use Control Center to test rebalancing](sbc/sbc-tutorial.md#use-c3-short-to-test-rebalancing) * [Shutdown and cleanup tasks](sbc/sbc-tutorial.md#shutdown-and-cleanup-tasks) * [(Optional) Running the other components](sbc/sbc-tutorial.md#optional-running-the-other-components) * [Related content](sbc/sbc-tutorial.md#related-content) * [Configure](sbc/configuration-options.md) * [Self-Balancing configuration](sbc/configuration-options.md#sbc-configuration) * [Self-Balancing internal topics](sbc/configuration-options.md#sbc-internal-topics) * [Required Configurations for Control Center](sbc/configuration-options.md#required-configurations-for-c3-short) * [Examples: Update broker configurations on the fly](sbc/configuration-options.md#examples-update-broker-configurations-on-the-fly) * [Monitoring the balancer with 
kafka-rebalance-cluster](sbc/configuration-options.md#monitoring-the-balancer-with-kafka-rebalance-cluster) * [kafka-remove-brokers](sbc/configuration-options.md#kafka-remove-brokers) * [Related content](sbc/configuration-options.md#related-content) * [Performance and Resource Usage](sbc/performance.md) * [Add brokers to expand a small cluster with a high partition count](sbc/performance.md#add-brokers-to-expand-a-small-cluster-with-a-high-partition-count) * [Test scalability of a large cluster with many partitions](sbc/performance.md#test-scalability-of-a-large-cluster-with-many-partitions) * [Repeatedly bounce the controller](sbc/performance.md#repeatedly-bounce-the-controller) * [Auto Data Balancing](rebalancer/overview.md) * [Overview](rebalancer/index.md) * [Quick Start](rebalancer/quickstart.md) * [Requirements and Limitations](rebalancer/quickstart.md#requirements-and-limitations) * [Confluent Auto Data Balancer Quick Start](rebalancer/quickstart.md#adb-full-quick-start) * [Licensing](rebalancer/quickstart.md#licensing) * [Suggested Reading](rebalancer/quickstart.md#suggested-reading) * [Tutorial: Add and Remove Brokers](rebalancer/adb-docker-tutorial.md) * [Installing and running Docker](rebalancer/adb-docker-tutorial.md#installing-and-running-docker) * [Use Docker to set up a three Node Kafka cluster](rebalancer/adb-docker-tutorial.md#use-docker-to-set-up-a-three-node-ak-cluster) * [Create a topic and generate data](rebalancer/adb-docker-tutorial.md#create-a-topic-and-generate-data) * [Related content](rebalancer/adb-docker-tutorial.md#related-content) * [Configure](rebalancer/configuration-options.md) * [confluent-rebalancer command flags](rebalancer/configuration-options.md#confluent-rebalancer-command-flags) * [Using a command-config file to talk to Kafka](rebalancer/configuration-options.md#using-a-command-config-file-to-talk-to-ak) * [Specifying Auto Data Balancer properties in a config-file](rebalancer/configuration-options.md#specifying-adb-properties-in-a-config-file) * [Example: Run the rebalancer with Security, Metrics, and License Configurations](rebalancer/configuration-options.md#example-run-the-rebalancer-with-security-metrics-and-license-configurations) * [Example: Run the rebalancer with a Separate Metrics Cluster](rebalancer/configuration-options.md#example-run-the-rebalancer-with-a-separate-metrics-cluster) * [ACLs for Auto Data Balancing](rebalancer/configuration-options.md#acls-for-auto-data-balancing) * [Tiered Storage](tiered-storage.md) * [Known limitations](tiered-storage.md#known-limitations) * [Enabling Tiered Storage on a broker](tiered-storage.md#enabling-tiered-storage-on-a-broker) * [AWS](tiered-storage.md#aws) * [GCS](tiered-storage.md#gcs) * [Azure](tiered-storage.md#azure) * [Pure Storage FlashBlade](tiered-storage.md#pure-storage-flashblade) * [Nutanix Objects](tiered-storage.md#nutanix-objects) * [NetApp Object Storage](tiered-storage.md#netapp-object-storage) * [Dell EMC ECS](tiered-storage.md#dell-emc-ecs) * [MinIO](tiered-storage.md#minio) * [Cloudian HyperStore Object Storage](tiered-storage.md#cloudian-hyperstore-object-storage) * [CEPH](tiered-storage.md#ceph) * [Scality](tiered-storage.md#scality) * [Configuring Tiered Storage to support compacted topics](tiered-storage.md#configuring-tiered-storage-to-support-compacted-topics) * [Creating a topic with Tiered Storage](tiered-storage.md#creating-a-topic-with-tiered-storage) * [Sending test messages to experiment with data 
storage](tiered-storage.md#sending-test-messages-to-experiment-with-data-storage) * [Best practices and recommendations](tiered-storage.md#best-practices-and-recommendations) * [Tuning](tiered-storage.md#tuning) * [Time interval for topic deletes](tiered-storage.md#time-interval-for-topic-deletes) * [Log segment sizes](tiered-storage.md#log-segment-sizes) * [ACLs on Tiered Storage internal topics](tiered-storage.md#acls-on-tiered-storage-internal-topics) * [TLS settings and troubleshooting certificates](tiered-storage.md#tls-settings-and-troubleshooting-certificates) * [Sizing brokers with Tiered Storage](tiered-storage.md#sizing-brokers-with-tiered-storage) * [Tier archiver metrics](tiered-storage.md#tier-archiver-metrics) * [Tier fetcher metrics](tiered-storage.md#tier-fetcher-metrics) * [Kafka log of tier size per partition](tiered-storage.md#kafka-log-of-tier-size-per-partition) * [Example performance test](tiered-storage.md#example-performance-test) * [Supported platforms and features](tiered-storage.md#supported-platforms-and-features) * [Configuration options](tiered-storage.md#configuration-options) * [Disabling Tiered Storage](tiered-storage.md#disabling-tiered-storage) * [Related content](tiered-storage.md#related-content) ## Authentication Mechanisms The authentication mechanism for incoming requests to Schema Registry is determined by the `confluent.schema.registry.auth.mechanism` config. Both TLS and [Jetty](https://github.com/eclipse/jetty.project) authentication mechanisms are supported. When using [Role Based Access Control](../../schema-registry/security/rbac-schema-registry.md#schemaregistry-rbac) (RBAC), Schema Registry expects HTTP Basic Auth (or token) credentials provided by the Schema Registry client for RBAC authorization. If you relied on TLS certificate authentication across Confluent Platform before enabling and configuring RBAC, be aware that you must also provide Basic Auth credentials (such as LDAP user) for Confluent Platform components other than Kafka. More specifically, for Schema Registry, you must specify the bearer token for [Use HTTP Basic Authentication in Confluent Platform](../../security/authentication/http-basic-auth/overview.md#http-basic-auth) and must include `basic.auth.user.info` and `basic.auth.credentials.source`. For details about which authentication methods to use when using RBAC, refer to [RBAC Authentication Options](../../security/authorization/rbac/overview.md#rbac-authentication-options). Here is an example properties file for Schema Registry using mTLS authentication and RBAC. 
```properties
listeners=https://sr:8081
kafkastore.bootstrap.servers=SSL://node1:9095,SSL://node2:9095,SSL://node2:9095
kafkastore.topic=_schemas
debug=true
schema.registry.resource.extension.class=io.confluent.kafka.schemaregistry.security.SchemaRegistrySecurityResourceExtension
confluent.schema.registry.authorizer.class=io.confluent.kafka.schemaregistry.security.authorizer.rbac.RbacAuthorizer
confluent.schema.registry.auth.mechanism=SSL
kafkastore.bootstrap.servers=node1:9093,node2:9093,node3:9093
kafkastore.security.protocol=SASL_PLAINTEXT
kafkastore.topic=_schemas
kafkastore.sasl.mechanism=OAUTHBEARER
kafkastore.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler
kafkastore.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  username="kafka" \
  password="secret" \
  metadataServerUrls="http://node1:8090";
confluent.metadata.basic.auth.user.info=kafka:secret
confluent.metadata.bootstrap.server.urls=http://node1:8090
confluent.metadata.http.auth.credentials.provider=BASIC
public.key.path=/opt/cp/current/certs/public.pem
confluent.schema.registry.auth.ssl.principal.mapping.rules=RULE:^CN=([a-zA-Z0-9.]*).*$/$1/L,DEFAULT
ssl.client.authentication=REQUIRED
ssl.client.auth=true
ssl.keystore.location=/opt/cp/current/certs/sr.jks
ssl.keystore.password=secret
ssl.key.password=secret
ssl.truststore.location=/opt/cp/current/certs/truststore.jks
ssl.truststore.password=secret
inter.instance.protocol=https
kafkastore.ssl.endpoint.identification.algorithm=
ssl.endpoint.identification.algorithm=
rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler
```

If the authentication mechanism is not set, all requests are rejected with an HTTP error code of 403. See [Schema Registry Authorization](authorization/index.md#confluentsecurityplugins-schema-registry-authorization) for details on how this authorization happens and how to configure it.

Configure license client authentication
: When using principal propagation and the following security types, you must configure client authentication for the license topic. For more information, see the following documentation:

  - [SASL OAUTHBEARER (RBAC) client authentication](../../security/authentication/sasl/oauthbearer/configure-clients.md#security-sasl-rbac-oauthbearer-clientconfig)
  - [SASL PLAIN client authentication](../../security/authentication/sasl/plain/overview.md#sasl-plain-clients)
  - [SASL SCRAM client authentication](../../security/authentication/sasl/scram/overview.md#sasl-scram-clients)
  - [mTLS client authentication](../../security/authentication/mutual-tls/overview.md#authentication-ssl-clients)

Configure license client authorization
: When using principal propagation and RBAC or ACLs, you must configure client authorization for the license topic.

#### NOTE

The `_confluent-command` internal topic is available as the preferred alternative to the `_confluent-license` topic for components such as Schema Registry, REST Proxy, and Confluent Server (which were previously using `_confluent-license`). Both topics will be supported going forward. Here are some guidelines:

- New deployments (Confluent Platform 6.2.1 and later) will default to using `_confluent-command` as shown below.
- Existing clusters will continue using the `_confluent-license` topic unless manually changed.
- Newly created clusters on Confluent Platform 6.2.1 and later will default to creating the `_confluent-command` topic, and only existing clusters that already have a `_confluent-license` topic will continue to use it.

- **RBAC authorization** Run this command to add `ResourceOwner` for the component user for the Confluent license topic resource (default name is `_confluent-command`).

  ```none
  confluent iam rbac role-binding create \
  --role ResourceOwner \
  --principal User:<user> \
  --resource Topic:_confluent-command \
  --kafka-cluster <kafka-cluster-id>
  ```

- **ACL authorization** Run this command to configure Kafka authorization, where the bootstrap server, client configuration file, and service account ID are specified. This grants create, read, and write on the `_confluent-command` topic.

  ```none
  kafka-acls --bootstrap-server <broker-host:port> --command-config <client-config-file> \
  --add --allow-principal User:<service-account-id> --operation Create --operation Read --operation Write \
  --topic _confluent-command
  ```

## Manage Schemas

The FileStream connectors are good examples because they are simple, but they also have trivially structured data – each line is just a string. Almost all connectors will need schemas with more complex data formats. To create more complex data, you’ll need to work with the `org.apache.kafka.connect.data` API. Most structured records will need to interact with two classes in addition to primitive types: [Schema](/platform/current/connect/javadocs/javadoc/org/apache/kafka/connect/data/Schema.html) and [Struct](/platform/current/connect/javadocs/javadoc/org/apache/kafka/connect/data/Struct.html).

The API documentation provides a complete reference, but here is a simple example creating a `Schema` and `Struct`:

```java
// Define a schema for a structured record, with a default value for the "admin" field.
Schema schema = SchemaBuilder.struct().name(NAME)
    .field("name", Schema.STRING_SCHEMA)
    .field("age", Schema.INT32_SCHEMA)
    .field("admin", SchemaBuilder.bool().defaultValue(false).build())
    .build();

// Populate a Struct that conforms to the schema.
Struct struct = new Struct(schema)
    .put("name", "Barbara Liskov")
    .put("age", 75);
```

If you are implementing a source connector, you’ll need to decide when and how to create schemas. Where possible, you should avoid recomputing them. For example, if your connector is guaranteed to have a fixed schema, create it statically and reuse a single instance. However, many connectors will have dynamic schemas. One example of this is a database connector. Considering even just a single table, the schema will not be fixed over the lifetime of the connector since the user may execute an `ALTER TABLE` command. The connector must be able to detect these changes and react appropriately by creating an updated `Schema`.

Sink connectors are usually simpler because they are consuming data and therefore do not need to create schemas. However, they should take just as much care to validate that the schemas they receive have the expected format. When the schema does not match – usually indicating the upstream producer is generating invalid data that cannot be correctly translated to the destination system – sink connectors should throw an exception to indicate this error to the Kafka Connect framework. When using the `AvroConverter` included with Confluent Platform, schemas are registered under the hood with Confluent Schema Registry, so any new schemas must satisfy the compatibility requirements for the destination topic.

# Quick Start: Move Data In and Out of Kafka with Kafka Connect

This tutorial provides a hands-on look at how you can move data into and out of Apache Kafka® without writing a single line of code.
It is helpful to review the [concepts](index.md#connect-concepts) for Kafka Connect in tandem with running the steps in this guide to gain a deeper understanding. At the end of this tutorial you will be able to: * Use Confluent CLI to manage Confluent services, including starting a single connect worker in distributed mode and loading and unloading connectors. * Read data from a file and publish to a Kafka topic. * Read data from a Kafka topic and publish to file. * Integrate Schema Registry with a connector. To demonstrate the basic functionality of Kafka Connect and its integration with the Confluent Schema Registry, a few local standalone Kafka Connect processes with connectors are run. You can insert data written to a file into Kafka and write data from a Kafka topic to the console. If you are using JSON as the Connect data format, see the instructions [here](https://kafka.apache.org/documentation#quickstart_kafkaconnect) for a tutorial that does not include Schema Registry. # Kafka Connect Worker Configuration Properties for Confluent Platform The following lists many of the configuration properties related to Connect workers. The first section lists common properties that can be set in either standalone or distributed mode. These control basic functionality like which Apache Kafka® cluster to communicate with and what format data you’re working with. The next two sections list properties specific to standalone or distributed mode. For additional configuration properties see the following sections: * Connect and Schema Registry: See [Integrate Schemas from Kafka Connect in Confluent Platform](../../schema-registry/connect.md#schemaregistry-kafka-connect). * Producer configuration properties: See [Kafka Producer for Confluent Platform](../../clients/producer.md#kafka-producer). * Consumer configuration properties: See [Kafka Consumer for Confluent Platform](../../clients/consumer.md#kafka-consumer). * TLS/SSL encryption properties: See [Protect Data in Motion with TLS Encryption in Confluent Platform](../../security/protect-data/encrypt-tls.md#kafka-ssl-encryption). * All Kafka configuration properties: See [Kafka Configuration Reference for Confluent Platform](../../installation/configuration/index.md#cp-config-reference). For information about how the Connect worker functions, see [Configuring and Running Workers](/kafka-connectors/self-managed/userguide.html#configuring-and-running-workers). ## Distributed Worker Configuration In addition to the common worker configuration options, the following are available in distributed mode. For information about how the Connect worker functions, see [Configuring and Running Workers](/kafka-connectors/self-managed/userguide.html#configuring-and-running-workers). `group.id` : A unique string that identifies the Connect cluster group this Worker belongs to. #### IMPORTANT - For production environments, you must explicitly set this configuration. When using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html), this configuration property is set to `connect-cluster` by default. All workers with the same `group.id` will be in the same Connect cluster. For example, if worker A has `group.id=connect-cluster-a` and worker B has the same `group.id`, worker A and worker B form a cluster called `connect-cluster-a`. - The `group.id` for sink connectors is derived from the `consumer.group.id` in the worker properties. The `group.id` is created using the prefix `connect-` and the connector name. 
To override this value for a sink connector, add the following line in the worker properties file.

```properties
connector.client.config.override.policy=All
```

* Type: string
* Default: “”
* Importance: high

`config.storage.topic`
: The name of the topic where connector and task configuration data are stored. This *must* be the same for all Workers with the same `group.id`. Kafka Connect will upon startup attempt to automatically create this topic with a single partition and a compacted cleanup policy to avoid losing data, but it will simply use the topic if it already exists. If you choose to create this topic manually, **always** create it as a compacted topic with a single partition and a high replication factor (3x or more).

* Type: string
* Default: “”
* Importance: high

`config.storage.replication.factor`
: The replication factor used when Kafka Connect creates the topic used to store connector and task configuration data. This should **always** be at least `3` for a production system, but cannot be larger than the number of Kafka brokers in the cluster. Enter `-1` to use the Kafka broker default replication factor.

* Type: short
* Default: 3
* Importance: low

`offset.storage.topic`
: The name of the topic where source connector offsets are stored. This *must* be the same for all Workers with the same `group.id`. Kafka Connect will upon startup attempt to automatically create this topic with multiple partitions and a compacted cleanup policy to avoid losing data, but it will simply use the topic if it already exists. If you choose to create this topic manually, **always** create it as a compacted, highly replicated (3x or more) topic with a large number of partitions (e.g., 25 or 50, just like Kafka’s built-in `__consumer_offsets` topic) to support large Kafka Connect clusters.

* Type: string
* Default: “”
* Importance: high

`offset.storage.replication.factor`
: The replication factor used when Connect creates the topic used to store connector offsets. This should **always** be at least `3` for a production system, but cannot be larger than the number of Kafka brokers in the cluster. Enter `-1` to use the Kafka broker default replication factor.

* Type: short
* Default: 3
* Importance: low

`offset.storage.partitions`
: The number of partitions used when Connect creates the topic used to store connector offsets. A large value (e.g., `25` or `50`, just like Kafka’s built-in `__consumer_offsets` topic) is necessary to support large Kafka Connect clusters. Enter `-1` to use the default number of partitions configured in the Kafka broker.

* Type: int
* Default: 25
* Importance: low

`status.storage.topic`
: The name of the topic where connector and task status updates are stored. This *must* be the same for all Workers with the same `group.id`. Kafka Connect will upon startup attempt to automatically create this topic with multiple partitions and a compacted cleanup policy to avoid losing data, but it will simply use the topic if it already exists. If you choose to create this topic manually, **always** create it as a compacted, highly replicated (3x or more) topic with multiple partitions.

* Type: string
* Default: “”
* Importance: high

`status.storage.replication.factor`
: The replication factor used when Connect creates the topic used to store connector and task status updates. This should **always** be at least `3` for a production system, but cannot be larger than the number of Kafka brokers in the cluster.
Enter `-1` to use the Kafka broker default replication factor.

* Type: short
* Default: 3
* Importance: low

`status.storage.partitions`
: The number of partitions used when Connect creates the topic used to store connector and task status updates. Enter `-1` to use the default number of partitions configured in the Kafka broker.

* Type: int
* Default: 5
* Importance: low

`heartbeat.interval.ms`
: The expected time between heartbeats to the group coordinator when using Kafka’s group management facilities. Heartbeats are used to ensure that the Worker’s session stays active and to facilitate rebalancing when new members join or leave the group. The value must be set lower than `session.timeout.ms`, but typically should be set no higher than 1/3 of that value. It can be adjusted even lower to control the expected time for normal rebalances.

* Type: int
* Default: 3000
* Importance: high

The `heartbeat.interval.ms` setting is ignored when `group.protocol = consumer`; instead, use the broker configuration `group.consumer.heartbeat.interval.ms` to control the heartbeat.

`session.timeout.ms`
: The timeout used to detect failures when using Kafka’s group management facilities.

* Type: int
* Default: 30000
* Importance: high

`ssl.key.password`
: The password of the private key in the key store file. This is optional for clients.

* Type: password
* Importance: high

`ssl.keystore.location`
: The location of the key store file. This is optional for clients and can be used for two-way client authentication.

* Type: string
* Importance: high

`ssl.keystore.password`
: The store password for the key store file. This is optional for clients and only needed if `ssl.keystore.location` is configured.

* Type: password
* Importance: high

`ssl.truststore.location`
: The location of the trust store file.

* Type: string
* Importance: high

`ssl.truststore.password`
: The password for the trust store file.

* Type: password
* Importance: high

`connections.max.idle.ms`
: Close idle connections after the number of milliseconds specified by this config.

* Type: long
* Default: 540000
* Importance: medium

`receive.buffer.bytes`
: The size of the TCP receive buffer (SO_RCVBUF) to use when reading data.

* Type: int
* Default: 32768
* Importance: medium

`request.timeout.ms`
: The configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses, the client will resend the request if necessary or fail the request if retries are exhausted.

* Type: int
* Default: 40000
* Importance: medium

`sasl.kerberos.service.name`
: The Kerberos principal name that Kafka runs as. This can be defined either in Kafka’s JAAS config or in Kafka’s config.

* Type: string
* Importance: medium

`security.protocol`
: Protocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL.

* Type: string
* Default: “PLAINTEXT”
* Importance: medium

`send.buffer.bytes`
: The size of the TCP send buffer (SO_SNDBUF) to use when sending data.

* Type: int
* Default: 131072
* Importance: medium

`ssl.enabled.protocols`
: The comma-separated list of protocols enabled for TLS connections. The default value is `TLSv1.2,TLSv1.3` when running with Java 11 or later, `TLSv1.2` otherwise. With the default value for Java 11 (`TLSv1.2,TLSv1.3`), Kafka clients and brokers prefer TLSv1.3 if both support it, and fall back to TLSv1.2 otherwise (assuming both support at least TLSv1.2).
* Type: list
* Default: `TLSv1.2,TLSv1.3`
* Importance: medium

`ssl.keystore.type`
: The file format of the key store file. This is optional for clients. Default value is JKS.

* Type: string
* Default: “JKS”
* Importance: medium

`ssl.protocol`
: The TLS protocol used to generate the SSLContext. The default is `TLSv1.3` when running with Java 11 or newer, `TLSv1.2` otherwise. This value should be fine for most use cases. Allowed values in recent JVMs are `TLSv1.2` and `TLSv1.3`. `TLS`, `TLSv1.1`, `SSL`, `SSLv2` and `SSLv3` might be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities. With the default value for this configuration and `ssl.enabled.protocols`, clients downgrade to `TLSv1.2` if the server does not support `TLSv1.3`. If this configuration is set to `TLSv1.2`, clients do not use `TLSv1.3`, even if it is one of the values in `ssl.enabled.protocols` and the server only supports `TLSv1.3`.

* Type: string
* Default: `TLSv1.3`
* Importance: medium

`ssl.provider`
: The name of the security provider used for TLS/SSL connections. Default value is the default security provider of the JVM.

* Type: string
* Importance: medium

`ssl.truststore.type`
: The file format of the trust store file. Default value is JKS.

* Type: string
* Default: “JKS”
* Importance: medium

`worker.sync.timeout.ms`
: When the Worker is out of sync with other Workers and needs to resynchronize configurations, wait up to this amount of time before giving up, leaving the group, and waiting a backoff period before rejoining.

* Type: int
* Default: 3000
* Importance: medium

`worker.unsync.backoff.ms`
: When the Worker is out of sync with other Workers and fails to catch up within `worker.sync.timeout.ms`, leave the Connect cluster for this long before rejoining.

* Type: int
* Default: 300000
* Importance: medium

`client.id`
: An ID string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging.

* Type: string
* Default: “”
* Importance: low

`metadata.max.age.ms`
: The period of time in milliseconds after which you force a refresh of metadata even if you haven’t seen any partition leadership changes to proactively discover any new brokers or partitions.

* Type: long
* Default: 300000
* Importance: low

`metric.reporters`
: A list of classes to use as metrics reporters. Implementing the `MetricReporter` interface allows plugging in classes that will be notified of new metric creation. The JmxReporter is always included to register JMX statistics.

* Type: list
* Default: []
* Importance: low

`metrics.num.samples`
: The number of samples maintained to compute metrics.

* Type: int
* Default: 2
* Importance: low

`metrics.sample.window.ms`
: The window of time a metrics sample is computed over.

* Type: long
* Default: 30000
* Importance: low

`reconnect.backoff.ms`
: The amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all requests sent by the consumer to the broker.

* Type: long
* Default: 50
* Importance: low

`retry.backoff.ms`
: The amount of time to wait before attempting to retry a failed fetch request to a given topic partition. This avoids repeated fetching-and-failing in a tight loop.

* Type: long
* Default: 100
* Importance: low

`sasl.kerberos.kinit.cmd`
: Kerberos kinit command path.
Default is /usr/bin/kinit * Type: string * Default: “/usr/bin/kinit” * Importance: low `sasl.kerberos.min.time.before.relogin` : Login thread sleep time between refresh attempts. * Type: long * Default: 60000 * Importance: low `sasl.kerberos.ticket.renew.jitter` : Percentage of random jitter added to the renewal time. * Type: double * Default: 0.05 * Importance: low `sasl.kerberos.ticket.renew.window.factor` : Login thread will sleep until the specified window factor of time from last refresh to ticket’s expiry has been reached, at which time it will try to renew the ticket. * Type: double * Default: 0.8 * Importance: low `ssl.cipher.suites` : A list of cipher suites. This is a named combination of authentication, encryption, MAC, and key exchange algorithms used to negotiate the security settings for a network connection using TLS. By default, all the available cipher suites are supported. * Type: list * Importance: low `ssl.endpoint.identification.algorithm` : The endpoint identification algorithm to validate server hostname using server certificate. * Type: string * Importance: low `ssl.keymanager.algorithm` : The algorithm used by key manager factory for TLS/SSL connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine. * Type: string * Default: “SunX509” * Importance: low `ssl.trustmanager.algorithm` : The algorithm used by trust manager factory for TLS/SSL connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine. * Type: string * Default: “PKIX” * Importance: low # - Spring Petclinic: https://github.com/spring-petclinic/spring-petclinic-rest/blob/master/src/main/resources/openapi.yml openapi: 3.0.1 info: title: Confluent Manager for Apache Flink / CMF description: Apache Flink job lifecycle management component for Confluent Platform. version: '1.0' servers: - url: http://localhost:8080 tags: - name: Environments - name: FlinkApplications - name: Secrets - name: SQL - name: Savepoints paths: ## ---------------------------- Environments API ---------------------------- ## /cmf/api/v1/environments: post: tags: - Environments operationId: createOrUpdateEnvironment summary: Create or update an Environment requestBody: content: application/json: schema: $ref: '#/components/schemas/PostEnvironment' application/yaml: schema: $ref: '#/components/schemas/PostEnvironment' required: true responses: 201: description: The Environment was successfully created or updated. content: application/json: schema: $ref: '#/components/schemas/Environment' application/yaml: schema: $ref: '#/components/schemas/Environment' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - Environments operationId: getEnvironments summary: Retrieve a paginated list of all environments. x-spring-paginated: true responses: 200: description: List of environments found. If no environments are found, an empty list is returned. Note the information about secret is not included in the list call yet. In order to get the information about secret, make a getSecret call. 
content: application/json: schema: $ref: '#/components/schemas/EnvironmentsPage' application/yaml: schema: $ref: '#/components/schemas/EnvironmentsPage' 304: description: Not modified. headers: ETag: description: An ID for this version of the response. schema: type: string 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}: get: operationId: getEnvironment tags: - Environments summary: Get/Describe an environment with the given name. parameters: - name: envName in: path description: Name of the Environment to be retrieved. required: true schema: type: string responses: 200: description: Environment found and returned. content: application/json: schema: $ref: '#/components/schemas/Environment' application/yaml: schema: $ref: '#/components/schemas/Environment' 404: description: Environment not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' delete: operationId: deleteEnvironment tags: - Environments parameters: - name: envName in: path description: Name of the Environment to be deleted. required: true schema: type: string responses: 200: description: Environment found and deleted. 304: description: Not modified. 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: Environment not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' ## ---------------------------- Applications API ---------------------------- ## /cmf/api/v1/environments/{envName}/applications: post: tags: - FlinkApplications operationId: createOrUpdateApplication summary: Creates a new Flink Application or updates an existing one in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/FlinkApplication' application/yaml: schema: $ref: '#/components/schemas/FlinkApplication' required: true responses: 201: description: The Application was successfully created or updated. content: application/json: schema: $ref: '#/components/schemas/FlinkApplication' application/yaml: schema: $ref: '#/components/schemas/FlinkApplication' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - FlinkApplications operationId: getApplications summary: Retrieve a paginated list of all applications in the given Environment. 
x-spring-paginated: true parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string responses: 200: description: Application found and returned. content: application/json: schema: $ref: '#/components/schemas/ApplicationsPage' application/yaml: schema: $ref: '#/components/schemas/ApplicationsPage' 304: description: Not modified. headers: ETag: description: An ID for this version of the response. schema: type: string 404: description: Environment not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/applications/{appName}: get: tags: - FlinkApplications operationId: getApplication summary: Retrieve an Application of the given name in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string responses: 200: description: Application found and returned. content: application/json: schema: $ref: '#/components/schemas/FlinkApplication' application/yaml: schema: $ref: '#/components/schemas/FlinkApplication' 304: description: Not modified. headers: ETag: description: An ID for this version of the response. schema: type: string 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: Application not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' delete: tags: - FlinkApplications operationId: deleteApplication summary: Deletes an Application of the given name in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string responses: 200: description: Application found and deleted. 304: description: Not modified. 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1alpha1/environments/{envName}/applications/{appName}/events: get: tags: - FlinkApplications operationId: getApplicationEvents summary: Get a paginated list of events of the given Application x-spring-paginated: true parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string responses: 200: description: Events found and returned. content: application/json: schema: $ref: '#/components/schemas/EventsPage' application/yaml: schema: $ref: '#/components/schemas/EventsPage' 404: description: Environment or Application not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/applications/{appName}/start: post: tags: - FlinkApplications operationId: startApplication summary: Starts an earlier submitted Flink Application parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string - name: startFromSavepointUid in: query description: UID of the Savepoint from which the application should be started. This savepoint could belong to the application or can be a detached savepoint. required: false schema: type: string responses: 200: description: Application started content: application/json: schema: $ref: '#/components/schemas/FlinkApplication' application/yaml: schema: $ref: '#/components/schemas/FlinkApplication' 304: description: Not modified. 404: description: Environment, Application or Savepoint not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/applications/{appName}/suspend: post: tags: - FlinkApplications operationId: suspendApplication summary: Suspends an earlier started Flink Application parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string responses: 200: description: Application suspended content: application/json: schema: $ref: '#/components/schemas/FlinkApplication' application/yaml: schema: $ref: '#/components/schemas/FlinkApplication' 304: description: Not modified. 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/applications/{appName}/instances: get: tags: - FlinkApplications operationId: getApplicationInstances summary: Get a paginated list of instances of the given Application x-spring-paginated: true parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string responses: 200: description: Instances found and returned. content: application/json: schema: $ref: '#/components/schemas/ApplicationInstancesPage' application/yaml: schema: $ref: '#/components/schemas/ApplicationInstancesPage' 404: description: Environment or Application not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/applications/{appName}/instances/{instName}: get: tags: - FlinkApplications operationId: getApplicationInstance summary: Retrieve an Instance of an Application parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string - name: instName in: path description: Name of the ApplicationInstance required: true schema: type: string responses: 200: description: ApplicationInstance found and returned. content: application/json: schema: $ref: '#/components/schemas/FlinkApplicationInstance' application/yaml: schema: $ref: '#/components/schemas/FlinkApplicationInstance' 404: description: FlinkApplicationInstance or environment or application not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' ## ---------------------------- Environment Secret Mapping API ---------------------------- ## /cmf/api/v1/environments/{envName}/secret-mappings/{name}: delete: tags: - Environments operationId: deleteEnvironmentSecretMapping summary: Deletes the Environment Secret Mapping for the given Environment and Secret. parameters: - name: envName in: path description: Name of the Environment in which the mapping has to be deleted. required: true schema: type: string - name: name in: path description: Name of the environment secret mapping to be deleted in the given environment. required: true schema: type: string responses: 204: description: The Environment Secret Mapping was successfully deleted. 404: description: Environment or Secret not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - Environments operationId: getEnvironmentSecretMapping summary: Retrieve the Environment Secret Mapping for the given name in the given environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: name in: path description: Name of the environment secret mapping to be retrieved. required: true schema: type: string responses: 200: description: Environment Secret Mapping found and returned. content: application/json: schema: $ref: '#/components/schemas/EnvironmentSecretMapping' application/yaml: schema: $ref: '#/components/schemas/EnvironmentSecretMapping' 404: description: Environment or Secret not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' put: tags: - Environments operationId: updateEnvironmentSecretMapping summary: Updates the Environment Secret Mapping for the given Environment. 
parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: name in: path description: Name of the environment secret mapping to be updated required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/EnvironmentSecretMapping' application/yaml: schema: $ref: '#/components/schemas/EnvironmentSecretMapping' required: true responses: 200: description: The Environment Secret Mapping was successfully updated. content: application/json: schema: $ref: '#/components/schemas/EnvironmentSecretMapping' application/yaml: schema: $ref: '#/components/schemas/EnvironmentSecretMapping' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: Environment or Secret not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/secret-mappings: post: tags: - Environments operationId: createEnvironmentSecretMapping summary: Creates the Environment Secret Mapping for the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/EnvironmentSecretMapping' application/yaml: schema: $ref: '#/components/schemas/EnvironmentSecretMapping' required: true responses: 200: description: The Environment Secret Mapping was successfully created. content: application/json: schema: $ref: '#/components/schemas/EnvironmentSecretMapping' application/yaml: schema: $ref: '#/components/schemas/EnvironmentSecretMapping' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: Environment not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 409: description: Environment Secret Mapping already exists. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - Environments operationId: getEnvironmentSecretMappings summary: Retrieve a paginated list of all Environment Secret Mappings. x-spring-paginated: true parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string responses: 200: description: Environment Secret Mappings found and returned. content: application/json: schema: $ref: '#/components/schemas/EnvironmentSecretMappingsPage' application/yaml: schema: $ref: '#/components/schemas/EnvironmentSecretMappingsPage' 404: description: Environment not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' ## ---------------------------- Statement API ---------------------------- ## /cmf/api/v1/environments/{envName}/statements: post: tags: - SQL operationId: createStatement summary: Creates a new Flink SQL Statement in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/Statement' application/yaml: schema: $ref: '#/components/schemas/Statement' required: true responses: 200: description: The Statement was successfully created. content: application/json: schema: $ref: '#/components/schemas/Statement' application/yaml: schema: $ref: '#/components/schemas/Statement' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: Environment not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 409: description: Statement already exists. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - SQL operationId: getStatements summary: Retrieve a paginated list of Statements in the given Environment. x-spring-paginated: true parameters: - $ref: '#/components/parameters/computePoolParam' - $ref: '#/components/parameters/phaseParam' - name: envName in: path description: Name of the Environment required: true schema: type: string responses: 200: description: Statements found and returned. content: application/json: schema: $ref: '#/components/schemas/StatementsPage' application/yaml: schema: $ref: '#/components/schemas/StatementsPage' 404: description: Environment not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/statements/{stmtName}: get: tags: - SQL operationId: getStatement summary: Retrieve the Statement of the given name in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: stmtName in: path description: Name of the Statement required: true schema: type: string responses: 200: description: Statement found and returned. content: application/json: schema: $ref: '#/components/schemas/Statement' application/yaml: schema: $ref: '#/components/schemas/Statement' 404: description: Statement not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' delete: tags: - SQL operationId: deleteStatement summary: Deletes the Statement of the given name in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: stmtName in: path description: Name of the Statement required: true schema: type: string responses: 204: description: Statement was found and deleted. 202: description: Statement was found and deletion request received. 404: description: Statement not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' put: tags: - SQL operationId: updateStatement summary: Updates a Statement of the given name in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: stmtName in: path description: Name of the Statement required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/Statement' application/yaml: schema: $ref: '#/components/schemas/Statement' required: true responses: 200: description: Statement was found and updated. 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: Statement not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/statements/{stmtName}/results: get: tags: - SQL operationId: getStatementResult summary: Retrieve the result of the interactive Statement with the given name in the given Environment. parameters: - $ref: '#/components/parameters/pageTokenParam' - name: envName in: path description: Name of the Environment required: true schema: type: string - name: stmtName in: path description: Name of the Statement required: true schema: type: string responses: 200: description: StatementResults found and returned. content: application/json: schema: $ref: '#/components/schemas/StatementResult' application/yaml: schema: $ref: '#/components/schemas/StatementResult' 400: description: Statement does not return results. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: Statement not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 410: description: Results are gone. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/statements/{stmtName}/exceptions: get: tags: - SQL operationId: getStatementExceptions summary: Retrieves the last 10 exceptions of the Statement with the given name in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: stmtName in: path description: Name of the Statement required: true schema: type: string responses: 200: description: StatementExceptions found and returned. content: application/json: schema: $ref: '#/components/schemas/StatementExceptionList' application/yaml: schema: $ref: '#/components/schemas/StatementExceptionList' 404: description: Statement not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' ## ---------------------------- Compute Pool API ---------------------------- ## /cmf/api/v1/environments/{envName}/compute-pools: post: tags: - SQL operationId: createComputePool summary: Creates a new Flink Compute Pool in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/ComputePool' application/yaml: schema: $ref: '#/components/schemas/ComputePool' required: true responses: 200: description: The Compute Pool was successfully created. content: application/json: schema: $ref: '#/components/schemas/ComputePool' application/yaml: schema: $ref: '#/components/schemas/ComputePool' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 409: description: Compute Pool already exists. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - SQL operationId: getComputePools summary: Retrieve a paginated list of Compute Pools in the given Environment. x-spring-paginated: true parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string responses: 200: description: Compute Pools found and returned. content: application/json: schema: $ref: '#/components/schemas/ComputePoolsPage' application/yaml: schema: $ref: '#/components/schemas/ComputePoolsPage' 404: description: Environment not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/compute-pools/{computePoolName}: get: tags: - SQL operationId: getComputePool summary: Retrieve the Compute Pool of the given name in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: computePoolName in: path description: Name of the Compute Pool required: true schema: type: string responses: 200: description: Compute Pool found and returned. content: application/json: schema: $ref: '#/components/schemas/ComputePool' application/yaml: schema: $ref: '#/components/schemas/ComputePool' 404: description: Compute Pool not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' delete: tags: - SQL operationId: deleteComputePool summary: Deletes the ComputePool of the given name in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: computePoolName in: path description: Name of the ComputePool required: true schema: type: string responses: 204: description: Compute Pool was found and deleted. 404: description: Compute Pool not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 409: description: Compute Pool is in use and cannot be deleted. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' ## ---------------------------- Secrets API ---------------------------- ## /cmf/api/v1/secrets: post: tags: - Secrets operationId: createSecret summary: Create a Secret. description: Create a Secret. This secrets can be then used to specify sensitive information in the Flink SQL statements. Right now these secrets are only used for Kafka and Schema Registry credentials. requestBody: content: application/json: schema: $ref: '#/components/schemas/Secret' application/yaml: schema: $ref: '#/components/schemas/Secret' required: true responses: 200: description: The Secret was successfully created. Note that for security reasons, you can never view the contents of the secret itself once created. content: application/json: schema: $ref: '#/components/schemas/Secret' application/yaml: schema: $ref: '#/components/schemas/Secret' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 409: description: The Secret already exists. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - Secrets operationId: getSecrets summary: Retrieve a paginated list of all secrets. 
Note that the actual secret data is masked for security reasons. x-spring-paginated: true responses: 200: description: List of secrets found. If no secrets are found, an empty list is returned. content: application/json: schema: $ref: '#/components/schemas/SecretsPage' application/yaml: schema: $ref: '#/components/schemas/SecretsPage' 304: description: The list of secrets has not changed. headers: ETag: description: An ID for this version of the response. schema: type: string 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/secrets/{secretName}: get: tags: - Secrets operationId: getSecret summary: Retrieve the Secret of the given name. Note that the secret data is not returned for security reasons. parameters: - name: secretName in: path description: Name of the Secret required: true schema: type: string responses: 200: description: Secret found and returned, with security data masked. content: application/json: schema: $ref: '#/components/schemas/Secret' application/yaml: schema: $ref: '#/components/schemas/Secret' 404: description: Secret not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' put: tags: - Secrets operationId: updateSecret summary: Update the secret. parameters: - name: secretName in: path description: Name of the Secret required: true schema: type: string requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/Secret' application/yaml: schema: $ref: '#/components/schemas/Secret' responses: 200: description: Returns the updated Secret content: application/json: schema: $ref: '#/components/schemas/Secret' application/yaml: schema: $ref: '#/components/schemas/Secret' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: Secret with the given name not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' delete: tags: - Secrets operationId: deleteSecret summary: Delete the secret with the given name. parameters: - name: secretName in: path description: Name of the Secret required: true schema: type: string responses: 204: description: Secret was successfully deleted. 404: description: Secret not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' ## ---------------------------- Catalog API ---------------------------- ## /cmf/api/v1/catalogs/kafka: post: tags: - SQL operationId: createKafkaCatalog summary: Creates a new Kafka Catalog that can be referenced by Flink Statements requestBody: content: application/json: schema: $ref: '#/components/schemas/KafkaCatalog' application/yaml: schema: $ref: '#/components/schemas/KafkaCatalog' required: true responses: 200: description: The Kafka Catalog was successfully created. content: application/json: schema: $ref: '#/components/schemas/KafkaCatalog' application/yaml: schema: $ref: '#/components/schemas/KafkaCatalog' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 409: description: Kafka Catalog already exists. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - SQL operationId: getKafkaCatalogs summary: Retrieve a paginated list of Kafka Catalogs x-spring-paginated: true responses: 200: description: Kafka Catalogs found and returned. content: application/json: schema: $ref: '#/components/schemas/KafkaCatalogsPage' application/yaml: schema: $ref: '#/components/schemas/KafkaCatalogsPage' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/catalogs/kafka/{catName}: get: tags: - SQL operationId: getKafkaCatalog summary: Retrieve the Kafka Catalog of the given name. parameters: - name: catName in: path description: Name of the Kafka Catalog required: true schema: type: string responses: 200: description: Kafka Catalog found and returned. content: application/json: schema: $ref: '#/components/schemas/KafkaCatalog' application/yaml: schema: $ref: '#/components/schemas/KafkaCatalog' 404: description: Kafka Catalog not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' delete: tags: - SQL operationId: deleteKafkaCatalog summary: Deletes the Kafka Catalog of the given name. parameters: - name: catName in: path description: Name of the Kafka Catalog required: true schema: type: string responses: 204: description: Kafka Catalog was found and deleted. 404: description: Kafka Catalog not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 409: description: Catalog contains databases and cannot be deleted. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' put: tags: - SQL operationId: updateKafkaCatalog summary: Updates a KafkaCatalog of the given name. parameters: - name: catName in: path description: Name of the KafkaCatalog required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/KafkaCatalog' application/yaml: schema: $ref: '#/components/schemas/KafkaCatalog' required: true responses: 200: description: KafkaCatalog was found and updated. 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: KafkaCatalog not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' ## ---------------------------- Database API ---------------------------- ## /cmf/api/v1/catalogs/kafka/{catName}/databases: post: tags: - SQL operationId: createKafkaDatabase summary: Creates a new Kafka Database parameters: - name: catName in: path description: Name of the Catalog required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/KafkaDatabase' application/yaml: schema: $ref: '#/components/schemas/KafkaDatabase' required: true responses: 200: description: The Kafka Database was successfully created. content: application/json: schema: $ref: '#/components/schemas/KafkaDatabase' application/yaml: schema: $ref: '#/components/schemas/KafkaDatabase' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 409: description: Kafka Database already exists. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - SQL operationId: getKafkaDatabases summary: Retrieve a paginated list of Kafka Databases parameters: - name: catName in: path description: Name of the Catalog required: true schema: type: string x-spring-paginated: true responses: 200: description: Kafka Databases found and returned. content: application/json: schema: $ref: '#/components/schemas/KafkaDatabasesPage' application/yaml: schema: $ref: '#/components/schemas/KafkaDatabasesPage' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/catalogs/kafka/{catName}/databases/{dbName}: get: tags: - SQL operationId: getKafkaDatabase summary: Retrieve the Kafka Database of the given name in the given KafkaCatalog. 
parameters: - name: catName in: path description: Name of the Kafka Catalog required: true schema: type: string - name: dbName in: path description: Name of the Kafka Database required: true schema: type: string responses: 200: description: Kafka Database found and returned. content: application/json: schema: $ref: '#/components/schemas/KafkaDatabase' application/yaml: schema: $ref: '#/components/schemas/KafkaDatabase' 404: description: Kafka Database not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' delete: tags: - SQL operationId: deleteKafkaDatabase summary: Deletes the Kafka Database of the given name in the given KafkaCatalog. parameters: - name: catName in: path description: Name of the Kafka Catalog required: true schema: type: string - name: dbName in: path description: Name of the Kafka Database required: true schema: type: string responses: 204: description: Kafka Database was found and deleted. 404: description: Kafka Database not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' put: tags: - SQL operationId: updateKafkaDatabase summary: Updates a KafkaDatabase of the given name in the given KafkaCatalog. parameters: - name: catName in: path description: Name of the KafkaCatalog required: true schema: type: string - name: dbName in: path description: Name of the KafkaDatabase required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/KafkaDatabase' application/yaml: schema: $ref: '#/components/schemas/KafkaDatabase' required: true responses: 200: description: KafkaDatabase was found and updated. 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: KafkaDatabase not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' ## ---------------------------- Savepoint API ---------------------------- ## ### --------------------------- Savepoint for Applications API --------------------------- ### /cmf/api/v1/environments/{envName}/applications/{appName}/savepoints: post: tags: - Savepoints operationId: createSavepointForFlinkApplication summary: Creates a new Savepoint for the given Application in the given Environment. 
parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/Savepoint' application/yaml: schema: $ref: '#/components/schemas/Savepoint' responses: 201: description: Savepoint was successfully created. content: application/json: schema: $ref: '#/components/schemas/Savepoint' application/yaml: schema: $ref: '#/components/schemas/Savepoint' 404: description: Environment or Application not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - Savepoints operationId: getSavepointsForFlinkApplication summary: Retrieve a paginated list of all Savepoints for the given Application in the given Environment. x-spring-paginated: true parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string responses: 200: description: Savepoints found and returned. In case, there are no savepoints, an empty list is returned. content: application/json: schema: $ref: '#/components/schemas/SavepointsPage' application/yaml: schema: $ref: '#/components/schemas/SavepointsPage' 404: description: Environment or Application not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/applications/{appName}/savepoints/{savepointName}: get: tags: - Savepoints operationId: getSavepointForFlinkApplication summary: Retrieve the Savepoint of the given name for the given Application in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string - name: savepointName in: path description: Name of the Savepoint required: true schema: type: string responses: 200: description: Savepoint found and returned. content: application/json: schema: $ref: '#/components/schemas/Savepoint' application/yaml: schema: $ref: '#/components/schemas/Savepoint' 404: description: Environment, Application or Savepoint not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' delete: tags: - Savepoints operationId: deleteSavepointForFlinkApplication summary: Deletes the Savepoint of the given name for the given Application in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string - name: savepointName in: path description: Name of the Savepoint to be deleted. required: true schema: type: string - name: force in: query description: If a Savepoint is marked for deletion, it can be force deleted. required: false schema: type: boolean default: false responses: 204: description: Savepoint was found and deleted. 404: description: Environment, Application or Savepoint not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/applications/{appName}/savepoints/{savepointName}/detach: post: tags: - Savepoints operationId: detachSavepointFromFlinkApplication summary: Detaches the Savepoint of the given name for the given Application in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: appName in: path description: Name of the Application required: true schema: type: string - name: savepointName in: path description: Name of the Savepoint to be detached. required: true schema: type: string responses: 200: description: Savepoint was successfully detached and returned. content: application/json: schema: $ref: '#/components/schemas/Savepoint' application/yaml: schema: $ref: '#/components/schemas/Savepoint' 404: description: Environment, Application or Savepoint not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' ### --------------------------- Savepoint for Statements API --------------------------- ### /cmf/api/v1/environments/{envName}/statements/{stmtName}/savepoints: post: tags: - Savepoints operationId: createSavepointForFlinkStatement summary: Creates a new Savepoint for the given Statement in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: stmtName in: path description: Name of the Statement required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/Savepoint' application/yaml: schema: $ref: '#/components/schemas/Savepoint' responses: 201: description: Savepoint was successfully created. content: application/json: schema: $ref: '#/components/schemas/Savepoint' application/yaml: schema: $ref: '#/components/schemas/Savepoint' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 404: description: Environment or Statement not found. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - Savepoints operationId: getSavepointsForFlinkStatement summary: Retrieve a paginated list of all Savepoints for the given Statement in the given Environment. x-spring-paginated: true parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: stmtName in: path description: Name of the Statement required: true schema: type: string responses: 200: description: Savepoints found and returned. In case, there are no savepoints, an empty list is returned. content: application/json: schema: $ref: '#/components/schemas/SavepointsPage' application/yaml: schema: $ref: '#/components/schemas/SavepointsPage' 404: description: Environment or Statement not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/environments/{envName}/statements/{stmtName}/savepoints/{savepointName}: get: tags: - Savepoints operationId: getSavepointForFlinkStatement summary: Retrieve the Savepoint of the given name for the given Statement in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: stmtName in: path description: Name of the Statement required: true schema: type: string - name: savepointName in: path description: Name of the Savepoint required: true schema: type: string responses: 200: description: Savepoint found and returned. content: application/json: schema: $ref: '#/components/schemas/Savepoint' application/yaml: schema: $ref: '#/components/schemas/Savepoint' 404: description: Environment, Statement or Savepoint not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' delete: tags: - Savepoints operationId: deleteSavepointForFlinkStatement summary: Deletes the Savepoint of the given name for the given Statement in the given Environment. parameters: - name: envName in: path description: Name of the Environment required: true schema: type: string - name: stmtName in: path description: Name of the Statement required: true schema: type: string - name: savepointName in: path description: Name of the Savepoint to be deleted. required: true schema: type: string - name: force in: query description: If a Savepoint is marked for deletion, it can be force deleted. required: false schema: type: boolean default: false responses: 204: description: Savepoint was found and deleted. 404: description: Environment, Statement or Savepoint not found. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' ### --------------------------- Detached Savepoint API --------------------------- ### /cmf/api/v1/detached-savepoints: post: tags: - DetachedSavepoints operationId: createDetachedSavepoint summary: Creates a new detached savepoint. requestBody: content: application/json: schema: $ref: '#/components/schemas/Savepoint' application/yaml: schema: $ref: '#/components/schemas/Savepoint' responses: 201: description: Detached Savepoint was successfully created and returned. content: application/json: schema: $ref: '#/components/schemas/Savepoint' application/yaml: schema: $ref: '#/components/schemas/Savepoint' 400: description: Bad request. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 422: description: Request valid but invalid content. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' get: tags: - DetachedSavepoints operationId: listDetachedSavepoints summary: Retrieve a paginated list of all Detached Savepoints. x-spring-paginated: true parameters: - name: name in: query description: Filter by detached savepoint name prefix (e.g. ?name=abc) required: false schema: type: string responses: 200: description: Detached Savepoints found and returned. In case, there are none, an empty list is returned. content: application/json: schema: $ref: '#/components/schemas/SavepointsPage' application/yaml: schema: $ref: '#/components/schemas/SavepointsPage' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' /cmf/api/v1/detached-savepoints/{detachedSavepointName}: get: tags: - DetachedSavepoints operationId: getDetachedSavepoint summary: Retrieve the Detached Savepoint of the given name. parameters: - name: detachedSavepointName in: path description: Name of the Detached Savepoint required: true schema: type: string responses: 200: description: Detached Savepoint found and returned. content: application/json: schema: $ref: '#/components/schemas/Savepoint' application/yaml: schema: $ref: '#/components/schemas/Savepoint' 404: description: Detached Savepoint not found. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' delete: tags: - DetachedSavepoints operationId: deleteDetachedSavepoint summary: Deletes the Detached Savepoint of the given name. parameters: - name: detachedSavepointName in: path description: Name of the Detached Savepoint required: true schema: type: string responses: 204: description: Detached Savepoint was found and deleted. 404: description: Detached Savepoint not found. 
content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' 500: description: Server error. content: application/json: schema: $ref: '#/components/schemas/RestError' application/yaml: schema: $ref: '#/components/schemas/RestError' components: # https://github.com/daniel-shuy/swaggerhub-spring-pagination / Copyright (c) 2023 Daniel Shuy parameters: pageTokenParam: in: query name: page-token schema: type: string description: Token for the next page of results computePoolParam: in: query name: compute-pool schema: type: string description: Name of the ComputePool to filter on phaseParam: in: query name: phase schema: type: string enum: [pending, running, completed, deleting, failing, failed, stopped] description: Phase to filter on schemas: ## ---------------------------- Shared Utilities ---------------------------- ## RestError: title: REST Error description: The schema for all error responses. type: object properties: errors: title: errors description: List of all errors type: array items: title: error type: object description: An error properties: message: type: string description: An error message PaginationResponse: type: object properties: pageable: $ref: '#/components/schemas/Pageable' Sort: type: object format: sort properties: sorted: type: boolean description: Whether the results are sorted. example: true unsorted: type: boolean description: Whether the results are unsorted. example: false empty: type: boolean Pageable: type: object format: pageable properties: page: type: integer minimum: 0 size: type: integer description: The number of items in a page. minimum: 1 sort: $ref: '#/components/schemas/Sort' ## ---------------------------- Shared Bases ---------------------------- ## ResourceBaseV2: type: object properties: apiVersion: description: API version for spec type: string kind: description: Kind of resource - set to resource type type: string required: - apiVersion - kind PostResourceBase: type: object properties: name: title: Name description: A unique name for the resource. type: string # Validate for DNS subdomain name minLength: 4 maxLength: 253 pattern: '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$' GetResourceBase: type: object properties: created_time: title: Time when the resource has been created type: string format: date-time updated_time: title: Time when the resource has been last updated type: string format: date-time # defines kubernetesNamespace KubernetesNamespace: type: object properties: kubernetesNamespace: type: string title: Kubernetes namespace name where resources referencing this environment are created in. 
minLength: 1 maxLength: 253 pattern: '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$' # defines properties of fields with flinkApplicationDefaults ResourceWithFlinkApplicationDefaults: type: object properties: flinkApplicationDefaults: title: the defaults as YAML or JSON for FlinkApplications type: object format: yamlorjson # defines computePool defaults ComputePoolDefaults: type: object properties: computePoolDefaults: title: ComputePoolDefaults description: the defaults as YAML or JSON for ComputePools type: object format: yamlorjson # defines statement defaults StatementDefaults: type: object properties: flinkConfiguration: description: default Flink configuration for Statements type: object additionalProperties: type: string # defines defaults for detached and interactive statements AllStatementDefaults: type: object properties: statementDefaults: title: AllStatementDefaults description: the defaults for detached and interactive Statements type: object properties: detached: description: defaults for detached statements $ref: '#/components/schemas/StatementDefaults' interactive: description: defaults for interactive statements $ref: '#/components/schemas/StatementDefaults' ## ---------------------------- Request Schemas ---------------------------- ## PostEnvironment: title: Environment description: Environment type: object required: - name allOf: - $ref: '#/components/schemas/PostResourceBase' - $ref: '#/components/schemas/ResourceWithFlinkApplicationDefaults' - $ref: '#/components/schemas/KubernetesNamespace' - $ref: '#/components/schemas/ComputePoolDefaults' - $ref: '#/components/schemas/AllStatementDefaults' ## ---------------------------- Response Schemas ---------------------------- ## Environment: title: Environment description: Environment type: object allOf: - $ref: '#/components/schemas/PostResourceBase' - $ref: '#/components/schemas/GetResourceBase' - $ref: '#/components/schemas/ResourceWithFlinkApplicationDefaults' - $ref: '#/components/schemas/KubernetesNamespace' - $ref: '#/components/schemas/ComputePoolDefaults' - $ref: '#/components/schemas/AllStatementDefaults' properties: secrets: title: Secrets description: The secrets mapping for the environment. This is a mapping between connection_secret_id and the secret name. type: object additionalProperties: type: string default: { } required: - name - kubernetesNamespace EnvironmentSecretMapping: title: EnvironmentSecretMapping description: The secrets mapping for the environment. The name shows the name of the Connection Secret ID to be mapped. 
type: object properties: apiVersion: title: API version for EnvironmentSecretMapping spec type: string kind: title: Kind of resource - set to EnvironmentSecretMapping type: string metadata: title: EnvironmentSecretMappingMetadata description: Metadata about the environment secret mapping type: object properties: name: description: Name of the Connection Secret ID type: string uid: description: Unique identifier of the EnvironmentSecretMapping type: string creationTimestamp: description: Timestamp when the EnvironmentSecretMapping was created type: string updateTimestamp: description: Timestamp when the EnvironmentSecretMapping was last updated type: string labels: description: Labels of the EnvironmentSecretMapping type: object additionalProperties: type: string annotations: description: Annotations of the EnvironmentSecretMapping type: object additionalProperties: type: string spec: title: EnvironmentSecretMappingSpec description: Spec for environment secret mapping type: object writeOnly: true properties: secretName: description: Name of the secret to be mapped to the connection secret id of this mapping. type: string required: - secretName required: - apiVersion - kind EnvironmentSecretMappingsPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object properties: metadata: type: object title: EnvironmentSecretMappingsPageMetadata properties: size: type: integer format: int64 default: 0 items: type: array items: $ref: '#/components/schemas/EnvironmentSecretMapping' default: [ ] Secret: title: Secret description: Represents a Secret that can be used to specify sensitive information in the Flink SQL statements. allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: metadata: title: SecretMetadata description: Metadata about the secret type: object properties: name: type: string description: Name of the Secret creationTimestamp: description: Timestamp when the Secret was created type: string readOnly: true updateTimestamp: description: Timestamp when the Secret was last updated type: string readOnly: true uid: description: Unique identifier of the Secret type: string readOnly: true labels: description: Labels of the Secret type: object additionalProperties: type: string annotations: description: Annotations of the Secret type: object additionalProperties: type: string required: - name spec: title: SecretSpec description: Spec for secret type: object writeOnly: true properties: data: title: SecretData description: Data of the secret type: object additionalProperties: type: string status: title: SecretStatus description: Status for the secret type: object readOnly: true properties: version: title: SecretVersion description: The version of the secret type: string environments: title: Environments description: The environments to which the secret is attached to. 
type: array items: type: string required: - metadata - spec SecretsPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object properties: metadata: type: object title: SecretsPageMetadata properties: size: type: integer format: int64 default: 0 items: type: array items: $ref: '#/components/schemas/Secret' default: [ ] EnvironmentsPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object properties: metadata: type: object title: EnvironmentsPageMetadata properties: size: type: integer format: int64 default: 0 items: type: array title: "Env" items: $ref: '#/components/schemas/Environment' default: [ ] FlinkApplication: title: FlinkApplication description: Represents a Flink Application submitted by the user type: object allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: metadata: title: Metadata about the application type: object format: yamlorjson spec: title: Spec for Flink Application type: object format: yamlorjson status: title: Status for Flink Application type: object format: yamlorjson required: # status is optional for application spec - metadata - spec ApplicationsPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object properties: metadata: type: object title: ApplicationPageMetadata properties: size: type: integer format: int64 default: 0 items: type: array items: $ref: '#/components/schemas/FlinkApplication' default: [ ] FlinkApplicationEvent: title: FlinkApplicationEvent description: Events from the deployment of Flink clusters # TODO(CF-1159): Using the ResourceBaseV2 here leads to incorrectly generated EventType where the generated interface doesn't match the file name causing compilation errors. type: object properties: apiVersion: title: API version for Event spec - set to v1alpha1 type: string kind: title: Kind of resource - set to FlinkApplicationEvent type: string metadata: title: EventMetadata description: Metadata about the event type: object properties: name: description: Name of the Event type: string uid: description: Unique identifier of the Event. Identical to name. type: string creationTimestamp: description: Timestamp when the Event was created type: string flinkApplicationInstance: description: Name of the FlinkApplicationInstance which this event is related to type: string labels: description: Labels of the Event type: object additionalProperties: type: string annotations: description: Annotations of the Event type: object additionalProperties: type: string status: type: object title: EventStatus properties: message: description: Human readable status message. 
type: string type: title: EventType description: Type of the event type: string data: $ref: '#/components/schemas/EventData' required: - kind - apiVersion - metadata - status EventDataNewStatus: type: object properties: newStatus: description: "The new status" type: string EventDataJobException: type: object properties: exceptionString: description: "The full exception string from the Flink job" type: string EventData: oneOf: - $ref: '#/components/schemas/EventDataNewStatus' - $ref: '#/components/schemas/EventDataJobException' EventsPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object properties: metadata: type: object title: EventsPageMetadata properties: size: type: integer format: int64 default: 0 items: type: array items: $ref: '#/components/schemas/FlinkApplicationEvent' default: [ ] FlinkApplicationInstance: title: ApplicationInstance description: An instance of a Flink Application type: object allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: metadata: title: ApplicationInstanceMetadata description: Metadata about the instance type: object properties: name: description: Name of the Instance - a uuid. type: string uid: description: Unique identifier of the instance. Identical to name. type: string creationTimestamp: description: Timestamp when the Instance was created type: string updateTimestamp: description: Timestamp when the Instance status was last updated type: string labels: description: Labels of the instance type: object additionalProperties: type: string annotations: description: Annotations of the instance type: object additionalProperties: type: string status: type: object title: ApplicationInstanceStatus properties: spec: description: The environment defaults merged with the FlinkApplication spec at instance creation time type: object format: yamlorjson jobStatus: type: object properties: jobId: description: Flink job id inside the Flink cluster type: string state: description: Tracks the final Flink JobStatus of the instance type: string ApplicationInstancesPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object properties: metadata: type: object title: ApplicationInstancesPageMetadata properties: size: type: integer format: int64 default: 0 items: type: array items: $ref: '#/components/schemas/FlinkApplicationInstance' default: [ ] Statement: title: Statement description: Represents a SQL Statement submitted by the user allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: metadata: title: StatementMetadata description: Metadata about the statement type: object properties: name: description: Name of the Statement type: string creationTimestamp: description: Timestamp when the Statement was created type: string updateTimestamp: description: Timestamp when the Statement was updated last type: string uid: description: Unique identifier of the Statement type: string labels: description: Labels of the Statement type: object additionalProperties: type: string annotations: description: Annotations of the Statement type: object additionalProperties: type: string required: - name spec: title: StatementSpec description: Spec for statement type: object properties: statement: description: SQL statement type: string properties: title: SessionProperties description: Properties of the client session type: object additionalProperties: type: string flinkConfiguration: title: StatementFlinkConfiguration description: Flink configuration for the statement type: 
object additionalProperties: type: string computePoolName: description: Name of the ComputePool type: string parallelism: description: Parallelism of the statement type: integer format: int32 stopped: description: Whether the statement is stopped type: boolean startFromSavepoint: description: Configuration for starting/resuming the statement from a savepoint $ref: '#/components/schemas/StatementStartFromSavepoint' required: - statement - computePoolName status: title: StatementStatus description: Status for statement type: object properties: phase: description: The lifecycle phase of the statement type: string detail: description: Details about the execution status of the statement type: string traits: title: StatementTraits description: Detailed information about the properties of the statement type: object properties: sqlKind: description: The kind of SQL statement type: string isBounded: description: Whether the result of the statement is bounded type: boolean isAppendOnly: description: Whether the result of the statement is append only type: boolean upsertColumns: description: The column indexes that are updated by the statement type: array items: type: integer format: int32 schema: title: StatementResultSchema description: The schema of the statement result $ref: '#/components/schemas/ResultSchema' required: - phase result: title: StatementResult description: Result of the statement $ref: '#/components/schemas/StatementResult' required: # status and result are optional for Statement spec - metadata - spec StatementsPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object properties: metadata: type: object title: StatementPageMetadata properties: size: type: integer format: int64 default: 0 items: type: array items: $ref: '#/components/schemas/Statement' default: [ ] StatementResult: title: StatementResult description: Represents the result of a SQL Statement allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: metadata: title: StatementResultMetadata description: Metadata about the StatementResult type: object properties: creationTimestamp: description: Timestamp when the StatementResult was created type: string annotations: description: Annotations of the StatementResult type: object additionalProperties: type: string results: title: StatementResults description: Results of the Statement type: object properties: data: title: Data type: array items: description: A result row type: object format: yamlorjson required: - metadata - results StatementException: title: StatementException description: Represents an exception that occurred while executing a SQL Statement type: object allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: name: description: Name of the StatementException type: string message: description: Message of the StatementException type: string timestamp: description: Timestamp when the StatementException was created type: string required: - name - message - timestamp StatementExceptionList: title: StatementExceptionList description: Represents a list of exceptions that occurred while executing a SQL Statement type: object allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: data: title: Exceptions description: List of exceptions type: array maxItems: 10 items: $ref: '#/components/schemas/StatementException' default: [ ] required: - data DataType: title: DataType description: Represents a SQL data type type: object properties: type: description: Name of the 
data type of the column type: string nullable: description: Whether the data type is nullable type: boolean length: description: Length of the data type type: integer format: int32 precision: description: Precision of the data type type: integer format: int32 scale: description: Scale of the data type type: integer format: int32 keyType: description: Type of the key in the data type (if applicable) $ref: '#/components/schemas/DataType' x-go-pointer: true valueType: description: Type of the value in the data type (if applicable) $ref: '#/components/schemas/DataType' x-go-pointer: true elementType: description: Type of the elements in the data type (if applicable) $ref: '#/components/schemas/DataType' x-go-pointer: true fields: description: Fields of the data type (if applicable) type: array items: type: object title: DataTypeField description: Field of the data type properties: name: description: Name of the field type: string fieldType: description: Type of the field $ref: '#/components/schemas/DataType' x-go-pointer: true description: description: Description of the field type: string required: - name - fieldType resolution: description: Resolution of the data type (if applicable) type: string fractionalPrecision: description: Fractional precision of the data type (if applicable) type: integer format: int32 required: - type - nullable ResultSchema: title: ResultSchema description: Represents the schema of the result of a SQL Statement type: object properties: columns: description: Properites of all columns in the schema type: array items: title: ResultSchemaColumn type: object properties: name: description: Name of the column type: string type: description: Type of the column $ref: '#/components/schemas/DataType' required: - name - type required: - columns ComputePool: title: ComputePool description: Represents the configuration of a Flink cluster type: object allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: metadata: title: ComputePoolMetadata description: Metadata about the ComputePool type: object properties: name: description: Name of the ComputePool type: string creationTimestamp: description: Timestamp when the ComputePool was created type: string uid: description: Unique identifier of the ComputePool type: string labels: description: Labels of the ComputePool type: object additionalProperties: type: string annotations: description: Annotations of the ComputePool type: object additionalProperties: type: string required: - name spec: title: ComputePoolSpec description: Spec for ComputePool type: object properties: type: description: Type of the ComputePool type: string clusterSpec: description: Cluster Spec type: object format: yamlorjson required: - type - clusterSpec status: title: ComputePoolStatus description: Status for ComputePool type: object properties: phase: description: Phase of the ComputePool type: string required: - phase required: # status is optional for ComputePool spec - metadata - spec ComputePoolsPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object properties: metadata: type: object title: ComputePoolPageMetadata properties: size: type: integer format: int64 default: 0 items: type: array items: $ref: '#/components/schemas/ComputePool' default: [ ] CatalogMetadata: title: CatalogMetadata description: Metadata about the Catalog type: object properties: name: description: Name of the Catalog type: string creationTimestamp: description: Timestamp when the Catalog was created type: string 
updateTimestamp: description: Timestamp when the Catalog was updated the last time type: string uid: description: Unique identifier of the Catalog type: string labels: description: Labels of the Catalog type: object additionalProperties: type: string annotations: description: Annotations of the Catalog type: object additionalProperties: type: string required: - name KafkaCatalog: title: KafkaCatalog description: Represents a the configuration of a Kafka Catalog type: object allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: metadata: $ref: '#/components/schemas/CatalogMetadata' spec: title: KafkaCatalogSpec description: Spec of a Kafka Catalog type: object properties: srInstance: description: Details about the SchemaRegistry instance of the Catalog type: object properties: connectionConfig: description: connection options for the SR client type: object additionalProperties: type: string connectionSecretId: description: an identifier to look up a Kubernetes secret that contains the connection credentials type: string required: - connectionConfig kafkaClusters: type: array items: type: object properties: databaseName: description: the database name under which the Kafka cluster is listed in the Catalog type: string connectionConfig: description: connection options for the Kafka client type: object additionalProperties: type: string connectionSecretId: description: an identifier to look up a Kubernetes secret that contains the connection credentials type: string required: - databaseName - connectionConfig required: - srInstance required: - metadata - spec KafkaCatalogsPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object properties: metadata: type: object title: CatalogPageMetadata properties: size: type: integer format: int64 default: 0 items: type: array items: $ref: '#/components/schemas/KafkaCatalog' default: [ ] DatabaseMetadata: title: DatabaseMetadata description: Metadata about the Database type: object properties: name: description: Name of the Database type: string creationTimestamp: description: Timestamp when the Database was created type: string updateTimestamp: description: Timestamp when the Database was updated the last time type: string uid: description: Unique identifier of the Database type: string labels: description: Labels of the Database type: object additionalProperties: type: string annotations: description: Annotations of the Database type: object additionalProperties: type: string required: - name KafkaDatabase: title: KafkaDatabase description: Represents a the configuration of a Kafka Database type: object allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: metadata: $ref: '#/components/schemas/DatabaseMetadata' spec: title: KafkaDatabaseSpec description: Spec of a Kafka Database type: object properties: kafkaCluster: description: Details about the Kafka cluster of the database type: object properties: connectionConfig: description: connection options for the Kafka client type: object additionalProperties: type: string connectionSecretId: description: an identifier to look up a secret that contains the connection credentials type: string required: - connectionConfig alterEnvironments: description: List of environments that have permission to alter the tables of this database type: array items: type: string required: - kafkaCluster required: - metadata - spec KafkaDatabasesPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object 
properties: metadata: type: object title: DatabasePageMetadata properties: size: type: integer format: int64 default: 0 items: type: array items: $ref: '#/components/schemas/KafkaDatabase' default: [ ] Savepoint: title: Savepoint description: Represents a Savepoint for a Flink Application or Statement type: object allOf: - $ref: '#/components/schemas/ResourceBaseV2' - type: object properties: metadata: title: SavepointMetadata description: Metadata about the Savepoint type: object x-class-extra-annotation: '@com.fasterxml.jackson.annotation.JsonInclude(com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL)' properties: name: description: Name of the Savepoint type: string creationTimestamp: description: Timestamp when the Savepoint was created type: string uid: description: Unique identifier of the Savepoint type: string labels: description: Labels of the Savepoint type: object additionalProperties: type: string annotations: description: Annotations of the Savepoint type: object additionalProperties: type: string spec: title: SavepointSpec description: Spec for Savepoint type: object x-class-extra-annotation: '@com.fasterxml.jackson.annotation.JsonInclude(com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL)' properties: path: description: Path of the Savepoint type: string backoffLimit: description: Backoff limit for the Savepoint type: integer format: int32 default: -1 formatType: description: Format type of the Savepoint type: string enum: - CANONICAL - NATIVE - UNKNOWN default: CANONICAL status: title: SavepointStatus description: Status for Savepoint type: object x-class-extra-annotation: '@com.fasterxml.jackson.annotation.JsonInclude(com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL)' properties: state: description: State of the Savepoint type: string path: description: Path of the Savepoint type: string triggerTimestamp: description: Timestamp when the Savepoint was triggered type: string resultTimestamp: description: Timestamp when the Savepoint result was received type: string failures: description: The number of failures of the Savepoint type: integer format: int32 error: description: The error message for the Savepoint type: string pendingDeletion: description: Whether the Savepoint is pending deletion type: boolean required: - metadata - spec SavepointsPage: type: object allOf: - $ref: '#/components/schemas/PaginationResponse' - type: object properties: metadata: type: object title: SavepointPageMetadata properties: size: type: integer format: int64 default: 0 items: type: array items: $ref: '#/components/schemas/Savepoint' default: [ ] StatementStartFromSavepoint: title: StartStatementFromSavepoint description: Configuration for resuming a Statement from a savepoint. Works only with update Statement. type: object properties: savepointName: description: The name of the Savepoint resource to start Statement from. The request will be rejected if savepoint that has not completed is referenced. type: string uid: description: The uuid of the Savepoint resource to start from. type: string initialSavepointPath: description: The path of the savepoint to start the Statement from. This could be an external path too. type: string allowNonRestoredState: description: A boolean flag to allow the job to start even if some state could not be restored. type: boolean savepointRedeployNonce: description: Nonce used to trigger a full redeployment of the job from the savepoint. In order to trigger redeployment, change the number to a different non-null value. 
Rollback is not possible after redeployment. type: integer format: int64 ```

# Configure Access Control for Confluent Manager for Apache Flink

Confluent Manager for Apache Flink® models its access control around seven resource types that different types of users can access. For a general description of role-based access control (RBAC), see [Use Role-Based Access Control (RBAC) for Authorization in Confluent Platform](../../security/authorization/rbac/overview.md#rbac-overview).

The following resources are available in Confluent Manager for Apache Flink® for Flink SQL:

* **Flink application**: Defines a Flink application, which starts the Flink cluster in Application mode. Depending on their assigned role, developers have access to their Flink environment to create, update, and view Flink applications.
* **Flink environment**: Defines where and how applications are deployed, such as the Kubernetes namespace and central configurations that cannot be overridden. You can use Flink environments to separate the privileges of different teams or organizations. System administrators are responsible for managing Flink environments and provisioning them correctly.
* **Flink statement**: The resource CMF uses to execute and maintain SQL queries.
* **Flink secret**: Manages confidential data that Flink statements can use. Currently, secrets can hold Kafka connection configuration or Schema Registry configuration.
* **Flink catalog**: Provides Kafka topics as tables with schemas derived from Schema Registry.
* **Flink compute pool**: In CMF, the compute resources that are used to execute a SQL statement.
* **Flink detached savepoint**: Standalone Flink savepoint resources that are not tied to a specific running job.

# Stream Processing with Confluent Platform for Apache Flink

Confluent Platform for Apache Flink® brings support for Apache Flink® to Confluent Platform. Apache Flink applications are composed of streaming dataflows that are transformed by one or more user-defined operators. These dataflows form directed acyclic graphs that start with one or more sources and end in one or more sinks. Sources and sinks can be Apache Kafka® topics, which means that Flink integrates naturally with Confluent Platform. To learn more about Confluent Platform for Apache Flink connector support, see [Connectors](jobs/applications/supported-features.md#af-cp-connectors).

Confluent Platform for Apache Flink is fully compatible with Apache Flink. However, not all Apache Flink features are supported in Confluent Platform for Apache Flink. To learn more about which features are supported, see [Confluent Platform for Apache Flink Features and Support](jobs/applications/supported-features.md#cpflink-vs-oss).

Flink applications are deployed in Kubernetes with Confluent Manager for Apache Flink, a central management component that enables users to securely manage a fleet of Flink applications across multiple environments.
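To make the resource model above concrete, the following is a minimal, illustrative sketch of a compute pool document shaped after the `ComputePool` schema in the OpenAPI excerpt above, where `metadata.name`, `spec.type`, and `spec.clusterSpec` are required. Every value is a placeholder, fields inherited from `ResourceBaseV2` are omitted because that schema is not reproduced here, and how you would submit such a document (REST API, CLI, or Kubernetes manifest) is outside the scope of this excerpt.

```python
import json

# Illustrative ComputePool document following the ComputePool schema shown in
# the OpenAPI excerpt above: metadata.name is required, and spec requires both
# "type" and "clusterSpec". Every value below is a placeholder.
compute_pool = {
    "metadata": {
        "name": "example-pool",            # required
        "labels": {"team": "streaming"},   # optional string-to-string labels
    },
    "spec": {
        # The excerpt does not list valid values for "type"; this is a placeholder.
        "type": "example-type",
        # clusterSpec is a free-form object (YAML or JSON) per the schema.
        "clusterSpec": {"taskManagerReplicas": 2},
    },
}

print(json.dumps(compute_pool, indent=2))
```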
See the following topics to learn more and get started:

- [Install and Configure](installation/overview.md#cpf-install)
- [Get Started](get-started/get-started-application.md#cpf-get-started)
- [Supported Features](jobs/applications/supported-features.md#cpflink-vs-oss)
- [Flink Concepts](concepts/flink.md#cp-flink-concepts)
- [Disaster Recovery](disaster-recovery.md#backup-restore-cmf)
- [Confluent Manager for Apache Flink](concepts/cmf.md#cmf)
- [Get Help](get-help.md#cpf-get-help)

## Confluent Cloud

Confluent Cloud provides Kafka as a cloud service, so you no longer need to install, upgrade, or patch Kafka server components. You also get access to a [cloud-native design](../_glossary.md#term-Kora), which offers Infinite Storage, elastic scaling, and an uptime guarantee. If you're coming to Confluent Cloud from open source Kafka, you can use data-streaming features only available from Confluent, including non-Java client libraries and proxies for Kafka producers and consumers, tools for monitoring and observability, an intuitive browser-based user interface, and enterprise-grade security and data governance features.

Confluent Cloud includes different types of server processes for streaming data in a production environment. In addition to brokers and topics, Confluent Cloud provides implementations of Kafka Connect, Schema Registry, and ksqlDB.

## Related content

- To download an automated version of this quick start, see [the Quick Start on GitHub](https://github.com/confluentinc/examples/tree/latest/cp-quickstart/README.md)
- To configure and run a multi-broker cluster without Docker, see [Tutorial: Set Up a Multi-Broker Kafka Cluster](tutorial-multi-broker.md#basics-multi-broker-setup)
- To learn how to develop with Confluent Platform, see [Confluent Developer](https://developer.confluent.io/learn-kafka)
- For training and certification guidance, including resources and access to hands-on training and certification exams, see [Confluent Education](https://www.confluent.io/training)
- To try out basic Kafka, Kafka Streams, and ksqlDB tutorials with step-by-step instructions, see [Kafka Tutorials](https://developer.confluent.io/tutorials/)
- To learn how to build stream processing applications in Java or Scala, see [Kafka Streams documentation](../streams/overview.md#kafka-streams)
- To learn how to read and write data to and from Kafka using programming languages such as Go, Python, .NET, and C/C++, see [Kafka Clients documentation](../clients/overview.md#kafka-clients)

## Run multiple clusters

Another option to experiment with is a multi-cluster deployment. This is relevant for trying out features like Replicator, Cluster Linking, and multi-cluster Schema Registry, where you want to share or replicate topic data across two clusters, often modeled as a source (origin) cluster and a destination cluster. These configurations are commonly used to share data across data centers and regions.

An example configuration for [cluster linking](../multi-dc-deployments/cluster-linking/index.md#cluster-linking) is shown in the diagram below. (A full guide to this setup is available in the [Tutorial: Share Data Across Topics Using Cluster Linking for Confluent Platform](../multi-dc-deployments/cluster-linking/topic-data-sharing.md#tutorial-topic-data-sharing).)

![image](images/kafka-basics-multi-cluster.png)

Multi-cluster configurations are described in context under the relevant use cases.
Since these configurations will vary depending on what you want to accomplish, the best way to test out multi-cluster is to choose a use case and follow the feature-specific tutorial.

- [Tutorial: Share Data Across Topics Using Cluster Linking for Confluent Platform](../multi-dc-deployments/cluster-linking/topic-data-sharing.md#tutorial-topic-data-sharing) (requires Confluent Platform 6.0.0 or newer; recommended as the best getting-started example)
- [Tutorial: Replicate Data Across Kafka Clusters in Confluent Platform](../multi-dc-deployments/replicator/replicator-quickstart.md#replicator-quickstart)
- [Enabling Multi-Cluster Schema Registry](../schema-registry/schema.md#multi-cluster-sr)

### Control Center

1. Open Control Center in a browser. The default URL is [http://localhost:9021/](http://localhost:9021/).
2. On the **Home** page, click your cluster.
3. In the navigation menu, click **Health+** to open the overview page.
4. Click **Get started** to set up Health+ for your cluster.
5. In the **Enable your cluster to communicate with Health+** section, enter your API key and secret.

   ![Enable Health+ page in Confluent Control Center](images/c3-health-plus-api-key-secret.png)

   - If you used the Confluent CLI to generate the key and secret, enter them in the **confluent.telemetry.api.key** and **confluent.telemetry.api.secret** text boxes.
   - If you used Confluent Cloud Console to generate the key and secret, click **Upload key and secret** and navigate to the file that you downloaded previously.
6. Click **Continue**.
7. (Optional) Add additional Confluent Platform services, such as ksqlDB or Connect.

   - For any Confluent Platform components other than Confluent Server, enable Telemetry Reporting by adding the following lines to the corresponding configuration file for the service (a scripted sketch of this edit follows these steps). The default location for a component's configuration file is `$CONFLUENT_HOME/etc//.properties`.

     ```properties
     metric.reporters=io.confluent.telemetry.reporter.TelemetryReporter
     confluent.telemetry.enabled=true
     confluent.telemetry.api.key=
     confluent.telemetry.api.secret=
     ```

     #### NOTE

     Confluent Server doesn't require the `metric.reporters` setting, but all other Confluent Platform components do require it.

   - Save the file and restart the service to deploy the new configuration. Use the `confluent local services stop` and `confluent local services start` commands to restart the service.
8. Navigate to the [Health+](https://confluent.cloud/health-plus) page in Confluent Cloud Console to verify that your data is being received. The tile for your Confluent Platform cluster should show **Running**.
9. Click **Finish**.
10. Click the tile for your cluster to see your telemetry data on the [Monitor Using Health+ Dashboard for Confluent Platform](health-plus-monitoring-dashboard.md#health-plus-monitoring-dashboard).
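If you need to apply the step 7 telemetry settings to several component configuration files, the edit can be scripted. The following is a minimal sketch, not an official tool: the file paths and the `<YOUR_API_KEY>`/`<YOUR_API_SECRET>` values are placeholders you must replace, and the script only appends the properties shown above; it does not restart any services.

```python
from pathlib import Path

# Hypothetical paths: replace with the properties files of the components
# (other than Confluent Server) that should report telemetry.
CONFIG_FILES = [
    Path("/path/to/ksqldb-server.properties"),
    Path("/path/to/connect-distributed.properties"),
]

# Telemetry settings from step 7; the key and secret are placeholders.
TELEMETRY_SETTINGS = {
    "metric.reporters": "io.confluent.telemetry.reporter.TelemetryReporter",
    "confluent.telemetry.enabled": "true",
    "confluent.telemetry.api.key": "<YOUR_API_KEY>",
    "confluent.telemetry.api.secret": "<YOUR_API_SECRET>",
}

for config in CONFIG_FILES:
    existing = config.read_text()
    with config.open("a") as f:
        for key, value in TELEMETRY_SETTINGS.items():
            # Append only settings that are not already present, to avoid
            # writing duplicate entries into the properties file.
            if f"{key}=" not in existing:
                f.write(f"\n{key}={value}")
```

After updating each file, restart the corresponding service as described in step 7, for example with `confluent local services stop` and `confluent local services start`.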
#### Enterprise license for Confluent Platform subscription The following Confluent Platform components are under the Confluent Enterprise license for Confluent Platform subscription: * [Confluent Server](available_packages.md#confluent-server-package) The following are a few key features included in Confluent Server: * [Cluster Linking](../multi-dc-deployments/cluster-linking/index.md#cluster-linking) * [Multi-Region Clusters](../multi-dc-deployments/multi-region.md#bmrr) * [Role-based Access Control (RBAC)](../security/authorization/rbac/overview.md#rbac-overview) * [Structured Audit Logs](../security/compliance/audit-logs/audit-logs-concepts.md#audit-logs-concepts) * [Schema Validation](../schema-registry/schema-validation.md#schema-validation) * [Schema Registry Security Plugin for Confluent Platform](../confluent-security-plugins/schema-registry/introduction.md#confluentsecurityplugins-schema-registry-security-plugin) * [Secrets Protection](../security/compliance/secrets/overview.md#secrets) * [Self-Balancing Clusters](../clusters/sbc/index.md#sbc) * [Tiered Storage](../clusters/tiered-storage.md#tiered-storage) * [Schema Linking](../schema-registry/schema-linking-cp.md#schema-linking-cp-overview) * [Data Contracts](/platform/current/schema-registry/fundamentals/data-contracts.html) * Pre-built Connectors In [Confluent Hub](https://www.confluent.io/hub), filter by the Premium and Commercial license types to see the Connectors under the Confluent Enterprise license. * [Control Center for Confluent Platform](https://docs.confluent.io/control-center/current/overview.html) * [Confluent for Kubernetes](https://docs.confluent.io/operator/current/overview.html) * [Confluent Replicator](../multi-dc-deployments/replicator/index.md#replicator-detail) * [MQTT Proxy](../kafka-mqtt/index.md#mqtt-proxy) * [Confluent Platform for Apache Flink](../flink/overview.md#cp-flink-overview) # Upgrade Confluent Platform * [Overview](upgrade-checklist.md) * [Step 0: Prepare for the upgrade](upgrade-checklist.md#step-0-prepare-for-the-upgrade) * [Step 1: Upgrade Kafka controllers and brokers](upgrade-checklist.md#step-1-upgrade-ak-controllers-and-brokers) * [Step 2: Upgrade Confluent Platform components](upgrade-checklist.md#step-2-upgrade-cp-components) * [Step 3: Update configuration files](upgrade-checklist.md#step-3-update-configuration-files) * [Connect Log Redactor configuration](upgrade-checklist.md#connect-log-redactor-configuration) * [Confluent license](upgrade-checklist.md#confluent-license) * [Replication factor for Self-Balancing Clusters](upgrade-checklist.md#replication-factor-for-sbc-long) * [Step 4: Enable Health+](upgrade-checklist.md#step-4-enable-health) * [Step 5: Rebuild applications](upgrade-checklist.md#step-5-rebuild-applications) * [Other Considerations](upgrade-checklist.md#other-considerations) * [Related content](upgrade-checklist.md#related-content) * [Upgrade the Operating System](upgrade-os.md) * [Upgrade order](upgrade-os.md#upgrade-order) * [Pre-upgrade checks](upgrade-os.md#pre-upgrade-checks) * [Post-upgrade checks](upgrade-os.md#post-upgrade-checks) * [Related content](upgrade-os.md#related-content) * [Confluent Platform Upgrade Procedure](upgrade.md) * [Upgrade prerequisite: client protocol deprecation](upgrade.md#upgrade-prerequisite-client-protocol-deprecation) * [Preparation](upgrade.md#preparation) * [Upgrade order](upgrade.md#upgrade-order) * [Stage 1: Preparation (If you are on Confluent Platform 7.7 or 
earlier)](upgrade.md#stage-1-preparation-if-you-are-on-cp-7-7-or-earlier) * [Stage 2: Upgrade to Confluent Platform 8.1](upgrade.md#stage-2-upgrade-to-cp-version) * [Upgrade Kafka](upgrade.md#upgrade-ak) * [Steps to upgrade for any fix pack release](upgrade.md#steps-to-upgrade-for-any-fix-pack-release) * [Steps for upgrading to 8.1.x](upgrade.md#steps-for-upgrading-to-version-x) * [Confluent license](upgrade.md#confluent-license) * [Advertised listeners](upgrade.md#advertised-listeners) * [Security](upgrade.md#security) * [Replication factor for Self-Balancing Clusters](upgrade.md#replication-factor-for-sbc-long) * [Upgrade DEB packages using APT](upgrade.md#upgrade-deb-packages-using-apt) * [Install a specific version on Debian or Ubuntu](upgrade.md#install-a-specific-version-on-debian-or-ubuntu) * [Method 1: Install the latest version (Recommended)](upgrade.md#method-1-install-the-latest-version-recommended) * [Method 2: Force a specific older version with pinning](upgrade.md#method-2-force-a-specific-older-version-with-pinning) * [Upgrade RPM packages by using YUM](upgrade.md#upgrade-rpm-packages-by-using-yum) * [Upgrade using TAR or ZIP archives](upgrade.md#upgrade-using-tar-or-zip-archives) * [Upgrade Confluent Control Center](upgrade.md#upgrade-c3) * [Upgrade Schema Registry](upgrade.md#upgrade-sr) * [Upgrade Confluent REST Proxy](upgrade.md#upgrade-crest-long) * [Upgrade Kafka Streams applications](upgrade.md#upgrade-kstreams-applications) * [Upgrade Kafka Connect](upgrade.md#upgrade-kconnect-long) * [Upgrade Kafka Connect standalone mode](upgrade.md#upgrade-kconnect-long-standalone-mode) * [Upgrade Kafka Connect distributed mode](upgrade.md#upgrade-kconnect-long-distributed-mode) * [Upgrade ksqlDB](upgrade.md#upgrade-ksqldb) * [Upgrade other client applications](upgrade.md#upgrade-other-client-applications) * [Related content](upgrade.md#related-content) ## Next steps **RBAC:** * [Role-Based Access Control for Confluent Platform Quick Start](../../security/authorization/rbac/rbac-cli-quickstart.md#rbac-cli-quickstart) * [Configure RBAC for Control Center on Confluent Platform](/control-center/current/security/c3-rbac.html) * [Deploy Secure ksqlDB with RBAC in Confluent Platform](../../security/authorization/rbac/ksql-rbac.md#ksql-rbac) * [Configure Role-Based Access Control for Schema Registry in Confluent Platform](../../schema-registry/security/rbac-schema-registry.md#schemaregistry-rbac) * [Kafka Connect and RBAC](../../connect/rbac-index.md#connect-rbac-index) * [Role-Based Access Control (RBAC)](../../kafka-rest/production-deployment/rest-proxy/security.md#rbac-rest-proxy-security) **Centralized ACLs:** * [Use Centralized ACLs with MDS for Authorization in Confluent Platform](../../security/authorization/rbac/authorization-acl-with-mds.md#authorization-acl-with-mds) **Centralized audit logs:** * [Configure Audit Logs in Confluent Platform Using Confluent CLI](../../security/compliance/audit-logs/audit-logs-cli-config.md#audit-log-cli-config) **Cluster registry:** * [Cluster Registry in Confluent Platform](../../security/cluster-registry.md#cluster-registry) #### NOTE Most configuration attributes show example values in `<>`, which can be helpful in terms of understanding the type of value expected. Users are expected to replace the example with values matching their own setup. Values displayed without `<>` can be used as recommended values. 
```RST ############################# Broker Settings ################################## zookeeper.connect=:2181,:2181,:2181 log.dirs=/var/lib/kafka/data broker.id=1 ############################# Log Retention Policy, Log Basics ################## log.retention.check.interval.ms=300000 log.retention.hours=168 log.segment.bytes=1073741824 num.io.threads=16 num.network.threads=8 num.partitions=1 num.recovery.threads.per.data.dir=2 ########################### Socket Server Settings ############################# socket.receive.buffer.bytes=102400 socket.request.max.bytes=104857600 socket.send.buffer.bytes=102400 ############################# Internal Topic Settings ######################### offsets.topic.replication.factor=3 transaction.state.log.min.isr=2 transaction.state.log.replication.factor=3 ######################## Metrics Reporting ######################################## metric.reporters=io.confluent.metrics.reporter.ConfluentMetricsReporter confluent.metrics.reporter.bootstrap.servers=
-west-2.compute.internal:9092 confluent.metrics.reporter.topic.replicas=3 confluent.support.customer.id=anonymous ######################## LISTENERS ###################################### listeners=INTERNAL://:9092,EXTERNAL://:9093,TOKEN://:9094 advertised.listeners=INTERNAL://:9092,\ EXTERNAL://:9093,\ TOKEN://:9094 listener.security.protocol.map=INTERNAL:SSL,EXTERNAL:SSL,TOKEN:SASL_SSL inter.broker.listener.name=INTERNAL ############################ TLS/SSL SETTINGS ##################################### ssl.truststore.location=/var/ssl/private/client.truststore.jks ssl.truststore.password= ssl.keystore.location=/var/ssl/private/kafka.keystore.jks ssl.keystore.password= ssl.key.password= ssl.client.auth=required ssl.endpoint.identification.algorithm=HTTPS ############## TLS/SSL settings for metrics reporting ############## confluent.metrics.reporter.security.protocol=SSL confluent.metrics.reporter.ssl.truststore.location=/var/ssl/private/client.truststore.jks confluent.metrics.reporter.ssl.truststore.password= confluent.metrics.reporter.ssl.keystore.location=/var/ssl/private/kafka.keystore.jks confluent.metrics.reporter.ssl.keystore.password= confluent.metrics.reporter.ssl.key.password= ############################# TLS/SSL LISTENERS ############################# listener.name.internal.ssl.principal.mapping.rules= \ RULE:^CN=([a-zA-Z0-9.]*).*$/$1/L ,\ DEFAULT listener.name.external.ssl.principal.mapping.rules= \ RULE:^CN=([a-zA-Z0-9.]*).*$/$1/L ,\ DEFAULT ############################# TOKEN LISTENER ############################# listener.name.token.sasl.enabled.mechanisms=OAUTHBEARER listener.name.token.oauthbearer.sasl.jaas.config= \ org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \ publicKeyPath=""; listener.name.token.oauthbearer.sasl.server.callback.handler.class=io.confluent.kafka.server.plugins.auth.token.TokenBearerValidatorCallbackHandler listener.name.token.oauthbearer.sasl.login.callback.handler.class=io.confluent.kafka.server.plugins.auth.token.TokenBearerServerLoginCallbackHandler ############################# Authorization Settings ############################# authorizer.class.name=io.confluent.kafka.security.authorizer.ConfluentServerAuthorizer confluent.authorizer.access.rule.providers=ZK_ACL,CONFLUENT super.users=User:kafka ############################# MDS Listener - which port to listen on ############################# confluent.metadata.server.listeners=https://0.0.0.0:8090,http://0.0.0.0:8091 confluent.metadata.server.advertised.listeners=https://:8090,\ http://:8091 ############################# TLS/SSL Settings for MDS ############################# confluent.metadata.server.ssl.keystore.location= confluent.metadata.server.ssl.keystore.password= confluent.metadata.server.ssl.key.password= confluent.metadata.server.ssl.truststore.location= confluent.metadata.server.ssl.truststore.password= ############################# MDS Token Service Settings - enable token generation ############################# confluent.metadata.server.token.max.lifetime.ms=3600000 confluent.metadata.server.token.key.path= confluent.metadata.server.token.signature.algorithm=RS256 confluent.metadata.server.authentication.method=BEARER ############################# Identity Provider Settings(LDAP - local OpenLDAP) ############################# ldap.java.naming.factory.initial=com.sun.jndi.ldap.LdapCtxFactory ldap.com.sun.jndi.ldap.read.timeout=3000 ldap.java.naming.provider.url=ldap: # how mds authenticates to ldap server ldap.java.naming.security.principal= 
ldap.java.naming.security.credentials= ldap.java.naming.security.authentication=simple # ldap search mode (GROUPS is default) #ldap.search.mode=GROUPS #ldap.search.mode=USERS # how to search for users ldap.user.search.base= # how to search for groups ldap.group.search.base= # which attribute in ldap record corresponds to user name ldap.user.name.attribute=sAMAccountName ldap.user.memberof.attribute.pattern= ldap.group.object.class=group ldap.group.name.attribute=sAMAccountName ldap.group.member.attribute.pattern= ########################### Enable Swagger ############################# confluent.metadata.server.openapi.enable=true ``` ### GET /clusters **List Clusters** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) ‘Return a list of known Kafka clusters. Currently both Kafka and Kafka REST Proxy are only aware of the Kafka cluster pointed at by the `bootstrap.servers` configuration. Therefore only one Kafka cluster will be returned in the response.’ **Example request:** ```http GET /clusters HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of Kafka clusters. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaClusterList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters", "next": null }, "data": [ { "kind": "KafkaCluster", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1", "resource_name": "crn:///kafka=cluster-1" }, "cluster_id": "cluster-1", "controller": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1" }, "acls": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/acls" }, "brokers": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers" }, "broker_configs": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/broker-configs" }, "consumer_groups": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups" }, "topics": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics" }, "partition_reassignments": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/-/partitions/-/reassignment" } } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. 
**kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/broker-configs **List Dynamic Broker Configs** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return a list of dynamic cluster-wide broker configuration parameters for the specified Kafka cluster. Returns an empty list if there are no dynamic cluster-wide broker configuration parameters. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. **Example request:** ```http GET /clusters/{cluster_id}/broker-configs HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of cluster configs. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaClusterConfigList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/broker-configs", "next": null }, "data": [ { "kind": "KafkaClusterConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/broker-configs/max.connections", "resource_name": "crn:///kafka=cluster-1/broker-config=max.connections" }, "cluster_id": "cluster-1", "config_type": "BROKER", "name": "max.connections", "value": "1000", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_DEFAULT_BROKER_CONFIG", "synonyms": [ { "name": "max.connections", "value": "1000", "source": "DYNAMIC_DEFAULT_BROKER_CONFIG" }, { "name": "max.connections", "value": "2147483647", "source": "DEFAULT_CONFIG" } ] }, { "kind": "KafkaClusterConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/broker-configs/compression.type", "resource_name": "crn:///kafka=cluster-1/broker-config=compression.type" }, "cluster_id": "cluster-1", "config_type": "BROKER", "name": "compression.type", "value": "gzip", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_DEFAULT_BROKER_CONFIG", "synonyms": [ { "name": "compression.type", "value": "gzip", "source": "DYNAMIC_DEFAULT_BROKER_CONFIG" }, { "name": "compression.type", "value": "producer", "source": "DEFAULT_CONFIG" } ] } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." 
} ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### POST /clusters/{cluster_id}/broker-configs:alter **Batch Alter Dynamic Broker Configs** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Update or delete a set of dynamic cluster-wide broker configuration parameters. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. **Example request:** ```http POST /clusters/{cluster_id}/broker-configs:alter HTTP/1.1 Host: example.com Content-Type: application/json { "data": [ { "name": "max.connections", "operation": "DELETE" }, { "name": "compression.type", "value": "gzip" } ] } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – No Content * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/broker-configs/{name} **Get Dynamic Broker Config** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the dynamic cluster-wide broker configuration parameter specified by `name`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **name** (*string*) – The configuration parameter name. **Example request:** ```http GET /clusters/{cluster_id}/broker-configs/{name} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The cluster configuration parameter. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaClusterConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/broker-configs/compression.type", "resource_name": "crn:///kafka=cluster-1/broker-config=compression.type" }, "cluster_id": "cluster-1", "config_type": "BROKER", "name": "compression.type", "value": "gzip", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_DEFAULT_BROKER_CONFIG", "synonyms": [ { "name": "compression.type", "value": "gzip", "source": "DYNAMIC_DEFAULT_BROKER_CONFIG" }, { "name": "compression.type", "value": "producer", "source": "DEFAULT_CONFIG" } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. 
**Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/brokers/-/configs **List Dynamic Broker Configs** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the list of dynamic configuration parameters for all the brokers in the given Kafka cluster. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. **Example request:** ```http GET /clusters/{cluster_id}/brokers/-/configs HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of broker configs. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaBrokerConfigList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1/configs", "next": null }, "data": [ { "kind": "KafkaBrokerConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1/configs/max.connections", "resource_name": "crn:///kafka=cluster-1/broker=1/config=max.connections" }, "cluster_id": "cluster-1", "broker_id": 1, "name": "max.connections", "value": "1000", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_BROKER_CONFIG", "synonyms": [ { "name": "max.connections", "value": "1000", "source": "DYNAMIC_BROKER_CONFIG" }, { "name": "max.connections", "value": "2147483647", "source": "DEFAULT_CONFIG" } ] }, { "kind": "KafkaBrokerConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1/configs/compression.type", "resource_name": "crn:///kafka=cluster-1/broker=1/config=compression.type" }, "cluster_id": "cluster-1", "broker_id": 1, "name": "compression.type", "value": "gzip", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_BROKER_CONFIG", "synonyms": [ { "name": "compression.type", "value": "gzip", "source": "DYNAMIC_BROKER_CONFIG" }, { "name": "compression.type", "value": "producer", "source": "DEFAULT_CONFIG" } ] } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. 
**kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/brokers/{broker_id}/configs **List Broker Configs** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the list of configuration parameters that belong to the specified Kafka broker. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **broker_id** (*integer*) – The Kafka broker ID. **Example request:** ```http GET /clusters/{cluster_id}/brokers/{broker_id}/configs HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of broker configs. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaBrokerConfigList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1/configs", "next": null }, "data": [ { "kind": "KafkaBrokerConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1/configs/max.connections", "resource_name": "crn:///kafka=cluster-1/broker=1/config=max.connections" }, "cluster_id": "cluster-1", "broker_id": 1, "name": "max.connections", "value": "1000", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_BROKER_CONFIG", "synonyms": [ { "name": "max.connections", "value": "1000", "source": "DYNAMIC_BROKER_CONFIG" }, { "name": "max.connections", "value": "2147483647", "source": "DEFAULT_CONFIG" } ] }, { "kind": "KafkaBrokerConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1/configs/compression.type", "resource_name": "crn:///kafka=cluster-1/broker=1/config=compression.type" }, "cluster_id": "cluster-1", "broker_id": 1, "name": "compression.type", "value": "gzip", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_BROKER_CONFIG", "synonyms": [ { "name": "compression.type", "value": "gzip", "source": "DYNAMIC_BROKER_CONFIG" }, { "name": "compression.type", "value": "producer", "source": "DEFAULT_CONFIG" } ] } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. 
Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### POST /clusters/{cluster_id}/brokers/{broker_id}/configs:alter **Batch Alter Broker Configs** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Update or delete a set of broker configuration parameters. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **broker_id** (*integer*) – The Kafka broker ID. **Example request:** ```http POST /clusters/{cluster_id}/brokers/{broker_id}/configs:alter HTTP/1.1 Host: example.com Content-Type: application/json { "data": [ { "name": "max.connections", "operation": "DELETE" }, { "name": "compression.type", "value": "gzip" } ] } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – No Content * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/brokers/{broker_id}/configs/{name} **Get Broker Config** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the configuration parameter specified by `name`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **broker_id** (*integer*) – The Kafka broker ID. * **name** (*string*) – The configuration parameter name. **Example request:** ```http GET /clusters/{cluster_id}/brokers/{broker_id}/configs/{name} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The broker configuration parameter. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaBrokerConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1/configs/compression.type", "resource_name": "crn:///kafka=cluster-1/broker=1/config=compression.type" }, "cluster_id": "cluster-1", "broker_id": 1, "name": "compression.type", "value": "gzip", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_BROKER_CONFIG", "synonyms": [ { "name": "compression.type", "value": "gzip", "source": "DYNAMIC_BROKER_CONFIG" }, { "name": "compression.type", "value": "producer", "source": "DEFAULT_CONFIG" } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. 
**Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/topics/{topic_name}/configs **List Topic Configs** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the list of configuration parameters that belong to the specified topic. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **topic_name** (*string*) – The topic name. **Example request:** ```http GET /clusters/{cluster_id}/topics/{topic_name}/configs HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of cluster configs. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaTopicConfigList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/configs", "next": null }, "data": [ { "kind": "KafkaTopicConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/configs/cleanup.policy", "resource_name": "crn:///kafka=cluster-1/topic=topic-1/config=cleanup.policy" }, "cluster_id": "cluster-1", "topic_name": "topic-1", "name": "cleanup.policy", "value": "compact", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_TOPIC_CONFIG", "synonyms": [ { "name": "cleanup.policy", "value": "compact", "source": "DYNAMIC_TOPIC_CONFIG" }, { "name": "cleanup.policy", "value": "delete", "source": "DEFAULT_CONFIG" } ] }, { "kind": "KafkaTopicConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/configs/compression.type", "resource_name": "crn:///kafka=cluster-1/topic=topic-1/config=compression.type" }, "cluster_id": "cluster-1", "topic_name": "topic-1", "name": "compression.type", "value": "gzip", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_TOPIC_CONFIG", "synonyms": [ { "name": "compression.type", "value": "gzip", "source": "DYNAMIC_TOPIC_CONFIG" }, { "name": "compression.type", "value": "producer", "source": "DEFAULT_CONFIG" } ] } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. 
Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – Indicates attempted access to an unreachable or non-existing resource like e.g. an unknown topic or partition. GET requests to endpoints not allowed in the accesslists will also result in this response. **endpoint_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "HTTP 404 Not Found" } ``` **cluster_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "Cluster my-cluster cannot be found." } ``` **unknown_topic_or_partition:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 40403, "message": "This server does not host this topic-partition." } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests. URI: /v3/clusters/my-cluster, STATUS: 429, MESSAGE: Too Many Requests, SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/topics/{topic_name}/configs/{name} **Get Topic Config** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the configuration parameter with the given `name`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **topic_name** (*string*) – The topic name. * **name** (*string*) – The configuration parameter name. **Example request:** ```http GET /clusters/{cluster_id}/topics/{topic_name}/configs/{name} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The topic configuration parameter. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaTopicConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/compression.type", "resource_name": "crn:///kafka=cluster-1/topic=topic-1/config=compression.type" }, "cluster_id": "cluster-1", "topic_name": "topic-1", "name": "compression.type", "value": "gzip", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_TOPIC_CONFIG", "synonyms": [ { "name": "compression.type", "value": "gzip", "source": "DYNAMIC_TOPIC_CONFIG" }, { "name": "compression.type", "value": "producer", "source": "DEFAULT_CONFIG" } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – Indicates attempted access to an unreachable or non-existing resource like e.g. 
an unknown topic or partition. GET requests to endpoints not allowed in the accesslists will also result in this response. **endpoint_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "HTTP 404 Not Found" } ``` **cluster_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "Cluster my-cluster cannot be found." } ``` **unknown_topic_or_partition:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 40403, "message": "This server does not host this topic-partition." } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests. URI: /v3/clusters/my-cluster, STATUS: 429, MESSAGE: Too Many Requests, SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### PUT /clusters/{cluster_id}/topics/{topic_name}/configs/{name} **Update Topic Config** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Update the configuration parameter with given `name`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **topic_name** (*string*) – The topic name. * **name** (*string*) – The configuration parameter name. **Example request:** ```http PUT /clusters/{cluster_id}/topics/{topic_name}/configs/{name} HTTP/1.1 Host: example.com Content-Type: application/json { "value": "gzip" } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – No Content * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – Indicates attempted access to an unreachable or non-existing resource like e.g. an unknown topic or partition. GET requests to endpoints not allowed in the accesslists will also result in this response. **endpoint_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "HTTP 404 Not Found" } ``` **cluster_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "Cluster my-cluster cannot be found." } ``` **unknown_topic_or_partition:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 40403, "message": "This server does not host this topic-partition." 
} ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests. URI: /v3/clusters/my-cluster, STATUS: 429, MESSAGE: Too Many Requests, SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### DELETE /clusters/{cluster_id}/topics/{topic_name}/configs/{name} **Reset Topic Config** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Reset the configuration parameter with given `name` to its default value. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **topic_name** (*string*) – The topic name. * **name** (*string*) – The configuration parameter name. * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – No Content * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – Indicates attempted access to an unreachable or non-existing resource like e.g. an unknown topic or partition. GET requests to endpoints not allowed in the accesslists will also result in this response. **endpoint_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "HTTP 404 Not Found" } ``` **cluster_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "Cluster my-cluster cannot be found." } ``` **unknown_topic_or_partition:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 40403, "message": "This server does not host this topic-partition." } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. 
**Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests. URI: /v3/clusters/my-cluster, STATUS: 429, MESSAGE: Too Many Requests, SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/topics/{topic_name}/default-configs **List New Topic Default Configs** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) List the default configuration parameters used if the topic were to be newly created. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **topic_name** (*string*) – The topic name. **Example request:** ```http GET /clusters/{cluster_id}/topics/{topic_name}/default-configs HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of cluster configs. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaTopicConfigList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/configs", "next": null }, "data": [ { "kind": "KafkaTopicConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/configs/cleanup.policy", "resource_name": "crn:///kafka=cluster-1/topic=topic-1/config=cleanup.policy" }, "cluster_id": "cluster-1", "topic_name": "topic-1", "name": "cleanup.policy", "value": "compact", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_TOPIC_CONFIG", "synonyms": [ { "name": "cleanup.policy", "value": "compact", "source": "DYNAMIC_TOPIC_CONFIG" }, { "name": "cleanup.policy", "value": "delete", "source": "DEFAULT_CONFIG" } ] }, { "kind": "KafkaTopicConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/configs/compression.type", "resource_name": "crn:///kafka=cluster-1/topic=topic-1/config=compression.type" }, "cluster_id": "cluster-1", "topic_name": "topic-1", "name": "compression.type", "value": "gzip", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_TOPIC_CONFIG", "synonyms": [ { "name": "compression.type", "value": "gzip", "source": "DYNAMIC_TOPIC_CONFIG" }, { "name": "compression.type", "value": "producer", "source": "DEFAULT_CONFIG" } ] } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." 
} ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – Indicates attempted access to an unreachable or non-existing resource like e.g. an unknown topic or partition. GET requests to endpoints not allowed in the accesslists will also result in this response. **endpoint_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "HTTP 404 Not Found" } ``` **cluster_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "Cluster my-cluster cannot be found." } ``` **unknown_topic_or_partition:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 40403, "message": "This server does not host this topic-partition." } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests. URI: /v3/clusters/my-cluster, STATUS: 429, MESSAGE: Too Many Requests, SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id} **Get Consumer Group** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the consumer group specified by the `consumer_group_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **consumer_group_id** (*string*) – The consumer group ID. **Example request:** ```http GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The consumer group. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaConsumerGroup", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-1" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "is_simple": false, "partition_assignor": "org.apache.kafka.clients.consumer.RoundRobinAssignor", "state": "STABLE", "coordinator": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1" }, "consumers": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/consumers" }, "lag_summary": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lag-summary" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. 
**kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests. URI: /v3/clusters/my-cluster, STATUS: 429, MESSAGE: Too Many Requests, SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/lag-summary **Get Consumer Group Lag Summary** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the maximum and total lag of the consumers belonging to the specified consumer group. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **consumer_group_id** (*string*) – The consumer group ID. **Example request:** ```http GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/lag-summary HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The max and total consumer lag in a consumer group. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaConsumerGroupLagSummary", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lag-summary", "resource_name": "crn:///kafka=cluster-1/consumer-groups=consumer-group-1/lag-summary" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "max_lag_consumer_id": "consumer-1", "max_lag_instance_id": "consumer-instance-1", "max_lag_client_id": "client-1", "max_lag_topic_name": "topic-1", "max_lag_partition_id": 1, "max_lag": 100, "total_lag": 110, "max_lag_consumer": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/consumers/consumer-1" }, "max_lag_partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/partitions/1" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. 
**kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests. URI: /v3/clusters/my-cluster, STATUS: 429, MESSAGE: Too Many Requests, SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/lags/{topic_name}/partitions/{partition_id} **Get Consumer Lag** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy)[![Available in dedicated clusters only](https://img.shields.io/badge/-Available%20in%20dedicated%20clusters%20only-%23bc8540)](https://docs.confluent.io/cloud/current/clusters/cluster-types.html#dedicated-cluster) Return the consumer lag on a partition with the given `partition_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **consumer_group_id** (*string*) – The consumer group ID. * **topic_name** (*string*) – The topic name. * **partition_id** (*integer*) – The partition ID. **Example request:** ```http GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/lags/{topic_name}/partitions/{partition_id} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The consumer lag. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaConsumerLag", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lags/topic-1/partitions/1", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-1/lag=topic-1/partition=1" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "topic_name": "topic-1", "partition_id": 1, "consumer_id": "consumer-1", "instance_id": "consumer-instance-1", "client_id": "client-1", "current_offset": 1, "log_end_offset": 101, "lag": 100 } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. 
**kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests. URI: /v3/clusters/my-cluster, STATUS: 429, MESSAGE: Too Many Requests, SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### POST /clusters/{cluster_id}/topics/{topic_name}/records **Produce Records** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Produce records to the given topic, returning delivery reports for each record produced. This API can be used in streaming mode by setting “Transfer-Encoding: chunked” header. For as long as the connection is kept open, the server will keep accepting records. For each record sent to the server, the server will asynchronously send back a delivery report, in the same order. Records are streamed to and from the server as Concatenated JSON. Errors are reported per record. The HTTP status code will be HTTP 200 OK as long as the connection is successfully established. Note that the cluster_id is validated only when running in Confluent Cloud. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **topic_name** (*string*) – The topic name. **binary_and_json:** ```http POST /clusters/{cluster_id}/topics/{topic_name}/records HTTP/1.1 Host: example.com Content-Type: application/json { "partition_id": 1, "headers": [ { "name": "Header-1", "value": "SGVhZGVyLTE=" }, { "name": "Header-2", "value": "SGVhZGVyLTI=" } ], "key": { "type": "BINARY", "data": "Zm9vYmFy" }, "value": { "type": "JSON", "data": { "foo": "bar" } }, "timestamp": "2021-02-05T19:14:42Z" } ``` **binary_and_avro_with_subject_and_raw_schema:** ```http POST /clusters/{cluster_id}/topics/{topic_name}/records HTTP/1.1 Host: example.com Content-Type: application/json { "partition_id": 1, "headers": [ { "name": "Header-1", "value": "SGVhZGVyLTE=" }, { "name": "Header-2", "value": "SGVhZGVyLTI=" } ], "key": { "type": "BINARY", "data": "Zm9vYmFy" }, "value": { "type": "AVRO", "subject": "topic-1-key", "schema": "{\\\"type\\\":\\\"string\\\"}", "data": "foobar" }, "timestamp": "2021-02-05T19:14:42Z" } ``` **string:** ```http POST /clusters/{cluster_id}/topics/{topic_name}/records HTTP/1.1 Host: example.com Content-Type: application/json { "value": { "type": "STRING", "data": "My message" } } ``` **schema_id_and_schema_version:** ```http POST /clusters/{cluster_id}/topics/{topic_name}/records HTTP/1.1 Host: example.com Content-Type: application/json { "key": { "subject_name_strategy": "TOPIC_NAME", "schema_id": 1, "data": 1000 }, "value": { "schema_version": 1, "data": { "foo": "bar" } } } ``` **latest_schema:** ```http POST /clusters/{cluster_id}/topics/{topic_name}/records HTTP/1.1 Host: example.com Content-Type: application/json { "key": { "data": 1000 }, "value": { "data": "foobar" } } ``` **null_and_empty_data:** ```http POST /clusters/{cluster_id}/topics/{topic_name}/records HTTP/1.1 Host: example.com Content-Type: application/json { "key": { "schema_id": 1 }, "value": { "schema_version": 1, "data": null } } ``` **empty_value:** ```http POST /clusters/{cluster_id}/topics/{topic_name}/records HTTP/1.1 Host: example.com Content-Type: application/json { "key": { "data": 1000 } } ``` * **Status Codes:** * 
[200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The response containing a delivery report for a record produced to a topic. In streaming mode, for each record sent, a separate delivery report will be returned, in the same order, each with its own error_code. **produce_record_success:** ```http HTTP/1.1 200 OK Content-Type: application/json { "error_code": 200, "cluster_id": "cluster-1", "topic_name": "topic-1", "partition_id": 1, "offset": 0, "timestamp": "2021-02-05T19:14:42Z", "key": { "type": "BINARY", "size": 7 }, "value": { "type": "JSON", "size": 15 } } ``` **produce_record_bad_binary_data:** ```http HTTP/1.1 200 OK Content-Type: application/json { "error_code": 400, "message": "Bad Request: data=1 is not a base64 string." } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **header_not_base64_encoded:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `byte[]` from String \"\": Unexpected end of base64-encoded String: base64 variant 'MIME-NO-LINEFEEDS' expects padding (one or more '=' characters) at the end. This Base64Variant might have been incorrectly configured" } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – Indicates attempted access to an unreachable or non-existing resource like e.g. an unknown topic or partition. GET requests to endpoints not allowed in the accesslists will also result in this response. **endpoint_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "HTTP 404 Not Found" } ``` **cluster_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "Cluster my-cluster cannot be found." } ``` **unknown_topic_or_partition:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 40403, "message": "This server does not host this topic-partition." } ``` * [413 Request Entity Too Large](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.14) – This implies the client is sending a request payload that is larger than the maximum message size the server can accept. **produce_records_expects_json:** ```http HTTP/1.1 413 Request Entity Too Large Content-Type: application/json { "error_code": 413, "message": "The request included a message larger than the maximum message size the server can accept." 
} ``` * [415 Unsupported Media Type](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.16) – This implies the client is sending the request payload format in an unsupported format. **produce_records_expects_json:** ```http HTTP/1.1 415 Unsupported Media Type Content-Type: application/json { "error_code": 415, "message": "HTTP 415 Unsupported Media Type" } ``` * [422 Unprocessable Entity](https://www.rfc-editor.org/rfc/rfc4918#section-11.2) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **produce_record_empty_request_body:** ```http HTTP/1.1 422 Unprocessable Entity Content-Type: application/json { "error_code": 422, "message": "Payload error. Request body is empty. Data is required." } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests. URI: /v3/clusters/my-cluster, STATUS: 429, MESSAGE: Too Many Requests, SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ```

## Role-Based Access Control (RBAC)

This is a commercial component of Confluent Platform.

**Prerequisites:**

* [HTTPS](config.md#confluent-server-rest-http-ssl-config) is recommended, but not required.
* RBAC-enabled Kafka and Schema Registry clusters. For details about RBAC, see [Use Role-Based Access Control (RBAC) for Authorization in Confluent Platform](../../../security/authorization/rbac/overview.md#rbac-overview).

To enable token authentication, set `kafka.rest.rest.servlet.initializor.classes` to `io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler` and `kafka.rest.kafka.rest.resource.extension.class` to `io.confluent.kafkarest.security.KafkaRestSecurityResourceExtension` in the `kafka.properties` file:

```bash
kafka.rest.rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler
kafka.rest.kafka.rest.resource.extension.class=io.confluent.kafkarest.security.KafkaRestSecurityResourceExtension
```

When token authentication is enabled, the generated token is used to impersonate the API requests. The Kafka clients used by the Admin REST APIs use the `SASL_PLAINTEXT` or `SASL_SSL` security protocol to authenticate with Kafka brokers.

## Role-Based Access Control (RBAC)

This is a commercial component of Confluent Platform.

**Prerequisites:**

* RBAC-enabled Kafka and Schema Registry clusters. For details about RBAC, see [Use Role-Based Access Control (RBAC) for Authorization in Confluent Platform](../../../security/authorization/rbac/overview.md#rbac-overview). [HTTPS](config.md#kafka-rest-http-ssl-config) is recommended, but not required.

Confluent REST Proxy supports the cross-component, proprietary role-based access control (RBAC) solution to enforce access controls across Confluent Platform. The REST Proxy security plugin supports a bearer token-based authentication mechanism. With token authentication, REST Proxy can impersonate user requests when communicating with Kafka brokers and Schema Registry clusters. RBAC REST Proxy security resolves a number of usability challenges, including:

* Local configuration of principals. With RBAC REST Proxy security, principals are no longer configured locally; instead, principals are handled by the Metadata Service (MDS).
* Existing REST Proxy security capabilities do not scale for very large deployments without significant manual operations. In RBAC REST Proxy security, the MDS binds and enforces a Kafka cluster configuration across different resources (topics, connectors, Schema Registry, and so on), thereby saving users the time and challenge associated with reconfiguring ACLs and roles separately for each Kafka cluster resource.

# Stream Processing Concepts in ksqlDB for Confluent Platform

ksqlDB enables stream processing, which is a way to compute over events as they arrive, rather than in batches at a later time. These events come from Apache Kafka® topics. In ksqlDB, events are stored in a stream, which is a Kafka topic with a defined schema.
When you create a stream in ksqlDB, if the backing Kafka topic doesn’t exist, ksqlDB creates it with the specified number of partitions (a minimal example follows the list below). The stream’s metadata (schema, serialization scheme, etc.) is stored in ksqlDB’s command topic, which is an internal communication channel. Each ksqlDB server keeps a local copy of this metadata. Events are added to a stream as rows, which are essentially Kafka records with extra metadata. ksqlDB uses a Kafka producer to insert these records into the backing Kafka topic. The data itself is persisted in Kafka, not on the ksqlDB servers.

ksqlDB offers a SQL-like interface for transforming streams. You can create new streams derived from existing ones by selecting and manipulating columns. This is done with persistent queries. For example, you can filter a stream or convert the case of a string field.

ksqlDB also supports stateful operations. This means that the processing of an event can depend on the accumulated effects of previous events. State can be used for simple aggregations, like counting events, or more complex operations, like feature engineering for machine learning. Each parallel instance of a ksqlDB application handles events for a specific group of keys, and the state for those keys is kept locally. This allows for high throughput and low latency.

For fault tolerance, ksqlDB uses state snapshots and stream replay. Snapshots capture the entire state of the pipeline, including offsets in the input queues and the state derived from processed data. In case of failure, the pipeline can be restored from the snapshot, and it can replay the stream from the saved offsets. Source tables are not kept entirely in state. Stateless operations, like filtering and projections, don’t require state.

- [Apache Kafka and ksqlDB](apache-kafka-primer.md#ksqldb-apache-kafka-primer): A quick overview of Kafka.
- [Connectors in ksqlDB](connectors.md#ksqldb-connectors): Connectors source and sink data from external systems.
- [Events in ksqlDB](events.md#ksqldb-events): An event is the fundamental unit of data in stream processing.
- [Joins](../developer-guide/joins/overview.md#ksqldb-joins): Joins are how to combine data from many streams and tables into one.
- [User-defined Functions](functions.md#ksqldb-concepts-udfs): Extend ksqlDB to invoke custom code written in Java.
- [Lambda Functions](lambda-functions.md#ksqldb-concepts-lambda-functions): Lambda functions enable you to apply in-line functions without creating a full UDF.
- [Materialized Views](materialized-views.md#ksqldb-concepts-materialized-views): Materialized views precompute the results of queries at write-time so reads become predictably fast.
- [Queries](queries.md#ksqldb-concepts-queries): Queries are how you process events and retrieve computed results.
- [Stream Processing](stream-processing.md#ksqldb-concepts-stream-processing): Stream processing is a way to write programs computing over unbounded streams of events.
- [Streams](streams.md#ksqldb-concepts-streams): A stream is an immutable, append-only collection of events that represents a series of historical facts.
- [Tables](tables.md#ksqldb-concepts-tables): A table is a mutable collection of events that models change over time.
- [Time and Windows](time-and-windows-in-ksqldb-queries.md#ksqldb-time-and-windows): Windows help you bound a continuous stream of events into distinct time intervals.
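As a concrete illustration of the stream concepts above, the sketch below submits a `CREATE STREAM` statement to ksqlDB's `/ksql` REST endpoint. The server address, stream name, columns, and topic name are assumptions for the example; if the named topic does not exist, ksqlDB creates it with the requested number of partitions.

```bash
# A minimal sketch: create a stream by sending a CREATE STREAM statement to the ksqlDB REST API.
# Assumes a ksqlDB server at localhost:8088; the stream, columns, and topic names are hypothetical.
curl -s -X POST http://localhost:8088/ksql \
  -H "Content-Type: application/vnd.ksql.v1+json; charset=utf-8" \
  -d @- <<'EOF'
{
  "ksql": "CREATE STREAM rider_locations (profile_id VARCHAR, latitude DOUBLE, longitude DOUBLE) WITH (kafka_topic='locations', value_format='JSON', partitions=1);",
  "streamsProperties": {}
}
EOF
```

The same statement can be run interactively in the ksqlDB CLI; the REST call only wraps it in a JSON envelope.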
## Get started

Start by creating a `pom.xml` for your Java application:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>my.ksqldb.app</groupId>
    <artifactId>my-ksqldb-app</artifactId>
    <version>0.0.1</version>

    <properties>
        <java.version>8</java.version>
        <ksqldb.version>0.29.0</ksqldb.version>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    </properties>

    <repositories>
        <repository>
            <id>ksqlDB</id>
            <name>ksqlDB</name>
            <url>https://ksqldb-mvns.s3.amazonaws.com/maven/</url>
        </repository>
    </repositories>

    <pluginRepositories>
        <pluginRepository>
            <id>ksqlDB</id>
            <url>https://ksqldb-mvns.s3.amazonaws.com/maven/</url>
        </pluginRepository>
    </pluginRepositories>

    <dependencies>
        <dependency>
            <groupId>io.confluent.ksql</groupId>
            <artifactId>ksqldb-api-client</artifactId>
            <version>${ksqldb.version}</version>
            <classifier>with-dependencies</classifier>
        </dependency>
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-streams</artifactId>
            <version>4.1.0</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.8.1</version>
                <configuration>
                    <source>${java.version}</source>
                    <target>${java.version}</target>
                    <compilerArgs>
                        <arg>-Xlint:all</arg>
                    </compilerArgs>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
```

# Configure ksqlDB for Confluent Platform

- [Configure Security for ksqlDB](security.md#ksqldb-installation-security)
- [ksqlDB Configuration Parameter Reference](../../reference/server-configuration.md#ksqldb-reference-server-configuration)
- [Configure ksqlDB for Avro, Protobuf, and JSON schemas](avro-schema.md#ksqldb-installation-configure-serialization-formats)

ksqlDB configuration parameters can be set for ksqlDB Server and for queries, as well as for the underlying Kafka Streams and Kafka Clients (producer and consumer). These instructions assume you are installing Confluent Platform by using ZIP or TAR archives. For more information, see [On-Premises Deployments](../../../installation/overview.md#installation).

#### Step 2. Configure the link on the private cluster

The more privileged / private cluster (cluster A in the diagram) requires:

- Connectivity to its remote cluster (one-way connectivity, such as AWS PrivateLink, is acceptable)
- A user to create a cluster link object on it **second** (after the remote cluster) with the following configuration:

```properties
# bootstrap of the remote cluster
bootstrap.servers=localhost:9992
link.mode=BIDIRECTIONAL
# authentication for the link principal on the remote cluster
sasl.mechanism=SCRAM-SHA-512
security.protocol=SASL_PLAINTEXT
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="link" \
  password="link-secret";
# authentication for the link principal on the local cluster
local.sasl.mechanism=SCRAM-SHA-512
local.security.protocol=SASL_PLAINTEXT
local.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="link" \
  password="link-secret";
```

- Additional configurations can be included in this file as needed, per [cluster link configurations](#cp-cluster-link-config-options).
- An authentication configuration (such as API key or OAuth) for a principal on its **remote cluster** with ACLs or RBAC role bindings giving permission to read topic data and metadata:
  - **Alter:Cluster ACL** – this is unique to the advanced mode
  - Describe:Cluster ACL
  - The required ACLs or RBAC role bindings for a cluster link, as described in [Manage Security for Cluster Linking on Confluent Platform](security.md#cluster-link-security) (for [a cluster link on a source cluster](security.md#cluster-link-acls-for-link-on-source) and for a [source-initiated link on the destination cluster](security.md#cluster-link-acls-for-source-initiated-links)).
- **Local authentication** (unique to the unidirectional mode): An authentication configuration (such as API key or OAuth) for a principal **on its own cluster** with ACLs or RBAC role bindings giving permission to read topic data and the Describe:Cluster ACL:
  - The required ACLs or RBAC role bindings giving permission to read topic data and the Describe:Cluster ACL, as described in [Manage Security for Cluster Linking on Confluent Platform](security.md#cluster-link-security). This authentication configuration never leaves this cluster.
The lines for these configurations must be prefixed with `local.` to indicate that they belong to the local cluster. - Link mode set to `link.mode=BIDIRECTIONAL` For example, run the following command to create a bidirectional INBOUND/OUTBOUND link on the private cluster (cluster A in the diagram), including the call to your cluster A config file: ```bash $CONFLUENT_HOME/bin/kafka-cluster-links --create --link bidirectional-link \ --config-file my-examples/a-link.config \ --bootstrap-server localhost:9092 --command-config my-examples/command.config ``` ## FAQ Quick List - [How do I know the data was successfully copied to the destination and can be safely deleted from the source topic?](#faq-verify-data-replication) - [How can I throttle a Confluent Platform to Confluent Cloud cluster link to make the best use of my bandwidth?](#faq-throttle) - [Will adding a cluster link result in throttling consumers on the source cluster?](#faq-throttle-consumers-effects) - [Will adding a cluster link cause throttling of existing producers on the destination cluster?](#faq-throttle-producers-effects) - [How is the consumer offset sync accomplished?](#faq-consumer-offset-sync) - [Can I modify consumer group filters on-the-fly?](#faq-consumer-groups-filters-modify-on-fly) - [How do I create a cluster link?](#faq-create-link) - [Which clusters can create cluster links?](#faq-cluster-details) - [Can I prioritize one link over another?](#faq-prioritize-links) - [How do I create a mirror topic?](#faq-mirror-topics) - [Can I prevent certain topics from being mirrored by a cluster link?](#faq-block-mirror-topics) - [Can I override a topic configuration when using auto-create mirror topics?](#faq-override-topic-config-with-auto-create-mirror-topics) - [Can I use Cluster Linking without the traffic going over the public internet?](#faq-no-public-internet) - [Does Schema Linking have the same limitations as Cluster Linking for private networking and cross-region?](#faq-schema-linking-private-net-rqmts) - [I need RPO==0 (guarantee of no data loss after a failover) in Confluent Cloud. What can I do?](#faq-rpo-zero) - [If I want to join two topics from different clusters in ksqlDB, how can Cluster Linking help me?](#faq-ksqldb-multi-cluster) - [Does Cluster Linking work with mTLS?](#faq-mtls) - [How does Schema Registry multi-region disaster recovery (DR) work in Confluent Cloud?](#faq-cloud-multi-region-dr) - [How can I automatically failover Kafka clients?](#faq-failover-ak-clients) - [How does Cluster Linking optimize network bandwidth and performance in Confluent Cloud?](#faq-optimize) - [How do I perform a failover on a cluster link used primarily for data sharing?](#faq-failover-with-data-sharing) - [Does Cluster Linking support compacted topics?](#faq-compacted-topics-support) - [Does Cluster Linking support bidirectional links between two clusters?](#faq-bidirectional-links) - [Does Cluster Linking support repartitioning or renaming of topics?](#faq-repartitioning-and-renaming-topics) - [Can Cluster Linking create circular dependencies? How can I prevent infinite loops?](#faq-circular-dependencies) #### NOTE As a general guideline (not just for this tutorial), any customer-owned firewall that allows the cluster link connection from source cluster brokers to destination cluster brokers must allow the TCP connection to persist in order for Cluster Linking to work. 
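Before moving on to the tutorial prerequisites below, it can be helpful to confirm that a newly created link (such as the one created with `kafka-cluster-links --create` earlier) is visible on the cluster. The following is a minimal sketch; it assumes the same bootstrap server and command config used above, and that your Confluent Platform version supports the `--list` option of the `kafka-cluster-links` tool.

```bash
# A minimal sketch: list the cluster links known to the local cluster to confirm the new link exists.
# Assumes the bootstrap server and command config from the create command shown earlier.
$CONFLUENT_HOME/bin/kafka-cluster-links --list \
  --bootstrap-server localhost:9092 \
  --command-config my-examples/command.config
```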
- These instructions assume you have a local installation of [Confluent Platform 7.1.0 or later](https://www.confluent.io/get-started/?product=software) and Java 8, 11, or 17 (17 is recommended), which Confluent Platform requires. [Install instructions for self-managed deployments](/platform/current/installation/overview.html) are available in the documentation. If you are new to Confluent Platform, you may want to first work through the [Quick Start for Apache Kafka using Confluent Platform](/platform/current/platform-quickstart.html) and/or the [basic Cluster Linking tutorial](/platform/current/multi-dc-deployments/cluster-linking/topic-data-sharing.html), and then return to this tutorial.
- This tutorial and the source-initiated link feature require Confluent Enterprise, and are not supported in Confluent Community or Apache Kafka®.
- With a default install of Confluent Platform, the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/command-reference/overview.html) and [Cluster Linking commands](https://docs.confluent.io/platform/current/multi-dc-deployments/cluster-linking/commands.html) should be available in `$CONFLUENT_HOME/bin`, and properties files will be in the directory `CONFLUENT_CONFIG` (`$CONFLUENT_HOME/etc/kafka/`). You must have Confluent Platform running to access these commands. Once Confluent Platform is [configured](#cluster-link-hybrid-config) and [running](#cluster-link-hybrid-start-cp), you can type any command with no arguments to get help (for example, `kafka-cluster-links`).
- This tutorial requires a Confluent Cloud login and the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/overview.html). To learn more, see [Get the latest version of Confluent Cloud](https://docs.confluent.io/cloud/current/multi-cloud/cluster-linking/quickstart.html#get-the-latest-version-of-ccloud) in the [Confluent Cloud Cluster Linking Quick Start](https://docs.confluent.io/cloud/current/multi-cloud/cluster-linking/quickstart.html#), as well as [Migrate Confluent CLI](https://docs.confluent.io/confluent-cli/current/migrate.html). If you are new to Confluent Cloud, you might want to walk through that Quick Start first, and then return to this tutorial.
- This tutorial requires that you run a [Dedicated cluster](https://docs.confluent.io/cloud/current/clusters/cluster-types.html#dedicated-clusters) in Confluent Cloud, which will incur Confluent Cloud charges.
- The parameter `password.encoder.secret` is used to encrypt the credentials that will be stored in the cluster link. This is required for ZooKeeper, which is supported on pre-8.0 versions of Confluent Platform, and when [migrating from ZooKeeper to KRaft](/platform/current/installation/migrate-zk-kraft.html), as described in [What’s supported](/platform/current/multi-dc-deployments/cluster-linking/index.html#what-s-supported). To learn more about this parameter, see [Multi-Region Clusters](/platform/current/kafka/dynamic-config.html#dynamic-config-passwords-upgrade).

## Authentication

The following example shows how to configure SASL_SSL with GSSAPI as the SASL mechanism for the cluster link to talk to the source cluster. You can set these configurations using a `config-file`, as described in the section on [how to set properties on a cluster link](configs.md#cluster-link-specific-configs).
```bash security.protocol=SASL_SSL ssl.truststore.location=/path/to/truststore.p12 ssl.truststore.password=truststore-password ssl.truststore.type=PKCS12 sasl.mechanism=GSSAPI sasl.kerberos.service.name=kafka sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \ useKeyTab=true \ storeKey=true \ keyTab="/path/to/link.keytab" \ principal="clusterlink1@EXAMPLE.COM"; ``` Cluster Linking configurations should include client-side TLS/SSL and SASL/GSSAPI configuration options for connections to the source cluster in this scenario. If you reference a `keystore`/`truststore` directly (for example, `keystore.jks`), the same files must be available in the same location on each of the brokers For details on creating TLS/SSL key and trust stores, see [Use TLS Authentication in Confluent Platform](../../security/authentication/mutual-tls/overview.md#kafka-ssl-authentication). For details on SASL/GSSAPI, see [Configure GSSAPI in Confluent Platform clusters](../../security/authentication/sasl/gssapi/overview.md#kafka-sasl-auth-gssapi). To configure cluster links to use other SASL mechanisms, include client-side security configurations for that mechanism. See [SASL](../../security/authentication/overview.md#kafka-sasl-auth) for other supported mechanisms. To use mutual TLS authentication as the security protocol, a key store should also be configured for the link. See [Use TLS Authentication in Confluent Platform](../../security/authentication/mutual-tls/overview.md#kafka-ssl-authentication) for details. # Deploy Confluent Platform in a Multi-Datacenter Environment Confluent Platform supports several types of multi-datacenter deployment solutions. * [Overview](index.md) * [Multi-Data Center Architectures on Confluent Platform](multi-region-architectures.md) * [Cluster Linking on Confluent Platform](cluster-linking/overview.md) * [Overview](cluster-linking/index.md) * [Tutorials](cluster-linking/overview-use-cases.md) * [Share Data Across Topics](cluster-linking/topic-data-sharing.md) * [Link Hybrid Cloud and Bridge-to-Cloud Clusters](cluster-linking/hybrid-cp.md) * [Migrate Data](cluster-linking/migrate-cp.md) * [Manage](cluster-linking/overview-config-and-manage.md) * [Manage Mirror Topics](cluster-linking/mirror-topics-cp.md) * [Configure](cluster-linking/configs.md) * [Command Reference](cluster-linking/commands.md) * [Monitor](cluster-linking/metrics.md) * [Security](cluster-linking/security.md) * [FAQ](cluster-linking/faqs-cp.md) * [Troubleshooting](cluster-linking/trouble-cp.md) * [Multi-Region Clusters on Confluent Platform](multi-region-overview.md) * [Overview](multi-region.md) * [Tutorial: Multi-Region Clusters](multi-region-tutorial.md) * [Tutorial: Move Active-Passive to Multi-Region](mrc-move-from-active-passive.md) * [Replicate Topics Across Kafka Clusters in Confluent Platform](replicator/overview.md) * [Overview](replicator/index.md) * [Example: Active-active Multi-Datacenter](replicator/replicator-docker-tutorial.md) * [Tutorial: Replicate Data Across Clusters](replicator/replicator-quickstart.md) * [Tutorial: Run as an Executable or Connector](replicator/replicator-run.md) * [Configure](replicator/configuration_options.md) * [Verify Configuration](replicator/replicator-verifier.md) * [Tune](replicator/replicator-tuning.md) * [Monitor](replicator/replicator-monitoring.md) * [Configure for Cross-Cluster Failover](replicator/replicator-failover.md) * [Migrate from MirrorMaker to Replicator](replicator/migrate-replicator.md) * [Replicator Schema Translation Example for 
Confluent Platform](replicator/replicator-schema-translation.md) ## Multi-Datacenter Use Cases Replicator can be deployed across clusters and in multiple datacenters. Multi-datacenter deployments enable use cases such as: * Active-active geo-localized deployments: allow users to access a nearby datacenter to optimize their architecture for low latency and high performance * Active-passive disaster recovery (DR) deployments: in the event of a partial or complete datacenter disaster, allow applications to fail over to Confluent Platform in a different datacenter * Centralized analytics: aggregate data from multiple Apache Kafka® clusters into one location for organization-wide analytics * Cloud migration: use Kafka to synchronize data between on-premises applications and cloud deployments Replication of events in Kafka topics from one cluster to another is the foundation of Confluent’s multi-datacenter architecture. Replication can be done with Confluent Replicator or using the open source [Kafka MirrorMaker](https://kafka.apache.org/documentation/#basic_ops_mirror_maker). Replicator can be used for replication of topic data as well as [migrating schemas](../../schema-registry/installation/migrate.md#schemaregistry-migrate) in Schema Registry. This documentation focuses on Replicator, including the [architecture](#replicator-architecture), a [quick start tutorial](replicator-quickstart.md#replicator-quickstart), how to [configure and run](replicator-run.md#replicator-run) Replicator in different contexts, [tuning and monitoring](replicator-tuning.md#replicator-tuning), [cross-cluster failover](replicator-failover.md#replicator-failover), and more. A section on how to [migrate from MirrorMaker to Replicator](migrate-replicator.md#migrate-replicator) is also included. Some of the general thinking on deployment strategies can also apply to MirrorMaker, but if you are primarily interested in MirrorMaker, see [Mirroring data between clusters](https://kafka.apache.org/documentation/#basic_ops_mirror_maker) in the Kafka documentation. ### Inspect topics 1. For each datacenter, inspect the data in various topics, provenance information, timestamp information, and cluster ID. ```bash ./read-topics.sh ``` 2. 
Verify the output resembles: ```text -----dc1----- list topics: __consumer_offsets __consumer_timestamps _confluent-command _confluent-license _confluent-telemetry-metrics _confluent_balancer_api_state _schemas connect-configs-dc1 connect-offsets-dc1 connect-status-dc1 topic1 topic2 topic1: {"userid":{"string":"User_7"},"dc":{"string":"dc1"}} {"userid":{"string":"User_7"},"dc":{"string":"dc2"}} {"userid":{"string":"User_9"},"dc":{"string":"dc2"}} {"userid":{"string":"User_2"},"dc":{"string":"dc1"}} {"userid":{"string":"User_5"},"dc":{"string":"dc2"}} {"userid":{"string":"User_1"},"dc":{"string":"dc1"}} {"userid":{"string":"User_3"},"dc":{"string":"dc2"}} {"userid":{"string":"User_7"},"dc":{"string":"dc1"}} {"userid":{"string":"User_1"},"dc":{"string":"dc2"}} {"userid":{"string":"User_8"},"dc":{"string":"dc1"}} Processed a total of 10 messages topic2: {"registertime":{"long":1513471082347},"userid":{"string":"User_2"},"regionid":{"string":"Region_7"},"gender":{"string":"OTHER"}} {"registertime":{"long":1496006007512},"userid":{"string":"User_5"},"regionid":{"string":"Region_6"},"gender":{"string":"OTHER"}} {"registertime":{"long":1494319368203},"userid":{"string":"User_7"},"regionid":{"string":"Region_2"},"gender":{"string":"FEMALE"}} {"registertime":{"long":1493150028737},"userid":{"string":"User_1"},"regionid":{"string":"Region_5"},"gender":{"string":"FEMALE"}} {"registertime":{"long":1517151907191},"userid":{"string":"User_5"},"regionid":{"string":"Region_3"},"gender":{"string":"OTHER"}} {"registertime":{"long":1489672305692},"userid":{"string":"User_2"},"regionid":{"string":"Region_6"},"gender":{"string":"OTHER"}} {"registertime":{"long":1511471447951},"userid":{"string":"User_2"},"regionid":{"string":"Region_5"},"gender":{"string":"MALE"}} {"registertime":{"long":1488018372941},"userid":{"string":"User_7"},"regionid":{"string":"Region_2"},"gender":{"string":"OTHER"}} {"registertime":{"long":1500952152251},"userid":{"string":"User_2"},"regionid":{"string":"Region_1"},"gender":{"string":"MALE"}} {"registertime":{"long":1493556444692},"userid":{"string":"User_1"},"regionid":{"string":"Region_8"},"gender":{"string":"FEMALE"}} Processed a total of 10 messages _schemas: null null null {"subject":"topic1-value","version":1,"id":1,"schema":"{\"type\":\"record\",\"name\":\"KsqlDataSourceSchema\",\"namespace\":\"io.confluent.ksql.avro_schemas\",\"fields\":[{\"name\":\"userid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"dc\",\"type\":[\"null\",\"string\"],\"default\":null}]}","deleted":false} {"subject":"topic2-value","version":1,"id":2,"schema":"{\"type\":\"record\",\"name\":\"KsqlDataSourceSchema\",\"namespace\":\"io.confluent.ksql.avro_schemas\",\"fields\":[{\"name\":\"registertime\",\"type\":[\"null\",\"long\"],\"default\":null},{\"name\":\"userid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"regionid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"gender\",\"type\":[\"null\",\"string\"],\"default\":null}]}","deleted":false} {"subject":"topic2.replica-value","version":1,"id":2,"schema":"{\"type\":\"record\",\"name\":\"KsqlDataSourceSchema\",\"namespace\":\"io.confluent.ksql.avro_schemas\",\"fields\":[{\"name\":\"registertime\",\"type\":[\"null\",\"long\"],\"default\":null},{\"name\":\"userid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"regionid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"gender\",\"type\":[\"null\",\"string\"],\"default\":null}]}","deleted":false} [2021-01-04 19:16:09,579] ERROR Error 
processing message, terminating consumer process: (kafka.tools.ConsoleConsumer$) org.apache.kafka.common.errors.TimeoutException Processed a total of 6 messages provenance info (cluster, topic, timestamp): 2qHo2TsdTIaTjvkCyf3qdw,topic1,1609787778125 2qHo2TsdTIaTjvkCyf3qdw,topic1,1609787779123 2qHo2TsdTIaTjvkCyf3qdw,topic1,1609787780125 2qHo2TsdTIaTjvkCyf3qdw,topic1,1609787781246 2qHo2TsdTIaTjvkCyf3qdw,topic1,1609787782125 Processed a total of 10 messages timestamp info (group: topic-partition): replicator-dc1-to-dc2-topic2: topic2-0 1609787797164 replicator-dc1-to-dc2-topic1: topic1-0 1609787797117 Processed a total of 2 messages cluster id: ZagAAEfORQG-lxwq6OsV5Q -----dc2----- list topics: __consumer_offsets __consumer_timestamps _confluent-command _confluent-controlcenter-6-1-0-1-AlertHistoryStore-changelog _confluent-controlcenter-6-1-0-1-AlertHistoryStore-repartition _confluent-controlcenter-6-1-0-1-Group-ONE_MINUTE-changelog _confluent-controlcenter-6-1-0-1-Group-ONE_MINUTE-repartition _confluent-controlcenter-6-1-0-1-Group-THREE_HOURS-changelog _confluent-controlcenter-6-1-0-1-Group-THREE_HOURS-repartition _confluent-controlcenter-6-1-0-1-KSTREAM-OUTEROTHER-0000000106-store-changelog _confluent-controlcenter-6-1-0-1-KSTREAM-OUTEROTHER-0000000106-store-repartition _confluent-controlcenter-6-1-0-1-KSTREAM-OUTERTHIS-0000000105-store-changelog _confluent-controlcenter-6-1-0-1-KSTREAM-OUTERTHIS-0000000105-store-repartition _confluent-controlcenter-6-1-0-1-MetricsAggregateStore-changelog _confluent-controlcenter-6-1-0-1-MetricsAggregateStore-repartition _confluent-controlcenter-6-1-0-1-MonitoringMessageAggregatorWindows-ONE_MINUTE-changelog _confluent-controlcenter-6-1-0-1-MonitoringMessageAggregatorWindows-ONE_MINUTE-repartition _confluent-controlcenter-6-1-0-1-MonitoringMessageAggregatorWindows-THREE_HOURS-changelog _confluent-controlcenter-6-1-0-1-MonitoringMessageAggregatorWindows-THREE_HOURS-repartition _confluent-controlcenter-6-1-0-1-MonitoringStream-ONE_MINUTE-changelog _confluent-controlcenter-6-1-0-1-MonitoringStream-ONE_MINUTE-repartition _confluent-controlcenter-6-1-0-1-MonitoringStream-THREE_HOURS-changelog _confluent-controlcenter-6-1-0-1-MonitoringStream-THREE_HOURS-repartition _confluent-controlcenter-6-1-0-1-MonitoringTriggerStore-changelog _confluent-controlcenter-6-1-0-1-MonitoringTriggerStore-repartition _confluent-controlcenter-6-1-0-1-MonitoringVerifierStore-changelog _confluent-controlcenter-6-1-0-1-MonitoringVerifierStore-repartition _confluent-controlcenter-6-1-0-1-TriggerActionsStore-changelog _confluent-controlcenter-6-1-0-1-TriggerActionsStore-repartition _confluent-controlcenter-6-1-0-1-TriggerEventsStore-changelog _confluent-controlcenter-6-1-0-1-TriggerEventsStore-repartition _confluent-controlcenter-6-1-0-1-actual-group-consumption-rekey _confluent-controlcenter-6-1-0-1-aggregate-topic-partition-store-changelog _confluent-controlcenter-6-1-0-1-aggregate-topic-partition-store-repartition _confluent-controlcenter-6-1-0-1-aggregatedTopicPartitionTableWindows-ONE_MINUTE-changelog _confluent-controlcenter-6-1-0-1-aggregatedTopicPartitionTableWindows-ONE_MINUTE-repartition _confluent-controlcenter-6-1-0-1-aggregatedTopicPartitionTableWindows-THREE_HOURS-changelog _confluent-controlcenter-6-1-0-1-aggregatedTopicPartitionTableWindows-THREE_HOURS-repartition _confluent-controlcenter-6-1-0-1-cluster-rekey _confluent-controlcenter-6-1-0-1-expected-group-consumption-rekey _confluent-controlcenter-6-1-0-1-group-aggregate-store-ONE_MINUTE-changelog 
_confluent-controlcenter-6-1-0-1-group-aggregate-store-ONE_MINUTE-repartition _confluent-controlcenter-6-1-0-1-group-aggregate-store-THREE_HOURS-changelog _confluent-controlcenter-6-1-0-1-group-aggregate-store-THREE_HOURS-repartition _confluent-controlcenter-6-1-0-1-group-stream-extension-rekey _confluent-controlcenter-6-1-0-1-metrics-trigger-measurement-rekey _confluent-controlcenter-6-1-0-1-monitoring-aggregate-rekey-store-changelog _confluent-controlcenter-6-1-0-1-monitoring-aggregate-rekey-store-repartition _confluent-controlcenter-6-1-0-1-monitoring-message-rekey-store _confluent-controlcenter-6-1-0-1-monitoring-trigger-event-rekey _confluent-license _confluent-metrics _confluent-monitoring _confluent-telemetry-metrics _confluent_balancer_api_state _schemas connect-configs-dc2 connect-offsets-dc2 connect-status-dc2 topic1 topic2.replica topic1: {"userid":{"string":"User_2"},"dc":{"string":"dc2"}} {"userid":{"string":"User_1"},"dc":{"string":"dc1"}} {"userid":{"string":"User_6"},"dc":{"string":"dc2"}} {"userid":{"string":"User_9"},"dc":{"string":"dc1"}} {"userid":{"string":"User_9"},"dc":{"string":"dc2"}} {"userid":{"string":"User_9"},"dc":{"string":"dc1"}} {"userid":{"string":"User_9"},"dc":{"string":"dc2"}} {"userid":{"string":"User_9"},"dc":{"string":"dc1"}} {"userid":{"string":"User_9"},"dc":{"string":"dc2"}} {"userid":{"string":"User_9"},"dc":{"string":"dc1"}} Processed a total of 10 messages topic2.replica: {"registertime":{"long":1488571887136},"userid":{"string":"User_2"},"regionid":{"string":"Region_4"},"gender":{"string":"FEMALE"}} {"registertime":{"long":1496554479008},"userid":{"string":"User_3"},"regionid":{"string":"Region_9"},"gender":{"string":"OTHER"}} {"registertime":{"long":1515819037639},"userid":{"string":"User_1"},"regionid":{"string":"Region_7"},"gender":{"string":"FEMALE"}} {"registertime":{"long":1498630829454},"userid":{"string":"User_9"},"regionid":{"string":"Region_5"},"gender":{"string":"FEMALE"}} {"registertime":{"long":1491954362758},"userid":{"string":"User_6"},"regionid":{"string":"Region_6"},"gender":{"string":"FEMALE"}} {"registertime":{"long":1498308706008},"userid":{"string":"User_2"},"regionid":{"string":"Region_2"},"gender":{"string":"OTHER"}} {"registertime":{"long":1509409463384},"userid":{"string":"User_5"},"regionid":{"string":"Region_8"},"gender":{"string":"OTHER"}} {"registertime":{"long":1494736574275},"userid":{"string":"User_4"},"regionid":{"string":"Region_4"},"gender":{"string":"OTHER"}} {"registertime":{"long":1513254638109},"userid":{"string":"User_3"},"regionid":{"string":"Region_5"},"gender":{"string":"FEMALE"}} {"registertime":{"long":1499607488391},"userid":{"string":"User_4"},"regionid":{"string":"Region_2"},"gender":{"string":"OTHER"}} Processed a total of 10 messages _schemas: null null null {"subject":"topic1-value","version":1,"id":1,"schema":"{\"type\":\"record\",\"name\":\"KsqlDataSourceSchema\",\"namespace\":\"io.confluent.ksql.avro_schemas\",\"fields\":[{\"name\":\"userid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"dc\",\"type\":[\"null\",\"string\"],\"default\":null}]}","deleted":false} 
{"subject":"topic2-value","version":1,"id":2,"schema":"{\"type\":\"record\",\"name\":\"KsqlDataSourceSchema\",\"namespace\":\"io.confluent.ksql.avro_schemas\",\"fields\":[{\"name\":\"registertime\",\"type\":[\"null\",\"long\"],\"default\":null},{\"name\":\"userid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"regionid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"gender\",\"type\":[\"null\",\"string\"],\"default\":null}]}","deleted":false} {"subject":"topic2.replica-value","version":1,"id":2,"schema":"{\"type\":\"record\",\"name\":\"KsqlDataSourceSchema\",\"namespace\":\"io.confluent.ksql.avro_schemas\",\"fields\":[{\"name\":\"registertime\",\"type\":[\"null\",\"long\"],\"default\":null},{\"name\":\"userid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"regionid\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"gender\",\"type\":[\"null\",\"string\"],\"default\":null}]}","deleted":false} [2021-01-04 19:17:26,336] ERROR Error processing message, terminating consumer process: (kafka.tools.ConsoleConsumer$) org.apache.kafka.common.errors.TimeoutException Processed a total of 6 messages provenance info (cluster, topic, timestamp): ZagAAEfORQG-lxwq6OsV5Q,topic1,1609787854055 ZagAAEfORQG-lxwq6OsV5Q,topic1,1609787854057 ZagAAEfORQG-lxwq6OsV5Q,topic1,1609787856052 ZagAAEfORQG-lxwq6OsV5Q,topic1,1609787857052 ZagAAEfORQG-lxwq6OsV5Q,topic1,1609787857054 Processed a total of 10 messages timestamp info (group: topic-partition): replicator-dc2-to-dc1-topic1: topic1-0 1609787867007 replicator-dc2-to-dc1-topic1: topic1-0 1609787877008 Processed a total of 2 messages cluster id: 2qHo2TsdTIaTjvkCyf3qdw ``` #### Configure and run Replicator on the Connect cluster You should have at least one distributed mode Connect Worker already up and running. To learn more, review the [distributed mode documentation](/kafka-connectors/self-managed/userguide.html#distributed-mode) . You can check if the Connect Worker is up and running by checking its REST API: ```bash curl http://localhost:8083/ {"version":"8.1.0-ccs","commit":"078e7dc02a100018"} ``` If everything is fine, you will see a version number and commit hash for the version of the Connect Worker you are running. Run Replicator by sending the Connect REST API its configuration file in JSON format. Here’s an example configuration: ```none { "name":"replicator", "config":{ "connector.class":"io.confluent.connect.replicator.ReplicatorSourceConnector", "tasks.max":4, "key.converter":"io.confluent.connect.replicator.util.ByteArrayConverter", "value.converter":"io.confluent.connect.replicator.util.ByteArrayConverter", "src.kafka.bootstrap.servers":"localhost:9082", "topic.whitelist":"test-topic", "topic.rename.format":"${topic}.replica", "confluent.license":"XYZ" } } ``` You can send this to Replicator using `curl`. This assumes the above JSON is in a file called `example-replicator.json`: ```none curl -X POST -d @example-replicator.json http://localhost:8083/connectors --header "content-Type:application/json" ``` This example demonstrates use of some important configuration parameters. For an explanation of all configuration parameters, see [Replicator Configuration Reference for Confluent Platform](configuration_options.md#replicator-config-options). * `key.converter` and `value.converter` - Classes used to convert Kafka records to Connect’s internal format. The Connect Worker configuration specifies global converters and those will be used if you don’t specify anything in the Replicator configuration. 
For replication, however, no conversion is necessary. You just want to read bytes out of the origin cluster and write them to the destination with no changes. Therefore, you can override the global converters with the `ByteArrayConverter`, which leaves the records as-is. * `src.kafka.bootstrap.servers` - A list of brokers from the **origin** cluster * `topic.whitelist` - An explicit list of the topics that you want replicated. The quick start replicates a topic named `test-topic`. * `topic.rename.format` - A substitution string that is used to rename topics in the destination cluster. The snippet above uses `${topic}.replica`, where `${topic}` will be substituted with the topic name from the origin cluster. That means that the `test-topic` being replicated from the origin cluster will be renamed to `test-topic.replica` in the destination cluster. * `confluent.license` - You cannot use Confluent Replicator without a license on Confluent Platform versions 5.5.0 and later, as there is no trial period for Replicator on these newer Confluent Platform versions. Contact Confluent Support for more information. ## Suggested Reading * [Use Schema Registry to Migrate Schemas in Confluent Platform](../../schema-registry/installation/migrate.md#schemaregistry-migrate) * [Schemas, subjects, and topics](../../schema-registry/fundamentals/index.md#sr-subjects-topics-primer) * [Tutorial: Replicate Data Across Kafka Clusters in Confluent Platform](replicator-quickstart.md#replicator-quickstart) * [Configure Replicator for Cross-Cluster Failover in Confluent Platform](replicator-failover.md#replicator-failover) * These sections in [Replicator Configuration Properties](https://docs.confluent.io/kafka-connect-replicator/current/configuration_options.html): - [Source Topics](https://docs.confluent.io/kafka-connect-replicator/current/configuration_options.html#destination-data-conversion) - [Destination Topics](https://docs.confluent.io/kafka-connect-replicator/current/configuration_options.html#destination-topics) - [Schema Translation](https://docs.confluent.io/kafka-connect-replicator/current/configuration_options.html#schema-translation) ## Confluent Community software / Kafka New features in Confluent Platform 8.1 include the following: * [KIP-932 Queues for Kafka:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka) Queues for Kafka is now available as a preview. You can test this feature now, and clusters running the preview can be upgraded to the production-ready feature when Queues becomes generally available. Note that some features, such as the partition assignor, are still in development. For configuration and testing details, see the [Apache Queues for Kafka Preview](https://cwiki.apache.org/confluence/display/KAFKA/Queues+for+Kafka+%28KIP-932%29+-+Preview+Release+Notes) documentation. - The preview includes [KIP-1103](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1103%3A+Additional+metrics+for+cooperative+consumption), which adds metrics for Share Groups. - To enable share groups, set `share.version=1` by using the `kafka-features.sh` tool (replace `localhost:9092` with your cluster's bootstrap address): ```shell bin/kafka-features.sh --bootstrap-server localhost:9092 upgrade --feature share.version=1 ``` - To provide feedback or to discuss the Queues for Kafka preview, contact Confluent. * [KIP-853 KRaft Controller Membership Changes:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-853%3A+KRaft+Controller+Membership+Changes) You can now upgrade KRaft voters from a static to a dynamic configuration. 
* [KIP-1166 Improve high-watermark replication:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1166:+Improve+high-watermark+replication) This KIP fixes issues where pending fetch requests could fail to complete, which previously impacted high-watermark progression. * [KIP-890 Transactions Server-Side Defense:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-890%3A+Transactions+Server-Side+Defense) To prevent an infinite “out-of-order sequence” error, idempotent producers now reject non-zero sequences when no producer ID state exists on the partition for the transaction. * [KIP-848 Consumer Group Protocol:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generation+of+the+Consumer+Rebalance+Protocol) This KIP includes several enhancements, such as a new rack-aware assignor ([KIP-1101](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1101%3A+Trigger+rebalance+on+rack+topology+changes)). The new assignor makes rack-aware partition assignment significantly more memory-efficient, which supports hundreds of members in a single consumer group. * [KIP-1131 Improved controller-side metrics for monitoring broker states:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1131%3A+Improved+controller-side+monitoring+of+broker+states) This KIP adds new controller-side metrics to improve monitoring of broker health and status. * [KIP-1109 Unifying Kafka consumer topic metrics:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1109%3A+Unifying+Kafka+Consumer+Topic+Metrics) Kafka consumer metrics now preserve periods (`.`) in topic names instead of replacing them with underscores (`_`). This change aligns their behavior with producer metrics. The old metrics that use underscores in topic names will be removed in a future release. * [KIP-1118 Add deadlock protection on the producer network thread:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1118%3A+Add+Deadlock+Protection+on+Producer+Network+Thread) Calling `KafkaProducer.flush()` from within the `KafkaProducer.send()` callback now raises an exception to prevent a potential deadlock in the producer. * [KIP-1143 Deprecate Optional and return String from public Endpoint:](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=345377327) The `Endpoint.listenerName()` method that returns `Optional` is now deprecated. You should update your code to use the new method that returns a `String`. * [KIP-1152 Add transactional ID pattern filter to ListTransactions API:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1152%3A+Add+transactional+ID+pattern+filter+to+ListTransactions+API) You can now filter transactions by a transactional ID pattern when using the `kafka-transactions.sh` tool or the Admin API. This feature avoids the need to retrieve all transactions and filter them on the client side. For a full list of KIPs, features, and bug fixes, see the [Apache Kafka 4.1 release notes](https://archive.apache.org/dist/kafka/4.1.0/RELEASE_NOTES.html). ### GET /subjects Get a list of registered subjects. (For API usage examples, see [List all subjects](using.md#kafka-key-listing-all-subjects).) * **Parameters:** * **subjectPrefix** (*string*) – Add `?subjectPrefix=` (as an empty string) at the end of this request to list subjects in the default context. If this flag is not included, `GET /subjects` returns all subjects across all contexts. 
To learn more about contexts, see the [exporters](#schemaregistry-api-exporters) API reference and the quick start and concepts guides for [Schema Linking on Confluent Platform](../schema-linking-cp.md#schema-linking-cp-overview) and [Schema Linking on Confluent Cloud](/cloud/current/sr/schema-linking.html). * **deleted** (*boolean*) – Add `?deleted=true` at the end of this request to list both current and soft-deleted subjects. The default is `false`. If this flag is not included, only current subjects are listed (not those that have been soft-deleted). Hard and soft delete are explained below in the description of the `delete` API. * **Response JSON Array of Objects:** * **name** (*string*) – Subject * **Status Codes:** * [500 Internal Server Error](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.1) – * Error code 50001 – Error in the backend datastore **Example request**: ```http GET /subjects HTTP/1.1 Host: schemaregistry.example.com Accept: application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json ``` **Example response**: ```http HTTP/1.1 200 OK Content-Type: application/vnd.schemaregistry.v1+json ["subject1", "subject2"] ``` #### NOTE JSON and PROTOBUF_NOSR are not supported. For more details, see [What’s supported](../../index.md#sr-supported-features) in the Schema Registry overview. Use the serializer and deserializer for your schema format. Specify the serializer in the code for the Kafka producer to send messages, and specify the deserializer in the code for the Kafka consumer to read messages. The new Protobuf and JSON Schema serializers and deserializers support many of the same configuration properties as the Avro equivalents, including [subject name strategies](#sr-schemas-subject-name-strategy) for the key and value. In the case of the `RecordNameStrategy` (and `TopicRecordNameStrategy`), the subject name will be: - For Avro, the record fullname (namespace + record name). - For Protobuf, the message name. - For JSON Schema, the title. When using `RecordNameStrategy` with Protobuf and JSON Schema, there is additional configuration that is required. This, along with examples and command line testing utilities, is covered in the deep dive sections: - [Avro](serdes-avro.md#serdes-and-formatter-avro) - [Protobuf](serdes-protobuf.md#serdes-and-formatter-protobuf) - [JSON Schema](serdes-json.md#serdes-and-formatter-json) In addition to the detailed sections above, produce and consume examples are available in [confluentinc/confluent-kafka-go/examples](https://github.com/confluentinc/confluent-kafka-go/tree/master/examples) for each of the different Schema Registry SerDes. The serializers and [Kafka Connect converters](/platform/current/connect/concepts.html#converters) for all supported schema formats automatically register schemas by default. The Protobuf serializer recursively registers all referenced schemas separately. With Protobuf and JSON Schema support, the Schema Registry adds the ability to add new schema formats using schema plugins (the existing Avro support has been wrapped with an Avro schema plugin). 
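As a rough illustration under assumed names (a topic `orders`, an Avro record `example.Order`, and a local Schema Registry at `http://localhost:8081`), the subject name strategy can be set as a serializer property, `value.subject.name.strategy` for values or `key.subject.name.strategy` for keys; for example, with the Avro console producer:

```bash
./bin/kafka-avro-console-producer --bootstrap-server localhost:9092 \
  --topic orders \
  --property schema.registry.url=http://localhost:8081 \
  --property value.subject.name.strategy=io.confluent.kafka.serializers.subject.RecordNameStrategy \
  --property value.schema='{"type":"record","name":"Order","namespace":"example","fields":[{"name":"id","type":"string"}]}'
```

With `RecordNameStrategy`, the value schema in this sketch would be registered under the subject `example.Order` (the record's full name) rather than the default `orders-value`.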
#### Avro consumer For example, to consume messages from the beginning (`--from-beginning`) from the `stocks` topic on a Confluent Cloud cluster: ```bash ./bin/kafka-avro-console-consumer --bootstrap-server $BOOTSTRAP_SERVER \ --property basic.auth.credentials.source="USER_INFO" \ --property print.key=true --property print.schema.ids=true \ --property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \ --property schema.registry.url=$SCHEMA_REGISTRY_URL \ --consumer.config /Users/vicky/creds.config \ --topic stocks --from-beginning \ --property schema.registry.basic.auth.user.info=$SR_APIKEY:$SR_APISECRET ``` This results in output similar to the following, with the schema ID showing at the end of each message line: ```bash ... ZVZZT {"side":"SELL","quantity":1546,"symbol":"ZVZZT","price":629,"account":"ABC123","userid":"User_4"} 100008 ZJZZT {"side":"SELL","quantity":765,"symbol":"ZJZZT","price":140,"account":"ABC123","userid":"User_2"} 100008 ZJZZT {"side":"BUY","quantity":2977,"symbol":"ZJZZT","price":264,"account":"ABC123","userid":"User_9"} 100008 ... ``` To drill down on a particular subset of messages, determine the offset and partition you want to focus on. You can use the Confluent Cloud Console to navigate to a particular offset and partition. ![image](schema-registry/images/serdes-message-per-offset-partition.png) For example, to show messages from the `stocks` topic, starting at offset `15846316` on partition `0`, replace `--from-beginning` in the command with the `--offset` and `--partition` values you want to explore. To limit the number of messages, you can add a value for `--max-messages`, such as `5` in this example: ```bash ./bin/kafka-avro-console-consumer --bootstrap-server $BOOTSTRAP_SERVER \ --property basic.auth.credentials.source="USER_INFO" \ --property print.key=true --property print.schema.ids=true \ --offset 15846316 --partition 0 --max-messages 5 \ --property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \ --property schema.registry.url=$SCHEMA_REGISTRY_URL \ --consumer.config /Users/vicky/creds.config --topic stocks \ --property schema.registry.basic.auth.user.info=$SR_APIKEY:$SR_APISECRET ``` The output for this example is: ```bash .... ZWZZT {"side":"SELL","quantity":1905,"symbol":"ZWZZT","price":33,"account":"LMN456","userid":"User_9"} 100008 ZVV {"side":"BUY","quantity":4288,"symbol":"ZVV","price":795,"account":"XYZ789","userid":"User_9"} 100008 ZVV {"side":"BUY","quantity":235,"symbol":"ZVV","price":918,"account":"ABC123","userid":"User_7"} 100008 ZWZZT {"side":"BUY","quantity":3041,"symbol":"ZWZZT","price":759,"account":"LMN456","userid":"User_3"} 100008 ZVV {"side":"BUY","quantity":3080,"symbol":"ZVV","price":79,"account":"XYZ789","userid":"User_7"} 100008 Processed a total of 5 messages ``` ## Quick starts - The Schema Registry tutorials provide full walkthroughs on how to enable client applications to read and write Avro data, check schema version compatibility, and use the UIs to manage schemas. - **Schema Registry Tutorial on Confluent Cloud**: Sign up for [Confluent Cloud](https://www.confluent.io/confluent-cloud/) and use the [Confluent Cloud Schema Registry Tutorial](/cloud/current/sr/schema_registry_ccloud_tutorial.html) to get started. 
- **Schema Registry Tutorial on Confluent Platform**: Download [Confluent Platform](https://www.confluent.io/download/#confluent-platform) and use the [Confluent Platform Schema Registry Tutorial](/platform/current/schema-registry/schema_registry_tutorial.html) to get started. - For a quick hands on introduction, jump to the [Schema Registry module of the free Apache Kafka 101](https://developer.confluent.io/learn-kafka/apache-kafka/schema-registry/) course to learn why you would need a Schema Registry, what it is, and how to get started. Also see the free [Schema Registry 101](https://developer.confluent.io/learn-kafka/schema-registry/) course to learn about the schema formats and how to build, register, manage and evolve schemas. - On Confluent Cloud, try out the interactive tutorials embedded in the Cloud Console. [Take this link to sign up or sign in to Confluent Cloud](https://confluent.cloud/tutorials/schema-registry-getting-started), and try out the guided workflows directly in Confluent Cloud. - To learn about [schema formats](fundamentals/serdes-develop/index.md#serializer-and-formatter), create schemas, and use producers and consumers to send messages to topics, see [Test drive Avro schema](fundamentals/serdes-develop/serdes-avro.md#sr-test-drive-avro), [Test drive Protobuf schema](fundamentals/serdes-develop/serdes-protobuf.md#sr-test-drive-protobuf), and [Test drive JSON Schema](fundamentals/serdes-develop/serdes-json.md#sr-test-drive-json-schema). #### Before You Begin If you are new to Confluent Platform, consider first working through these quick starts and tutorials to get a baseline understanding of the platform (including the role of producers, consumers, and brokers), Confluent Cloud, and Schema Registry. Experience with these workflows will give you better context for schema migration. - [Quick Start for Confluent Platform](../../get-started/platform-quickstart.md#quickstart) - [Quick Start for Apache Kafka using Confluent Cloud](/cloud/current/get-started/index.html) - [Tutorial: Use Schema Registry on Confluent Platform to Implement Schemas for a Client Application](../schema_registry_onprem_tutorial.md#schema-registry-onprem-tutorial) Before you begin schema migration, verify that you have: - [Access to Confluent Cloud](https://www.confluent.io/confluent-cloud/) to serve as the destination Schema Registry - A local install of Confluent Platform; for example, from a [Quick Start for Confluent Platform](../../get-started/platform-quickstart.md#quickstart) download, or other cluster to serve as the origin Schema Registry. Schema migration requires that you configure and run Replicator. If you need more information than is included in the examples here, refer to the [replicator tutorial](../../multi-dc-deployments/replicator/replicator-quickstart.md#replicator-quickstart). ##### Recommended Deployment ![image](images/multi-dc-setup-kafka.png) The image above shows two datacenters - DC A, and DC B. Either could be on-premises, in [Confluent Cloud](/cloud/current/index.html), or part of a [bridge to cloud](installation/migrate.md#schemaregistry-migrate) solution. Each of the two datacenters has its own Apache Kafka® cluster, ZooKeeper cluster, and Schema Registry. The Schema Registry nodes in both datacenters link to the primary Kafka cluster in DC A, and the secondary datacenter (DC B) forwards Schema Registry writes to the primary (DC A). Note that Schema Registry nodes and hostnames must be addressable and routable across the two sites to support this configuration. 
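For reference, a minimal sketch of what a Schema Registry properties file in the secondary datacenter (DC B) might look like under this arrangement; hostnames are placeholders and security settings are omitted:

```bash
# Hypothetical schema-registry.properties for a DC B instance
listeners=http://0.0.0.0:8081
# Point the Schema Registry store at the primary (DC A) Kafka cluster
kafkastore.bootstrap.servers=PLAINTEXT://dc-a-broker1:9092,PLAINTEXT://dc-a-broker2:9092
kafkastore.topic=_schemas
# Secondary-datacenter instances are not eligible to become leader
leader.eligibility=false
```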
Schema Registry instances in DC B have `leader.eligibility` set to false, meaning that none can be elected leader during steady state operation with both datacenters online. To protect against complete loss of DC A, Kafka cluster A (the source) is replicated to Kafka cluster B (the target). This is achieved by running the [Replicator](../multi-dc-deployments/replicator/index.md#replicator-detail) local to the target cluster (DC B). In this active-passive setup, Replicator runs in one direction, copying Kafka data and configurations from the active DC A to the passive DC B. The Schema Registry instances in both data centers point to the internal `_schemas` topic in DC A. For the purposes of disaster recovery, you must replicate the [internal schemas topic](fundamentals/index.md#schemaregistry-design) itself. If DC A goes down, the system will failover to DC B. Therefore, DC B needs a copy of the `_schemas` topic for this purpose. Producers write data to just the active cluster. Depending on the overall design, consumers can read data from the active cluster only, leaving the passive cluster for disaster recovery, or from both clusters to optimize reads on a geo-local cache. In the event of a partial or complete disaster in one datacenter, applications can failover to the secondary datacenter. ### Kafka Connect This section describes how to enable security for Kafka Connect. Securing Kafka Connect requires that you configure security for: 1. Kafka Connect workers: part of the Kafka Connect API, a worker is really just an advanced client, underneath the covers 2. Kafka Connect connectors: connectors may have embedded producers or consumers, so you must override the default configurations for Connect producers used with source connectors and Connect consumers used with sink connectors 3. Kafka Connect REST: Kafka Connect exposes a REST API that can be configured to use TLS/SSL using [additional properties](../../../protect-data/encrypt-tls.md#encryption-ssl-rest) Configure security for Kafka Connect as described in the section below. Additionally, if you are using Confluent Control Center streams monitoring for Kafka Connect, configure security for: * [Confluent Metrics Reporter](#sasl-gssapi-metrics-reporter) Configure all the following properties in `connect-distributed.properties`. 1. Configure the Connect workers to use SASL/GSSAPI. ```bash sasl.mechanism=GSSAPI sasl.kerberos.service.name=kafka # Configure SASL_SSL if TLS/SSL encryption is enabled, otherwise configure SASL_PLAINTEXT security.protocol=SASL_SSL ``` 2. Configure the JAAS configuration property with a unique principal, i.e., usually the same name as the user running the worker, and keytab, i.e., secret key, for each worker. ```bash sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \ useKeyTab=true \ storeKey=true \ keyTab="/etc/security/keytabs/kafka_client.keytab" \ principal="connect@EXAMPLE.COM"; ``` 3. For the connectors to leverage security, you also have to override the default producer/consumer configuration that the worker uses. Depending on whether the connector is a source or sink connector: * Source connector: configure the same properties adding the `producer` prefix. 
```bash producer.sasl.mechanism=GSSAPI producer.sasl.kerberos.service.name=kafka # Configure SASL_SSL if TLS/SSL encryption is enabled, otherwise configure SASL_PLAINTEXT producer.security.protocol=SASL_SSL producer.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \ useKeyTab=true \ storeKey=true \ keyTab="/etc/security/keytabs/kafka_client.keytab" \ principal="connect@EXAMPLE.COM"; ``` * Sink connector: configure the same properties adding the `consumer` prefix. ```bash consumer.sasl.mechanism=GSSAPI consumer.sasl.kerberos.service.name=kafka # Configure SASL_SSL if TLS/SSL encryption is enabled, otherwise configure SASL_PLAINTEXT consumer.security.protocol=SASL_SSL consumer.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \ useKeyTab=true \ storeKey=true \ keyTab="/etc/security/keytabs/kafka_client.keytab" \ principal="connect@EXAMPLE.COM"; ``` ### Configure .NET clients Configure your .NET client with UAMI-specific properties:
```csharp
using Confluent.Kafka;
using System;
using System.Threading.Tasks;

public class UamiKafkaClient
{
    // Azure IMDS API version - use 2025-04-07 or later
    private const string azureIMDSApiVersion = "2025-04-07";
    private const string bootstrapEndpoint = "";
    private const string uamiClientId = "";
    private const string azureIMDSQueryParams = $"api-version={azureIMDSApiVersion}&resource={bootstrapEndpoint}&client_id={uamiClientId}";
    private const string kafkaLogicalCluster = "";
    private const string identityPoolId = "";

    public static async Task Main(string[] args)
    {
        var bootstrapServers = args[0];
        var topicName = "test-topic";
        var groupId = Guid.NewGuid().ToString();

        var commonConfig = new ClientConfig
        {
            BootstrapServers = bootstrapServers,
            SecurityProtocol = SecurityProtocol.SaslSsl,
            SaslMechanism = SaslMechanism.OAuthBearer,
            SaslOauthbearerMethod = SaslOauthbearerMethod.Oidc,
            SaslOauthbearerMetadataAuthenticationType = SaslOauthbearerMetadataAuthenticationType.AzureIMDS,
            SaslOauthbearerConfig = $"query={azureIMDSQueryParams}",
            SaslOauthbearerExtensions = $"logicalCluster={kafkaLogicalCluster},identityPoolId={identityPoolId}"
        };

        // Use commonConfig with ProducerBuilder or ConsumerBuilder
        using (var producer = new ProducerBuilder<Null, string>(commonConfig).Build())
        {
            // Producer code here
        }
    }
}
```
## Configure Kafka Connect This section describes how to enable security for Kafka Connect. Securing Kafka Connect requires that you configure security for: 1. Kafka Connect workers: part of the Kafka Connect API, a worker is really just an advanced client, underneath the covers 2. Kafka Connect connectors: connectors may have embedded producers or consumers, so you must override the default configurations for Connect producers used with source connectors and Connect consumers used with sink connectors 3. Kafka Connect REST: Kafka Connect exposes a REST API that can be configured to use TLS/SSL using [additional properties](../../../protect-data/encrypt-tls.md#encryption-ssl-rest) Configure security for Kafka Connect as described in the section below. Additionally, if you are using Confluent Control Center streams monitoring for Kafka Connect, configure security for: * [Confluent Metrics Reporter](#sasl-plain-metrics-reporter) Configure all the following properties in `connect-distributed.properties`. 1. Configure the Connect workers to use SASL/PLAIN. ```bash sasl.mechanism=PLAIN ``` ### Principal A principal is an entity that can be authenticated by the authorizer. 
Clients of a Confluent Server broker identify themselves as a particular principal using various security protocols. The way a principal is identified depends upon which security protocol it uses to connect to the Confluent Server broker (for example: [mTLS](../../../kafka/configure-mds/mutual-tls-auth-rbac.md#mutual-tls-auth-rbac), [SASL/GSSAPI](../../authentication/sasl/gssapi/overview.md#kafka-sasl-auth-gssapi), or [SASL/PLAIN](../../authentication/sasl/plain/overview.md#kafka-sasl-auth-plain)). Authentication depends on the security protocol in place (such as SASL or TLS) to recognize a principal within a Confluent Server broker. The following examples show the principal name format based on the security protocol being used: - When a client connects to a Confluent Server broker using the TLS security protocol, the principal name will be in the form of the TLS certificate subject name: `CN=quickstart.confluent.io,OU=TEST,O=Sales,L=PaloAlto,ST=Ca,C=US`. Note that there are no spaces after the comma between subject parts. - When a client connects to a Confluent Server broker using the SASL security protocol with GSSAPI (Kerberos) mechanism, the principal will be in the Kerberos principal format: `kafka-client@hostname.com`. For more detail, refer to [Kerberos Principal Names](https://docs.oracle.com/cd/E19253-01/816-4557/refer-31/index.html). - When a client connects to a Confluent Server broker using the SASL security protocol with a PLAIN or SCRAM mechanism, the principal is a simple text string, such as `alice`, `admin`, or `billing_etl_job_03`. In the following ACL, the plain text principals (`User:alice`, `User:fred`) are identified as Kafka users who are allowed to run specific operations (read and write) from either of the specified hosts (host-1, host-2) on a specific resource (topic): ```shell kafka-acls --bootstrap-server localhost:9092 \ --command-config adminclient-configs.conf \ --add \ --allow-principal User:alice \ --allow-principal User:fred \ --allow-host host-1 \ --allow-host host-2 \ --operation read \ --operation write \ --topic finance-topic ``` To follow best practices, create one principal per application and give each principal only the ACLs required and no more. For example, if Alice is writing three programs that access different topics to automate a billing workflow, she could create three principals: `billing_etl_job_01`, `billing_etl_job_02`, and `billing_etl_job_03`. She would then grant each principal permissions on only the required topics and run each program with its specific principal. Alternatively, she could take a middle-ground approach and create a single `billing_etl_jobs` principal with access to all topics that the billing programs require and run all three with that principal. Alice should not run these programs as her own principal because she would presumably have broader permissions than the jobs actually need. Running with one principal per application also helps significantly with debugging and auditing because it’s clearer which application is performing each operation. ## Adding security to brokers and clients running TLS or SASL authentication You can secure a running Confluent Platform cluster using one or more of the supported protocols. This is done in phases: 1. Incrementally restart the cluster nodes to open additional secured port(s). 2. Restart Kafka clients using the secured rather than `PLAINTEXT` port (assuming you are securing the client-broker connection). 3. 
Incrementally restart the cluster again to enable broker-to-broker security (if this is required). 4. A final incremental restart to close the `PLAINTEXT` port. The specific steps for configuring security protocols are described in the respective sections for [TLS](authentication/mutual-tls/overview.md#kafka-ssl-authentication) and [SASL](authentication/overview.md#kafka-sasl-auth). Follow these steps to enable security for your desired protocol(s). The security implementation lets you configure different protocols for both broker-client and broker-broker communication. These must be enabled in separate restarts. A `PLAINTEXT` port must be left open throughout so brokers and/or clients can continue to communicate. When performing an incremental restart, take into consideration the recommendations for doing [rolling restarts](../kafka/post-deployment.md#rolling-restart) to avoid downtime for end users. For example, if you want to encrypt both broker-client and broker-broker communication with TLS: 1. In the first incremental restart, open a TLS port on each node: ```bash listeners=PLAINTEXT://broker1:9091,SSL://broker1:9092 ``` #### NOTE In Confluent Platform clusters, you can update some Confluent Server broker configurations without restarting the broker by adding or removing listeners [dynamically](../kafka/dynamic-config.md#kafka-dynamic-configurations). When adding a new listener, provide the security configuration of the listener using the listener prefix `listener.name.{listenerName}`. If the new listener uses SASL, then provide the JAAS configuration property `sasl.jaas.config` with the listener and mechanism prefix. For more details, refer to [JAAS](authentication/sasl/gssapi/overview.md#jaas-config). 2. Then restart the Kafka clients, changing their configuration to point at the newly-opened, secured port: ```bash bootstrap.servers=[broker1:9092,...] security.protocol=SSL ...etc ``` For more details, refer to [Protect Data in Motion with TLS Encryption in Confluent Platform](protect-data/encrypt-tls.md#kafka-ssl-encryption). 3. In the second incremental server restart, instruct Confluent Platform to use TLS as the broker-broker protocol (which will use the same TLS port): ```bash listeners=PLAINTEXT://broker1:9091,SSL://broker1:9092 security.inter.broker.protocol=SSL ``` 4. In the final restart, secure the cluster by closing the `PLAINTEXT` port: ```bash listeners=SSL://broker1:9092 security.inter.broker.protocol=SSL ``` #### NOTE Use `GenericAvroSerde` to enable both forward and backward schema compatibility if your application requires forward schema checks on the producer and backward compatibility for Kafka Streams. 
Usage example for Confluent `GenericAvroSerde`:
```java
// Generic Avro serde example
import org.apache.avro.generic.GenericRecord;
import io.confluent.kafka.streams.serdes.avro.GenericAvroSerde;

// When configuring the default serdes of StreamsConfig
final Properties streamsConfiguration = new Properties();
streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, GenericAvroSerde.class);
streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, GenericAvroSerde.class);
streamsConfiguration.put("schema.registry.url", "http://my-schema-registry:8081");

// When you want to override serdes explicitly/selectively
final Map<String, String> serdeConfig = Collections.singletonMap("schema.registry.url", "http://my-schema-registry:8081");
// The generic serdes operate on Avro `GenericRecord` instances
final Serde<GenericRecord> keyGenericAvroSerde = new GenericAvroSerde();
keyGenericAvroSerde.configure(serdeConfig, true); // `true` for record keys
final Serde<GenericRecord> valueGenericAvroSerde = new GenericAvroSerde();
valueGenericAvroSerde.configure(serdeConfig, false); // `false` for record values

StreamsBuilder builder = new StreamsBuilder();
KStream<GenericRecord, GenericRecord> textLines = builder.stream("my-avro-topic", Consumed.with(keyGenericAvroSerde, valueGenericAvroSerde));
```
Usage example for Confluent `SpecificAvroSerde`:
```java
// Specific Avro serde example
import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde;

// When configuring the default serdes of StreamsConfig
final Properties streamsConfiguration = new Properties();
streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, SpecificAvroSerde.class);
streamsConfiguration.put("schema.registry.url", "http://my-schema-registry:8081");

// When you want to override serdes explicitly/selectively
final Map<String, String> serdeConfig = Collections.singletonMap("schema.registry.url", "http://my-schema-registry:8081");
// `Foo` and `Bar` are Java classes generated from Avro schemas
final Serde<Foo> keySpecificAvroSerde = new SpecificAvroSerde<>();
keySpecificAvroSerde.configure(serdeConfig, true); // `true` for record keys
final Serde<Bar> valueSpecificAvroSerde = new SpecificAvroSerde<>();
valueSpecificAvroSerde.configure(serdeConfig, false); // `false` for record values

StreamsBuilder builder = new StreamsBuilder();
KStream<Foo, Bar> textLines = builder.stream("my-avro-topic", Consumed.with(keySpecificAvroSerde, valueSpecificAvroSerde));
```
When you create source streams, you specify input serdes by using the Streams DSL. When you construct the processor topology by using the lower-level [Processor API](processor-api.md#streams-developer-guide-processor-api), you can specify the serde class, like the Confluent `GenericAvroSerde` and `SpecificAvroSerde` classes.
```java
Topology topology = new Topology();
topology.addSource("Source", keyGenericAvroSerde.deserializer(), valueGenericAvroSerde.deserializer(), inputTopic);
```
## Using Kafka Streams within your application code You can call Kafka Streams from anywhere in your application code, but usually these calls are made within the `main()` method of your application, or some variant thereof. The basic elements of defining a processing topology within your application are described below. First, you must create an instance of `KafkaStreams`. 
* The first argument of the `KafkaStreams` constructor takes a topology (either `StreamsBuilder#build()` for the [DSL](dsl-api.md#streams-developer-guide-dsl) or `Topology` for the [Processor API](processor-api.md#streams-developer-guide-processor-api)) that is used to define a topology. * The second argument is an instance of `java.util.Properties`, which defines the configuration for this specific topology. Code example:
```java
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;

// Use the builders to define the actual processing topology, e.g. to specify
// from which input topics to read, which stream operations (filter, map, etc.)
// should be called, and so on. We will cover this in detail in the subsequent
// sections of this Developer Guide.
StreamsBuilder builder = ...;  // when using the DSL
Topology topology = builder.build();
//
// OR
//
Topology topology = ...; // when using the Processor API

// Use the configuration properties to tell your application where the Kafka cluster is,
// which Serializers/Deserializers to use by default, to specify security settings,
// and so on.
Properties props = ...;

KafkaStreams streams = new KafkaStreams(topology, props);
```
At this point, internal structures are initialized, but the processing is not started yet. You have to explicitly start the Kafka Streams thread by calling the `KafkaStreams#start()` method:
```java
// Start the Kafka Streams threads
streams.start();
```
If there are other instances of this stream processing application running elsewhere (e.g., on another machine), Kafka Streams transparently re-assigns tasks from the existing instances to the new instance that you just started. For more information, see [Stream partitions and tasks](../architecture.md#streams-architecture-tasks) and [Threading model](../architecture.md#streams-architecture-threads). To catch any unexpected exceptions, you can set a `StreamsUncaughtExceptionHandler` before you start the application. This handler is called whenever a stream thread is terminated by an unexpected exception:
```java
streams.setUncaughtExceptionHandler((exception) -> StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse.REPLACE_THREAD);
```
The `StreamsUncaughtExceptionHandler` interface enables responding to exceptions not handled by Kafka Streams. It has one method, `handle`, that returns an enum of type `StreamThreadExceptionResponse`. You have the opportunity to define how Kafka Streams responds to the exception, with three possible values: `REPLACE_THREAD`, `SHUTDOWN_CLIENT`, or `SHUTDOWN_APPLICATION`. #### Option 3: Quarantine corrupted records (dead letter queue) You can provide your own `DeserializationExceptionHandler` implementation. For example, you can choose to forward corrupt records into a quarantine topic (think: a “dead letter queue”) for further processing. To do this, use the [Producer API](../clients/overview.md#kafka-clients) to write a corrupted record directly to the quarantine topic. The drawback of this approach is that “manual” writes are side effects that are invisible to the Kafka Streams runtime library, so they do not benefit from the end-to-end processing guarantees of the Streams API. 
Code example:
```java
public class SendToDeadLetterQueueExceptionHandler implements DeserializationExceptionHandler {
    KafkaProducer<byte[], byte[]> dlqProducer;
    String dlqTopic;

    @Override
    public DeserializationHandlerResponse handle(final ProcessorContext context,
                                                 final ConsumerRecord<byte[], byte[]> record,
                                                 final Exception exception) {
        log.warn("Exception caught during Deserialization, sending to the dead queue topic; " +
                "taskId: {}, topic: {}, partition: {}, offset: {}",
                context.taskId(), record.topic(), record.partition(), record.offset(),
                exception);
        dlqProducer.send(new ProducerRecord<>(dlqTopic, null, record.timestamp(), record.key(), record.value()));
        return DeserializationHandlerResponse.CONTINUE;
    }

    @Override
    public void configure(final Map<String, ?> configs) {
        dlqProducer = .. // get a producer from the configs map
        dlqTopic = .. // get the topic name from the configs map
    }
}

Properties streamsSettings = new Properties();
streamsSettings.put(
    StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
    SendToDeadLetterQueueExceptionHandler.class.getName()
);
```
## Confluent provided tools The following CLI tools are officially provided or bundled with your Confluent Platform installation. They offer comprehensive functionality for managing your Kafka cluster and its components, providing seamless integration and dedicated support. - [Bundled CLI Tools](cli-reference.md#cp-all-cli) - These are the core CLI tools included directly with your Confluent Platform installation, located in the `CONFLUENT_HOME/bin` directory. They provide fundamental features for managing your Kafka cluster and its components, including Kafka tools and Confluent-specific utilities. - [Confluent CLI](diagnostics-tool.md#diagnostics-cp) - The Confluent CLI is a unified CLI designed for managing Confluent Platform deployments. It offers a comprehensive set of commands for interacting with Kafka topics, Kafka Connect, ksqlDB, Schema Registry, Role-Based Access Control (RBAC), and more. - [Check Clusters for KRaft Migration](kraft-migration-tool.md#kraft-migration-tool) - The `kraft-migration-tool` is a utility for evaluating your clusters before migration from legacy ZooKeeper-based Kafka clusters to the KRaft mode. This is an important step in the migration process. For more about the migration process, see [Migrate from ZooKeeper to KRaft on Confluent Platform](../installation/migrate-zk-kraft.md#migrate-zk-kraft). - [Confluent Diagnostics Tool](diagnostics-tool.md#diagnostics-cp) - The Confluent Platform Diagnostics Bundle Tool is a dedicated utility for collecting diagnostic information about your Confluent Platform installation. This tool gathers logs, configuration files, process information, and metrics from Kafka brokers and Kafka Connect, consolidating them into a `.tar.gz` file for analysis. This is particularly useful when engaging with [Confluent Support](https://support.confluent.io). - [kcat (formerly kafkacat)](kafkacat-usage.md#kafkacat-usage) - kcat is a command-line utility for testing and debugging Apache Kafka® deployments. You can use kcat to produce and consume messages, and to inspect topic and partition details. ## Set Up Confluent CLI and variables 1. [Install Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html) locally, v3.28.0 or later (if you already have it installed, update the CLI as described in [Upgrade](https://docs.confluent.io/confluent-cli/current/install.html#upgrade)). Verify the installation was successful. ```none confluent version ``` 2. 
Using the CLI, log in to Confluent Cloud with the command `confluent login`, and use your Confluent Cloud username and password. The `--save` argument saves your Confluent Cloud user login credentials for future use. ```shell confluent login --save ``` 3. Use the demo Confluent Cloud environment. ```shell CC_ENV=$(confluent environment list -o json \ | jq -r '.[] | select(.name | contains("cp-demo")) | .id') \ && echo "Your Confluent Cloud environment: $CC_ENV" \ && confluent environment use $CC_ENV ``` 4. Get the Confluent Cloud cluster ID and use the cluster. ```shell CCLOUD_CLUSTER_ID=$(confluent kafka cluster list -o json \ | jq -r '.[] | select(.name | contains("cp-demo")) | .id') \ && echo "Your Confluent Cloud cluster ID: $CCLOUD_CLUSTER_ID" \ && confluent kafka cluster use $CCLOUD_CLUSTER_ID ``` 5. Get the bootstrap endpoint for the Confluent Cloud cluster. ```shell CC_BOOTSTRAP_ENDPOINT=$(confluent kafka cluster describe -o json | jq -r .endpoint) \ && echo "Your Cluster's endpoint: $CC_BOOTSTRAP_ENDPOINT" ``` 6. Create a Confluent Cloud service account for CP Demo and get its ID. ```shell confluent iam service-account create cp-demo-sa --description "service account for cp-demo" \ && SERVICE_ACCOUNT_ID=$(confluent iam service-account list -o json \ | jq -r '.[] | select(.name | contains("cp-demo")) | .id') \ && echo "Your cp-demo service account ID: $SERVICE_ACCOUNT_ID" ``` 7. Get the ID and endpoint URL for your Schema Registry cluster. (**Note:** The Schema Registry cluster was created by default when you [added your cloud environment](/cloud/current/get-started/schema-registry.html#cloud-sr-enable-zones).) ```shell CC_SR_CLUSTER_ID=$(confluent schema-registry cluster describe -o json | jq -r .cluster_id) \ && CC_SR_ENDPOINT=$(confluent schema-registry cluster describe -o json | jq -r .endpoint_url) \ && echo "Schema Registry Cluster ID: $CC_SR_CLUSTER_ID" \ && echo "Schema Registry Endpoint: $CC_SR_ENDPOINT" ``` 8. Create a Schema Registry API key for the cp-demo service account. ```shell confluent api-key create \ --service-account $SERVICE_ACCOUNT_ID \ --resource $CC_SR_CLUSTER_ID \ --description "SR key for cp-demo schema link" ``` Verify your output resembles ```text It may take a couple of minutes for the API key to be ready. Save the API key and secret. The secret is not retrievable later. +---------+------------------------------------------------------------------+ | API Key | SZBKJLD67XK5NZNZ | | Secret | NTqs/A3Mt0Ohkk4fkaIsC0oLQ5Q/F0lLowYo/UrsTrEAM5ozxY7fjqxDdVwMJz99 | +---------+------------------------------------------------------------------+ ``` Set variables to reference the Schema Registry credentials returned in the previous step. ```shell SR_API_KEY=SZBKJLD67XK5NZNZ SR_API_SECRET=NTqs/A3Mt0Ohkk4fkaIsC0oLQ5Q/F0lLowYo/UrsTrEAM5ozxY7fjqxDdVwMJz99 ``` 9. Create a Kafka cluster API key for the cp-demo service account. ```shell confluent api-key create \ --service-account $SERVICE_ACCOUNT_ID \ --resource $CCLOUD_CLUSTER_ID \ --description "Kafka key for cp-demo cluster link" ``` Verify your output resembles ```text It may take a couple of minutes for the API key to be ready. Save the API key and secret. The secret is not retrievable later. 
+---------+-------------------------------------------------------------------+ | API Key | SZBKLMG61XK9NZAB | | Secret | QTpi/A3Mt0Ohkk4fkaIsGR3ATQ5Q/F0lLowYo/UrsTr3AMsozxY7fjqxDdVwMJz02 | +---------+-------------------------------------------------------------------+ ``` Set variables to reference the Kafka credentials returned in the previous step. ```shell CCLOUD_CLUSTER_API_KEY=SZBKLMG61XK9NZAB CCLOUD_CLUSTER_API_SECRET=QTpi/A3Mt0Ohkk4fkaIsGR3ATQ5Q/F0lLowYo/UrsTr3AMsozxY7fjqxDdVwMJz02 ``` 10. We will also need the cluster ID for the on-premises Confluent Platform cluster. ```shell CP_CLUSTER_ID=$(curl -s https://localhost:8091/v1/metadata/id \ --tlsv1.2 --cacert ./scripts/security/snakeoil-ca-1.crt \ | jq -r ".id") \ && echo "Your on-premises Confluent Platform cluster ID: $CP_CLUSTER_ID" ``` ### Topics Confluent Control Center can manage topics in a Kafka cluster. 1. Click **Topics**. 2. Scroll down and click the topic `wikipedia.parsed`. ![image](tutorials/cp-demo/images/topic_list_wikipedia.png) 3. View an overview of this topic: - Throughput - Partition replication status ![image](tutorials/cp-demo/images/topic_actions.png) 4. View which brokers are leaders for which partitions and where all partitions reside. 5. Inspect messages for this topic in real-time. ![image](tutorials/cp-demo/images/topic_inspect.png) 6. View the schema for this topic. For `wikipedia.parsed`, the topic value is using a Schema registered with Schema Registry (the topic key is just a string). ![image](tutorials/cp-demo/images/topic_schema.png) 7. View configuration settings for this topic. ![image](tutorials/cp-demo/images/topic_settings.png) 8. Return to **Topics**, click `wikipedia.parsed.count-by-domain` to view the output topic from the Kafka Streams application. ![image](tutorials/cp-demo/images/count-topic-view.png) 9. Return to **Topics** view and click the **+ Add a topic** button to create a new topic in your Kafka cluster. You can also view and edit settings of Kafka topics in the cluster. Read more on Confluent Control Center [topic management](https://docs.confluent.io/control-center/current/topics/overview.html). ![image](tutorials/cp-demo/images/create_topic.png) # Scripted Confluent Platform Demo The scripted Confluent Platform demo (`cp-demo`) example builds a full Confluent Platform deployment with an Apache Kafka® event streaming application that uses [ksqlDB](../../ksqldb/overview.md#ksql-home) and [Kafka Streams](../../streams/overview.md#kafka-streams) for stream processing, and secures all of the components end-to-end. The tutorial includes a module that makes it a hybrid deployment that runs Cluster Linking and Schema Linking to copy data and schemas from a local on-premises Kafka cluster to Confluent Cloud, a fully-managed service for Kafka. Follow the accompanying guided tutorial to learn how Kafka and Confluent Cloud work with Connect, Confluent Schema Registry, Confluent Control Center, and Cluster Linking with security enabled end-to-end. ### Dimension summary Clusters are billed based on the dimensions listed in the following tables. For every available dimension, the table below lists the [Costs API](https://docs.confluent.io/cloud/current/api.html#tag/Costs-(billingv1)) line item and the unit of measure for the dimension. 
| Dimension | Line Type | Unit of Measure |
|-----------|-----------|-----------------|
| Kafka Storage | `KAFKA_STORAGE` | Cost per GB stored per hour |
| Kafka Ingress | `KAFKA_NETWORK_WRITE` | Cost per GB written |
| Kafka Egress | `KAFKA_NETWORK_READ` | Cost per GB read |
| Confluent Unit for Kafka (CKU/eCKU) | `KAFKA_NUM_CKUS` | Cost per CKU/eCKU per hour |
| Kafka Ingress via Kafka REST APIs | `KAFKA_REST_PRODUCE` | Cost per GB written |
| KSQL Confluent Streaming Unit (CSU) | `KSQL_NUM_CSUS` | Cost per Confluent Streaming Unit (CSU) per hour |
| Connector Capacity for Dedicated Kafka cluster | `CONNECT_CAPACITY` | Cost per hour |
| Connect Task | `CONNECT_NUM_TASKS` | Cost per task per hour |
| Connect Data Transfer | `CONNECT_THROUGHPUT` | Cost per GB written or read |
| Confluent Support Plan | `SUPPORT` | Cost per hour (prorated based on monthly price) |
| Cluster Linking Links | `CLUSTER_LINKING_PER_LINK` | Cost per link per hour |
| Cluster Linking Ingress | `CLUSTER_LINKING_WRITE` | Cost per GB written |
| Cluster Linking Egress | `CLUSTER_LINKING_READ` | Cost per GB read |
| Audit Logs | `AUDIT_LOG_READ` | Cost per GB of data read from audit log topics |
| Stream Governance Base | `GOVERNANCE_BASE` | Cost per hour |
| Schema Registry Schema | `SCHEMA_REGISTRY` | Cost per schema per hour |
| Stream Governance Rule | `NUM_RULES` | Cost per rule per hour |
| Credit | `PROMO_CREDIT` | Credit issued by Confluent |
| Custom Connect Task | `CUSTOM_CONNECT_NUM_TASKS` | Cost per task per hour |
| Custom Connect Data Transfer | `CUSTOM_CONNECT_THROUGHPUT` | Cost per GB written or read per hour |
| Confluent Unit for Flink (CFU) | `FLINK_NUM_CFUS` | Cost per CFU per minute |

### Configure clients from the Confluent Cloud Console

The easiest way to get started connecting your client apps to Confluent Cloud is to copy and paste the configuration file from the Confluent Cloud Console.

1. Log in to Confluent Cloud.
2. Select an environment.
3. Select a cluster.
4. Select **Clients** from the navigation menu.
5. (Optional) Click the **+ New client** button.
6. Select the language you are using for your client application.

   ![image](images/cloud-client-languages.png)

7. Once you have selected a language, create or use existing API keys for your Kafka cluster and Schema Registry cluster as needed. Then, copy and paste the displayed configuration into your client application source code.

   ![image](images/cloud-client-configuration-example.png)

## Unit Testing

Unit tests run very quickly and verify that isolated functional blocks of code work as expected. They can test the logic of your application with minimal dependencies on other services.

ksqlDB exposes a test runner command line tool called [ksql-test-runner](/platform/current/ksqldb/how-to-guides/test-an-app.html) that can automatically test whether your ksqlDB statements behave correctly when given a set of inputs. It runs quickly and doesn’t require a running Kafka or ksqlDB cluster. For an example, see this [Kafka Tutorial](https://developer.confluent.io/tutorials/join-a-stream-to-a-stream/ksql.html). Note that the test runner can change and does not have backward compatibility guarantees.

With a Kafka Streams application, use [TopologyTestDriver](https://docs.confluent.io/platform/current/streams/developer-guide/test-streams.html), a test class that tests Kafka Streams logic.
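For illustration, here is a minimal, hedged sketch of a `TopologyTestDriver` test. The topology (a simple uppercase mapper) and the topic names `input` and `output` are assumptions made for this example, not part of any particular tutorial:

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class UppercaseTopologyTest {

    public static void main(String[] args) {
        // Build a trivial topology: read from "input", uppercase the value, write to "output".
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(value -> value.toUpperCase())
               .to("output", Produced.with(Serdes.String(), Serdes.String()));
        Topology topology = builder.build();

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "topology-test");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234"); // never contacted by the test driver

        TopologyTestDriver driver = new TopologyTestDriver(topology, props);
        try {
            TestInputTopic<String, String> input =
                driver.createInputTopic("input", Serdes.String().serializer(), Serdes.String().serializer());
            TestOutputTopic<String, String> output =
                driver.createOutputTopic("output", Serdes.String().deserializer(), Serdes.String().deserializer());

            // Pipe a single record through the topology and inspect the result.
            input.pipeInput("key", "hello");
            KeyValue<String, String> result = output.readKeyValue();
            System.out.println(result); // KeyValue(key, HELLO)
        } finally {
            driver.close();
        }
    }
}
```

In a real project this would typically live in a JUnit test with assertions instead of a `main` method; the structure of driving records through the topology stays the same.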
The test driver’s start-up time is very fast, and you can test a single message at a time through a Kafka Streams topology, which allows easy debugging and stepping. Refer to the example in this [Kafka Tutorial](https://developer.confluent.io/tutorials/dynamic-output-topic/confluent.html).

If you developed your own Kafka Streams Processor, you may want to unit test it as well. Because the `Processor` forwards its results to the context rather than returning them, unit testing requires a mocked context capable of capturing forwarded data for inspection. For these purposes, use [MockProcessorContext](https://docs.confluent.io/platform/current/streams/developer-guide/test-streams.html), with an example in this [Kafka Streams test](https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/state/internals/TimestampedSegmentsTest.java).

For basic Producers and Consumers, there are mock implementations useful in unit tests. JVM Producer and Consumer unit tests can make use of [MockProducer](/platform/current/clients/javadocs/javadoc/org/apache/kafka/clients/producer/MockProducer.html) and [MockConsumer](/platform/current/clients/javadocs/javadoc/org/apache/kafka/clients/consumer/MockConsumer.html), which implement the same interfaces and mock all the I/O operations as implemented in the `KafkaProducer` and `KafkaConsumer`, respectively. You can refer to a `MockProducer` example in [Build your first Apache KafkaProducer application](https://developer.confluent.io/tutorials/creating-first-apache-kafka-producer-application/confluent.html) and a `MockConsumer` example in [Build your first Apache KafkaConsumer application](https://developer.confluent.io/tutorials/creating-first-apache-kafka-consumer-application/confluent.html).

For non-JVM Producers and Consumers based on librdkafka, the approach [varies by language](https://docs.confluent.io/platform/current/clients/index.html). You could also use [rdkafka_mock](https://github.com/edenhill/librdkafka/blob/master/src/rdkafka_mock.h), a minimal implementation of the Kafka protocol broker APIs with no other dependencies. Refer to an example in [librdkafka](https://github.com/edenhill/librdkafka/blob/master/tests/0105-transactions_mock.c).

#### IMPORTANT

The `ConsumerTimestampsInterceptor` acts as a producer to the `__consumer_timestamps` topic on the source cluster and, as such, requires appropriate security configurations. Provide these settings with the `timestamps.producer.` prefix, for example, `timestamps.producer.security.protocol=SSL` (a sketch of these prefixed settings follows the list below). For more information on security configurations, see:

- [SSL Encryption](/platform/current/kafka/encryption.html#encryption-ssl-clients)
- [SSL Authentication](/platform/current/kafka/authentication_ssl.html#authentication-ssl-clients)
- [SASL/SCRAM](/platform/current/kafka/authentication_sasl/authentication_sasl_scram.html#sasl-scram-clients)
- [SASL/GSSAPI](/platform/current/kafka/authentication_sasl/authentication_sasl_gssapi.html#sasl-gssapi-clients)
- [SASL/PLAIN](/platform/current/kafka/authentication_sasl/authentication_sasl_plain.html#sasl-plain-clients)

The interceptor also requires ACLs for the `__consumer_timestamps` topic: the consumer principal needs WRITE and DESCRIBE operations on that topic.
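As an illustration only, the following sketch shows consumer properties that supply security settings to the interceptor’s internal producer through the `timestamps.producer.` prefix. The interceptor class name, broker address, and file paths here are assumptions for this sketch; confirm the exact values against your Replicator installation and environment:

```java
import java.util.Properties;

// Consumer properties for an application whose offsets should be translated.
Properties props = new Properties();
props.put("bootstrap.servers", "source-broker:9092");   // placeholder broker address
props.put("group.id", "my-consumer-group");              // placeholder group ID

// Register the consumer timestamps interceptor (class name shown as an assumption;
// use the class documented for your Replicator version).
props.put("interceptor.classes",
    "io.confluent.connect.replicator.offsets.ConsumerTimestampsInterceptor");

// Security settings for the interceptor's internal producer to __consumer_timestamps,
// supplied with the timestamps.producer. prefix.
props.put("timestamps.producer.security.protocol", "SSL");
props.put("timestamps.producer.ssl.truststore.location", "/etc/kafka/secrets/client.truststore.jks"); // placeholder path
props.put("timestamps.producer.ssl.truststore.password", "changeit");                                 // placeholder secret
```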
To learn more, see: - [Understanding Consumer Offset Translation](/platform/current/multi-dc-deployments/replicator/replicator-failover.html#consumer-offset-translation-feature) - Discussion on consumer offsets and timestamp preservation in the whitepaper on [Disaster Recovery for Multi-Datacenter Apache Kafka Deployments](https://www.confluent.io/white-paper/disaster-recovery-for-multi-datacenter-apache-kafka-deployments/). # Install Custom Connectors for Confluent Cloud Learn how to install your custom connector in Confluent Cloud. * [Overview](overview.md) * [Quick Start](custom-connector-qs.md) * [Getting a connector](custom-connector-qs.md#getting-a-connector) * [Packaging a custom connector](custom-connector-qs.md#packaging-a-custom-connector) * [Uploading and launching the connector](custom-connector-qs.md#uploading-and-launching-the-connector) * [Manage Custom Connectors](custom-connector-manage.md) * [Search for a custom connector](custom-connector-manage.md#search-for-a-custom-connector) * [Get notifications](custom-connector-manage.md#get-notifications) * [Modify a custom connector configuration](custom-connector-manage.md#modify-a-custom-connector-configuration) * [Override configuration properties](custom-connector-manage.md#override-configuration-properties) * [Update networking endpoints](custom-connector-manage.md#update-networking-endpoints) * [Custom connector logs](custom-connector-manage.md#custom-connector-logs) * [View metrics](custom-connector-manage.md#view-metrics) * [Delete a custom connector](custom-connector-manage.md#delete-a-custom-connector) * [View a custom connector plugin ID](custom-connector-manage.md#view-a-custom-connector-plugin-id) * [Delete a custom connector plugin](custom-connector-manage.md#delete-a-custom-connector-plugin) * [Limitations and Support](custom-connector-fands.md) * [Limitations](custom-connector-fands.md#limitations) * [Shared responsibility](custom-connector-fands.md#shared-responsibility) * [Confluent and Partner support](custom-connector-fands.md#confluent-and-partner-support) * [Certified Partner-built connectors](custom-connector-fands.md#certified-partner-built-connectors) * [Supported AWS, Azure and GCP regions](custom-connector-fands.md#supported-aws-az-and-gcp-regions) * [Schema Registry integration](custom-connector-fands.md#sr-integration) * [App log topic](custom-connector-fands.md#app-log-topic) * [API and CLI](custom-connector-cli.md) * [Custom Connector API](custom-connector-cli.md#custom-connector-api) * [Custom Connector CLI](custom-connector-cli.md#custom-connector-cli) * [Custom Connector Plugin CLI](custom-connector-cli.md#custom-connector-plugin-cli) * [Custom Connector Plugin Version CLI](custom-connector-cli.md#custom-connector-plugin-version-cli) * [Unsupported connector CLI commands](custom-connector-cli.md#unsupported-connector-cli-commands) * [Command reference](custom-connector-cli.md#command-reference) ## Quick Start Use this quick start to get up and running with the Confluent Cloud ActiveMQ source connector. Prerequisites : - Authorized access to a [Confluent Cloud](https://www.confluent.io/confluent-cloud/) cluster on Amazon Web Services (AWS), Microsoft Azure (Azure), or Google Cloud. - Access to an ActiveMQ message broker. - The Confluent CLI installed and configured for the cluster. See [Install the Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html). 
- [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information.
- For networking considerations, see [Networking and DNS](overview.md#connect-internet-access-resources). To use a set of public egress IP addresses, see [Public Egress IP Addresses for Confluent Cloud Connectors](static-egress-ip.md#cc-static-egress-ips).
- Kafka cluster credentials. The following lists the different ways you can provide credentials.
  - Enter an existing [service account](service-account.md#s3-cloud-service-account) resource ID.
  - Create a Confluent Cloud [service account](service-account.md#s3-cloud-service-account) for the connector. Make sure to review the ACL entries required in the [service account documentation](service-account.md#s3-cloud-service-account). Some connectors have specific ACL requirements.
  - Create a Confluent Cloud API key and secret. To create a key and secret, you can use [confluent api-key create](https://docs.confluent.io/confluent-cli/current/command-reference/api-key/confluent_api-key_create.html) *or* you can autogenerate the API key and secret directly in the Cloud Console when setting up the connector.

## Features

The Amazon CloudWatch Logs Source connector provides the following features:

* **At least once delivery**: The connector guarantees that records are delivered at least once to the Kafka topic.
* **Supports multiple tasks**: The connector supports running one or more tasks. More tasks may improve performance. The connector can start with a single task that handles all imported data and can scale up to one task per log stream. Running one task per log stream raises performance up to the per-log-stream limits that Amazon supports (10,000 logs per second or 1 MB per second).
* **Customize topic format**: The connector sources data from a single log group and can write to one topic per log stream. There is a Kafka topic format property (CLI property `kafka.topic.format`) you can use to customize the topic names for each log stream.
* **Supported data formats**: The connector supports Avro, String, and JSON (schemaless) output formats. [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information.
* **Provider integration support**: The connector supports IAM role-based authorization using the Confluent Provider Integration. For more information about provider integration setup, see [IAM roles authentication](#cc-cloudwatch-source-setup-connection).
* **Enhanced log stream capacity**: The connector now supports more than 50 log streams, removing the previous limitation and allowing for greater scalability in log ingestion scenarios.

For more information and examples to use with the Confluent Cloud API for Connect, see the [Confluent Cloud API for Connect Usage Examples](connect-api-section.md#ccloud-connect-api) section.

## Quick Start

Use this quick start to get up and running with the Confluent Cloud Amazon CloudWatch Metrics Sink connector. The quick start provides the basics of selecting the connector and configuring it to send records to Amazon CloudWatch.
Prerequisites:

* Authorized access to a [Confluent Cloud](https://www.confluent.io/confluent-cloud/) cluster on AWS.
* The Confluent CLI installed and configured for the cluster. See [Install the Confluent CLI](https://docs.confluent.io/confluent-cli/current/install.html).
* [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information.
* For networking considerations, see [Networking and DNS](overview.md#connect-internet-access-resources). To use a set of public egress IP addresses, see [Public Egress IP Addresses for Confluent Cloud Connectors](static-egress-ip.md#cc-static-egress-ips).
* An AWS account configured with [Access Keys](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys).
* The Amazon CloudWatch Metrics region must be the same region where your Confluent Cloud cluster is located (where you are running the connector). Note that the hard-coded endpoint URL for the connector is set to `https://monitoring.{kafka-cluster-region}.amazonaws.com`. This sets the Amazon CloudWatch region to your Kafka cluster region.
* Kafka cluster credentials. The following lists the different ways you can provide credentials.
  - Enter an existing [service account](service-account.md#s3-cloud-service-account) resource ID.
  - Create a Confluent Cloud [service account](service-account.md#s3-cloud-service-account) for the connector. Make sure to review the ACL entries required in the [service account documentation](service-account.md#s3-cloud-service-account). Some connectors have specific ACL requirements.
  - Create a Confluent Cloud API key and secret. To create a key and secret, you can use [confluent api-key create](https://docs.confluent.io/confluent-cli/current/command-reference/api-key/confluent_api-key_create.html) *or* you can autogenerate the API key and secret directly in the Cloud Console when setting up the connector.

## Features

* **Auto-created tables**: Tables can be auto-created based on topic names and auto-evolved based on the record schema.
* **Select configuration properties**:
  - `aws.dynamodb.pk.hash`: Defines how the DynamoDB table hash key is extracted from the records. By default, the Kafka partition number where the record is generated is used as the hash key. Other record references can be used to create the hash key. See [DynamoDB hash keys and sort keys](#cc-amazon-dynamodb-sink-hash-sort) for examples.
  - `aws.dynamodb.pk.sort`: Defines how the DynamoDB table sort key is extracted from the records. By default, the record offset is used as the sort key. The sort key can be created from other references. See [DynamoDB hash keys and sort keys](#cc-amazon-dynamodb-sink-hash-sort) for examples.
* **Provider integration support**: The connector supports IAM role-based authorization using Confluent Provider Integration. For more information about provider integration setup, see [IAM roles authentication](#cc-amazon-dynamodb-sink-setup-connection).

For more information and examples to use with the Confluent Cloud API for Connect, see the [Confluent Cloud API for Connect Usage Examples](connect-api-section.md#ccloud-connect-api) section.

### Configuration

Note that configuration properties that are not shown in the Cloud Console use the default values.
For all property values and definitions, see [Configuration properties](#cc-amazon-dynamodb-cdc-source-config-properties).

1. Select the output record value format: Avro, JSON Schema, or Protobuf. A valid schema must be available in [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) to use a schema-based message format (for example, Avro, JSON Schema, or Protobuf). For additional information, see [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits).
2. In the **AWS DynamoDB API Endpoint** field, enter the AWS DynamoDB API endpoint.
3. Select a table sync mode from the **DynamoDB Table Sync Mode** dropdown list. Valid values are:
   - `CDC`: Perform CDC only.
   - `SNAPSHOT`: Perform a snapshot only.
   - `SNAPSHOT_CDC` (Default): The connector starts with a snapshot and then switches to CDC mode upon completion.
4. Select a table discovery mode from the **Table Discovery Mode** dropdown list. Valid values are:
   - `INCLUDELIST`: Capture the DynamoDB tables named in a comma-separated include list. The **Tables Include List** is required if `dynamodb.table.discovery.mode` is set to `INCLUDELIST`.
   - `TAG`: Use a semicolon-separated list of `key:value1,value2` pairs to create tag filters. For example, `key1:v1,v2;key2:v3,v4` includes all tags that match the `key1` key with a value of either `v1` or `v2`, and match `key2` with a value of either `v3` or `v4`. Any keys not specified are excluded.
5. In the **Tables Include List** field, enter a comma-separated list of DynamoDB table names to be captured. Note that this is required if `dynamodb.table.discovery.mode` is set to `INCLUDELIST`.

### **Show advanced configurations**

- **Schema context**: Select a schema context to use for this connector, if using a schema-based data format. This property defaults to the **Default** context, which configures the connector to use the default schema set up for Schema Registry in your Confluent Cloud environment. A schema context allows you to use separate schemas (like schema sub-registries) tied to topics in different Kafka clusters that share the same Schema Registry environment. For example, if you select a non-default context, a **Source** connector uses only that schema context to register a schema and a **Sink** connector uses only that schema context to read from. For more information about setting up a schema context, see [What are schema contexts and when should you use them?](../sr/faqs-cc.md#faq-schema-contexts).

**CDC Details**

- **Prefix for CDC Checkpointing table**: Prefix for CDC checkpointing tables; must be unique per connector. The checkpointing table stores the last processed record for each shard and is used to resume from that record if the connector restarts. This is applicable only in CDC mode.
- **CDC Checkpointing Table Billing Mode**: Define the billing mode for the internal checkpoint table created with CDC. Valid values are `PROVISIONED` and `PAY_PER_REQUEST`. Default is `PROVISIONED`. Use `PAY_PER_REQUEST` for unpredictable application traffic and on-demand billing mode. Use `PROVISIONED` for predictable application traffic and provisioned billing mode.
- **Max number of records per DynamoDB Streams poll**: The maximum number of records that can be returned in a single DynamoDB Streams `getRecords` operation. Only applicable in the CDC phase. Default value is `5000`.

**Snapshot Details**

- **Max records per Table Scan**: The maximum number of records that can be returned in a single DynamoDB read operation. Only applicable to the `SNAPSHOT` phase.
  Note that there is a 1 MB size limit as well.
- **Snapshot Table RCU consumption percentage**: Configure the percentage of table read capacity that is used as the maximum limit of the RCU consumption rate.

**DynamoDB Details**

- **Maximum batch size**: The maximum number of records the connector will wait for before publishing the data on the topic. The connector may still return fewer records if no additional records are available.
- **Poll linger milliseconds**: The maximum time to wait for a record before returning an empty batch. The default is 5 seconds.

**Processing position**

Define a specific offset position from which this connector begins processing data by clicking **Set offsets**. For more information on managing offsets, see [Manage Offsets for Fully-Managed Connectors in Confluent Cloud](offsets.md#connect-custom-offsets).

**Auto-restart policy**

- **Enable Connector Auto-restart**: Control the auto-restart behavior of the connector and its tasks in the event of user-actionable errors. Defaults to `true`, enabling the connector to automatically restart in case of user-actionable errors. Set this property to `false` to disable auto-restart for failed connectors. In such cases, you would need to manually restart the connector.

**Consumer configuration**

- **Max poll interval (ms)**: Set the maximum delay between subsequent consume requests to Kafka. Use this property to improve connector performance in cases when the connector cannot send records to the sink system. The default is 300,000 milliseconds (5 minutes).
- **Max poll records**: Set the maximum number of records to consume from Kafka in a single request. Use this property to improve connector performance in cases when the connector cannot send records to the sink system. The default is 500 records.

**Transforms**

- **Single Message Transforms**: To add a new SMT, see [Add transforms](single-message-transforms.md#cc-single-message-transforms-ui). For more information about unsupported SMTs, see [Unsupported transformations](single-message-transforms.md#cc-single-message-transforms-unsupported-transforms).

For all property values and definitions, see [Configuration properties](#cc-amazon-dynamodb-cdc-source-config-properties).

6. Click **Continue**.

## Features

The AWS Lambda Sink connector provides the following features:

* **Supports multiple Lambda functions**: The connector supports a single AWS Lambda function or multiple Lambda functions.
* **Provider integration support**: The connector supports IAM role-based authorization using Confluent Provider Integration. For more information about provider integration setup, see [IAM roles authentication](#cc-aws-lambda-sink-setup-connection).
* **Synchronous and asynchronous Lambda function invocation**: The AWS Lambda function can be invoked by this connector either synchronously or asynchronously.
* **At-least-once delivery**: The connector guarantees at-least-once processing semantics in synchronous mode. In asynchronous mode, at-least-once delivery is guaranteed, but it does not guarantee at-least-once processing by the AWS Lambda function. This is because AWS Lambda may drop async events if it cannot process them after a few retries. Under certain circumstances, a record may be processed more than once. You should design your AWS Lambda function to be [idempotent](https://aws.amazon.com/premiumsupport/knowledge-center/lambda-function-idempotent/).
If you have configured the connector to log the response from the Lambda function to a Kafka topic, the topic can contain duplicate records. You can enable Kafka log compaction on the topic to remove duplicate records. Alternatively, you can write a ksqlDB query to detect duplicate records in a time window. * **Supports multiple tasks**: The connector supports running one or more tasks. More tasks may improve performance. * **Results topics**: In synchronous mode, AWS Lambda results are stored in the `success-` and `error-` topics. * **Input Data Format with or without a Schema**: The connector supports input data from Kafka topics in Avro, JSON Schema (JSON_SR), Protobuf, JSON (schemaless), or Bytes format. [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format. #### NOTE If no schema is defined, values are encoded as plain strings. For example, `"name": "Kimberley Human"` is encoded as `name=Kimberley Human`. * **Backward compatibility**: The API for this connector is compatible with earlier versions. * **Supports AWS Lambda function versions and aliases**: The connector supports invoking specific AWS Lambda function versions or aliases by appending a colon and the desired version or alias to the function name (for example, `function:1` for a version or `function:alias` for an alias). For more information and examples to use with the Confluent Cloud API for Connect, see the [Confluent Cloud API for Connect Usage Examples](connect-api-section.md#ccloud-connect-api) section. # Tables and Topics in Confluent Cloud for Apache Flink Apache Flink® and the Table API use the concept of dynamic tables to facilitate the manipulation and processing of streaming data. Dynamic tables represent an abstraction for working with both batch and streaming data in a unified manner, offering a flexible and expressive way to define, modify, and query structured data. In contrast to the static tables that represent batch data, dynamic tables change over time. But like static batch tables, systems can execute queries over dynamic tables. Confluent Cloud for Apache Flink® implements ANSI-Standard SQL and has the familiar concepts of catalogs, databases, and tables. Confluent Cloud maps a Flink catalog to an environment and *vice-versa*. Similarly, Flink databases and tables are mapped to Apache Kafka® clusters and topics. For more information, see [Metadata mapping between Kafka cluster, topics, schemas, and Flink](../overview.md#ccloud-flink-overview-metadata-mapping). # Get Started with Confluent Cloud for Apache Flink Welcome to Confluent Cloud for Apache Flink®. This section guides you through the steps to get your queries running using the Confluent Cloud Console (browser-based) and the Flink SQL shell (CLI-based). If you’re currently using Confluent Cloud in a region that doesn’t yet support Flink, so you can’t use your data in existing Apache Kafka® topics, you can still try out Flink SQL by using sample data generators or the [Example catalog](../reference/example-data.md#flink-sql-example-data), which are used in the quick starts and [How-to Guides for Confluent Cloud for Apache Flink](../how-to-guides/overview.md#flink-sql-how-to-guides). 
Choose one of the following quick starts to get started with Flink SQL on Confluent Cloud:

- [Flink SQL Quick Start with Confluent Cloud Console](quick-start-cloud-console.md#flink-sql-quick-start-cloud-console)
- [Flink SQL Shell Quick Start](quick-start-shell.md#flink-sql-quick-start-shell)

Also, you can access Flink by using the [REST API](../operate-and-deploy/flink-rest-api.md#flink-rest-api) and the [Confluent Terraform Provider](../operate-and-deploy/deploy-flink-sql-statement.md#flink-deploy-sql-statement).

- [REST API-based data streams](https://github.com/confluentinc/demo-scene/tree/master/http-streaming)
- [Sample Project for Confluent Terraform Provider](https://registry.terraform.io/providers/confluentinc/confluent/latest/docs/guides/sample-project)

If you get stuck, have a question, or want to provide feedback or feature requests, don’t hesitate to reach out. Check out [Get Help with Confluent Cloud for Apache Flink](../get-help.md#ccloud-flink-help) for our support channels.

### Cloud

| Command | Description |
|---------|-------------|
| [confluent ai](confluent_ai.md#confluent-ai) | Start an interactive AI shell. |
| [confluent api-key](api-key/index.md#confluent-api-key) | Manage API keys. |
| [confluent asyncapi](asyncapi/index.md#confluent-asyncapi) | Manage AsyncAPI document tooling. |
| [confluent audit-log](audit-log/index.md#confluent-audit-log) | Manage audit log configuration. |
| [confluent billing](billing/index.md#confluent-billing) | Manage Confluent Cloud billing. |
| [confluent byok](byok/index.md#confluent-byok) | Manage your keys in Confluent Cloud. |
| [confluent ccpm](ccpm/index.md#confluent-ccpm) | Manage custom Connect plugin management (CCPM). |
| [confluent cloud-signup](confluent_cloud-signup.md#confluent-cloud-signup) | Sign up for Confluent Cloud. |
| [confluent completion](confluent_completion.md#confluent-completion) | Print shell completion code. |
| [confluent configuration](configuration/index.md#confluent-configuration) | Configure the Confluent CLI. |
| [confluent connect](connect/index.md#confluent-connect) | Manage Kafka Connect. |
| [confluent context](context/index.md#confluent-context) | Manage CLI configuration contexts. |
| [confluent environment](environment/index.md#confluent-environment) | Manage and select Confluent Cloud environments. |
| [confluent feedback](confluent_feedback.md#confluent-feedback) | Submit feedback for the Confluent CLI. |
| [confluent flink](flink/index.md#confluent-flink) | Manage Apache Flink. |
| [confluent iam](iam/index.md#confluent-iam) | Manage RBAC and IAM permissions. |
| [confluent kafka](kafka/index.md#confluent-kafka) | Manage Apache Kafka. |
| [confluent ksql](ksql/index.md#confluent-ksql) | Manage ksqlDB. |
| [confluent local](local/index.md#confluent-local) | Manage a local Confluent Platform development environment. |
| [confluent login](confluent_login.md#confluent-login) | Log in to Confluent Cloud or Confluent Platform. |
| [confluent logout](confluent_logout.md#confluent-logout) | Log out of Confluent Cloud. |
| [confluent network](network/index.md#confluent-network) | Manage Confluent Cloud networks. |
| [confluent organization](organization/index.md#confluent-organization) | Manage your Confluent Cloud organizations. |
| [confluent plugin](plugin/index.md#confluent-plugin) | Manage Confluent plugins. |
| [confluent prompt](confluent_prompt.md#confluent-prompt) | Add Confluent CLI context to your terminal prompt. |
| [confluent provider-integration](provider-integration/index.md#confluent-provider-integration) | Manage Confluent Cloud provider integrations. |
| [confluent schema-registry](schema-registry/index.md#confluent-schema-registry) | Manage Schema Registry. |
| [confluent service-quota](service-quota/index.md#confluent-service-quota) | Look up Confluent Cloud service quota limits. |
| [confluent shell](confluent_shell.md#confluent-shell) | Start an interactive shell. |
| [confluent stream-share](stream-share/index.md#confluent-stream-share) | Manage stream shares. |
| [confluent tableflow](tableflow/index.md#confluent-tableflow) | Manage Tableflow. |
| [confluent unified-stream-manager](unified-stream-manager/index.md#confluent-unified-stream-manager) | Manage Unified Stream Manager clusters. |
| [confluent update](confluent_update.md#confluent-update) | Update the Confluent CLI. |
| [confluent version](confluent_version.md#confluent-version) | Show version of the Confluent CLI. |

## Add a connector (non-RBAC environment)

Follow these steps to configure a source or sink connector by completing the applicable UI fields. You can also add a connector by [uploading a connector configuration file](#c3-upload-connector-config). For details about connector settings common and unique to source and sink connectors, see [Configuring Connectors](/kafka-connectors/self-managed/configuring.html).

This procedure is applicable to a non-RBAC workflow. For a role-based access control (RBAC) workflow, see [Add a connector (RBAC environment)](#c3-add-connector-rbac-workflow).

There are two steps (tabs) to complete in this workflow:

- Set up the connection.
- Test and verify.

**To add a connector in a non-RBAC environment**

1. Select a cluster from the navigation bar and click the **Connect** menu. The [All Connect Clusters page](#c3-all-connect-clusters-page) opens.

   ![image](images/c3-all-connect-clusters-page.png)

2. In the **Cluster name** column, click the **connect-default** link (or the link for your Connect cluster). The [Connectors page](#c3-connectors-page) opens.

   ![image](images/c3-no-connectors.png)

3. Click **Add connector**. The Browse page for selecting connectors opens. The connectors that initially appear in this page are [bundled](/kafka-connectors/self-managed/supported.html) with Confluent Platform. To narrow the available selections, select either **Sources** or **Sinks** from the **Filter by type** menu.

   ![image](images/c3-connect-select-connector.png)

4. Click the tile for the connector you want to configure. The **Add Connector** page opens to the **01 Setup Connection** tab. Use the shortcut panel to the right to navigate the list of configurations.

   ![image](images/c3-add-connector.png)

5. Complete the fields as appropriate for the connector. Required fields are indicated with an asterisk.

### Generalized GCS Source connector configuration

When configuring the [Generalized Google Cloud Storage Source connector](https://docs.confluent.io/kafka-connectors/gcs-source/current/generalized/overview.html), you won’t be able to add the `topic.regex.list` configuration parameter if the mode for the connector is set to `RESTORE_BACKUP`, which is the default mode. If you set the mode to `GENERIC`, you will see `topic.regex.list` listed as an option under **Kafka Topic Regex** in the **Topic** section.
For more details about each of these parameters, see the [Generalized GCS Source Connector Configuration Properties](https://docs.confluent.io/kafka-connectors/gcs-source/current/configuration_options.html#generalized-connector-parameters) page. 6. (Optional) If there are additional properties you need to add, click **Add a property**. The **Additional Properties** dialog opens for you to enter the property name. After entering the property name, enter the value for the property. ![image](images/c3-connector-add-property.png)![image](images/c3-connector-add-prop-modal.png)![image](images/c3-connector-addl-props.png) To delete a property, click the trash icon. You can undo the operation. 7. Click **Continue**. The **02 Test and verify** page opens. (If the 02 Security page opens, see [RBAC workflow](#c3-add-connector-rbac-workflow).) ![image](images/c3-connect-download-config-link.png) 8. (Optional) Click **Download connector config file**. See [Download a connector configuration file](#c3-download-connector-config) for details. 9. Review the information and click **Launch**. The information displayed is sent to the [Connect REST API](/platform/current/connect/references/restapi.html#connect-userguide-rest). - If the configuration was successful, the connector appears in the connectors table within the [Connectors page](#c3-connectors-page). Green bars indicate the connector is running. - If the configuration was unsuccessful, the **Status** column indicates Failed. Red bars indicate the connector is not running. In the **Name** column, click the link for the connector and edit the configuration fields. Repeat the process as necessary. #### NOTE - These are properties for the self-managed connector. If you are using Confluent Cloud, see [Google BigQuery Sink Connector for Confluent Cloud](/cloud/current/connectors/cc-gcp-bigquery-sink.html). - New tables and updated schemas take a few minutes to be detected by the Google Client Library. For more information see the Google Cloud [BigQuery API guide](https://cloud.google.com/bigquery/docs/error-messages#metadata-errors-for-streaming-inserts). `defaultDataset` : The default dataset to be used * Type: string * Importance: high #### NOTE `defaultDataset` replaced the `datasets` parameter of older versions of this connector. `project` : The BigQuery project to write to. * Type: string * Importance: high `topics` : A list of Kafka topics to read from. * Type: list * Importance: high `autoCreateTables` : Create BigQuery tables if they don’t already exist. This property should only be enabled for Schema Registry-based inputs: Avro, Protobuf, or JSON Schema (JSON_SR). Table creation is not supported for JSON input. * Type: boolean * Default: false * Importance: high `gcsBucketName` : The name of the bucket where Google Cloud Storage (GCS) blobs are located. These blobs are used to batch-load to BigQuery. This is applicable only if `enableBatchLoad` is configured. * Type: string * Default: “” * Importance: high `queueSize` : The maximum size (or -1 for no maximum size) of the worker queue for BigQuery write requests before all topics are paused. This is a soft limit; the size of the queue can go over this before topics are paused. All topics resume once a flush is triggered or the size of the queue drops under half of the maximum size. * Type: long * Default: -1 * Valid Values: [-1,…] * Importance: high `bigQueryRetry` : The number of retry attempts made for a BigQuery request that fails with a backend error or a quota exceeded error. 
* Type: int * Default: 0 * Valid Values: [0,…] * Importance: medium `bigQueryRetryWait` : The minimum amount of time, in milliseconds, to wait between retry attempts for a BigQuery backend or quota exceeded error. * Type: long * Default: 1000 * Valid Values: [0,…] * Importance: medium `bigQueryMessageTimePartitioning` : Whether or not to use the message time when inserting records. Default uses the connector processing time. * Type: boolean * Default: false * Importance: high `bigQueryPartitionDecorator` : Whether or not to append partition decorator to BigQuery table name when inserting records. Default is true. Setting this to true appends partition decorator to table name (e.g. table$yyyyMMdd depending on the configuration set for bigQueryPartitionDecorator). Setting this to false bypasses the logic to append the partition decorator and uses raw table name for inserts. * Type: boolean * Default: true * Importance: high `timestampPartitionFieldName` : The name of the field in the value that contains the timestamp to partition by in BigQuery and enable timestamp partitioning for each table. Leave this configuration blank, to enable ingestion time partitioning for each table. * Type: string * Default: null * Importance: low `clusteringPartitionFieldNames` : Comma-separated list of fields where data is clustered in BigQuery. * Type: list * Default: null * Importance: low `timePartitioningType` : The time partitioning type to use when creating tables. Existing tables will not be altered to use this partitioning type. * Type: string * Default: DAY * Valid Values: (case insensitive) [MONTH, YEAR, HOUR, DAY] * Importance: low `keySource` : Determines whether the keyfile configuration is the path to the credentials JSON file or to the JSON itself. Available values are `FILE` and `JSON`. This property is available in BigQuery sink connector version 1.3 (and later). * Type: string * Default: FILE * Importance: medium `keyfile` : `keyfile` can be either a string representation of the Google credentials file or the path to the Google credentials file itself. The string representation of the Google credentials file is supported in BigQuery sink connector version 1.3 (and later). * Type: string * Default: null * Importance: medium `sanitizeTopics` : Designates whether to automatically sanitize topic names before using them as table names. If not enabled, topic names are used as table names. * Type: boolean * Default: false * Importance: medium `schemaRetriever` : A class that can be used for automatically creating tables and/or updating schemas. Note that in version 2.0.0, SchemaRetriever API changed to retrieve the schema from each SinkRecord, which will help support multiple schemas per topic. `SchemaRegistrySchemaRetriever` has been removed as it retrieves schema based on the topic. * Type: class * Default: `com.wepay.kafka.connect.bigquery.retrieve.IdentitySchemaRetriever` * Importance: medium `threadPoolSize` : The size of the BigQuery write thread pool. This establishes the maximum number of concurrent writes to BigQuery. * Type: int * Default: 10 * Valid Values: [1,…] * Importance: medium `allBQFieldsNullable` : If true, no fields in any produced BigQuery schema are REQUIRED. All non-nullable Avro fields are translated as `NULLABLE` (or `REPEATED`, if arrays). * Type: boolean * Default: false * Importance: low `avroDataCacheSize` : The size of the cache to use when converting schemas from Avro to Kafka Connect. 
* Type: int * Default: 100 * Valid Values: [0,…] * Importance: low `batchLoadIntervalSec` : The interval, in seconds, in which to attempt to run GCS to BigQuery load jobs. Only relevant if `enableBatchLoad` is configured. * Type: int * Default: 120 * Importance: low `convertDoubleSpecialValues` : Designates whether +Infinity is converted to Double.MAX_VALUE and whether -Infinity and NaN are converted to Double.MIN_VALUE to ensure successful delivery to BigQuery. * Type: boolean * Default: false * Importance: low `enableBatchLoad` : **Beta Feature** Use with caution. The sublist of topics to be batch loaded through GCS. * Type: list * Default: “” * Importance: low `includeKafkaData` : Whether to include an extra block containing the Kafka source topic, offset, and partition information in the resulting BigQuery rows. * Type: boolean * Default: false * Importance: low `upsertEnabled` : Enable upsert functionality on the connector through the use of record keys, intermediate tables, and periodic merge flushes. Row-matching will be performed based on the contents of record keys. This feature won’t work with SMTs that change the name of the topic and doesn’t support JSON input. * Type: boolean * Default: false * Importance: low `deleteEnabled` : Enable delete functionality on the connector through the use of record keys, intermediate tables, and periodic merge flushes. A delete will be performed when a record with a null value (that is–a tombstone record) is read. This feature will not work with SMTs that change the name of the topic and doesn’t support JSON input. * Type: boolean * Default: false * Importance: low `intermediateTableSuffix` : A suffix that will be appended to the names of destination tables to create the names for the corresponding intermediate tables. Multiple intermediate tables may be created for a single destination table, but their names will always start with the name of the destination table, followed by this suffix, and possibly followed by an additional suffix. * Type: string * Default: “tmp” * Importance: low `mergeIntervalMs` : How often (in milliseconds) to perform a merge flush, if upsert/delete is enabled. Can be set to `-1` to disable periodic flushing. * Type: long * Default: 60_000L * Importance: low `mergeRecordsThreshold` : How many records to write to an intermediate table before performing a merge flush, if upsert/delete is enabled. Can be set to `-1` to disable record count-based flushing. * Type: long * Default: -1 * Importance: low `autoCreateBucket` : Whether to automatically create the given bucket, if it does not exist. * Type: boolean * Default: true * Importance: medium `allowNewBigQueryFields` : If true, new fields can be added to BigQuery tables during subsequent schema updates. * Type: boolean * Default: false * Importance: medium `allowBigQueryRequiredFieldRelaxation` : If true, fields in BigQuery Schema can be changed from `REQUIRED` to `NULLABLE`. Note that `allowNewBigQueryFields` and `allowBigQueryRequiredFieldRelaxation` replaced the `autoUpdateSchemas` parameter of older versions of this connector. * Type: boolean * Default: false * Importance: medium `allowSchemaUnionization` : If true, the existing table schema (if one is present) will be unionized with new record schemas during schema updates. If false, the record of the last schema in a batch will be used for any necessary table creation and schema update attempts. 
Note that setting `allowSchemaUnionization` to `false` and `allowNewBigQueryFields` and `allowBigQueryRequiredFieldRelaxation` to `true` is equivalent to setting `autoUpdateSchemas` to `true` in older (pre-2.0.0) versions of this connector. This should only be enabled for Schema Registry-based inputs: Avro, Protobuf, or JSON Schema (JSON_SR). Table schema updates are not supported for JSON input.

If you set `allowSchemaUnionization` to `false`, and `allowNewBigQueryFields` and `allowBigQueryRequiredFieldRelaxation` to `true`, then when BigQuery raises a schema validation exception or a table doesn’t exist while writing a batch, the connector attempts to remediate by relaxing required fields and/or adding new fields. If `allowSchemaUnionization`, `allowNewBigQueryFields`, and `allowBigQueryRequiredFieldRelaxation` are `true`, the connector will create or update tables with a schema whose fields are a union of the existing table schema’s fields and the ones present in all of the records of the current batch. The key difference is that with unionization disabled, new record schemas have to be a superset of the table schema in BigQuery.

In general, enabling `allowSchemaUnionization` is useful for keeping pipelines running across schema changes. For instance, if you’d like to remove fields from data upstream, the updated schemas still work in the connector. Similarly, it is useful when different tasks see records whose schemas contain different fields that are not in the table. However, note with caution that if `allowSchemaUnionization` is set and some bad records are in the topic, the BigQuery schema may be permanently changed. This presents two issues: first, since BigQuery doesn’t allow columns to be dropped from tables, the accidentally added columns add unnecessary noise to the schema. Second, since BigQuery doesn’t allow column types to be modified, those columns could completely break pipelines down the road, where well-behaved records have schemas whose field names overlap with the accidentally added columns in the table, but use a different type.

* Type: boolean
* Default: false
* Importance: medium

`kafkaDataFieldName` : The Kafka data field name. The default value is null, which means the Kafka Data field will not be included.

* Type: string
* Default: null
* Importance: low

`kafkaKeyFieldName` : The Kafka key field name. The default value is null, which means the Kafka Key field will not be included.

* Type: string
* Default: null
* Importance: low

`topic2TableMap` : Map of topics to tables (optional). Format: comma-separated tuples of the form `<topic>:<table>`, for example, `topic1:table1,topic2:table2,...`. Note that the topic name should not be modified using a regex SMT while using this option. Also note that `SANITIZE_TOPICS_CONFIG` would be ignored if this config is set. Lastly, if the topic2table map doesn’t contain the topic for a record, a table with the same name as the topic name is created.

* Type: string
* Default: “”
* Importance: low

# Hybrid Deployment of Confluent Platform and Confluent Cloud using Confluent for Kubernetes

Confluent for Kubernetes (CFK) provides cloud-native automation for deploying and managing Confluent in many hybrid scenarios. When you are deploying Confluent Platform components to be connected to Confluent Cloud, provide the basic configuration required in the Confluent Platform component CRs:

* The Confluent Cloud component endpoints
* The Confluent Cloud key and password in the format that each respective Confluent Cloud component requires for authentication credentials

See the example GitHub scenarios listed below for details.
There might be additional information required, such as TLS certificates, depending on your deployment settings. Refer to the following configuration examples of the hybrid deployment of Confluent Platform connecting to Confluent Cloud: * [CFK managed Connectors, ksqlDB, and REST Proxy against a Kafka and a Schema Registry in the Confluent Cloud](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/hybrid/ccloud-integration) * [CFK managed Connect cluster connected to Confluent Cloud, installing and managing the JDBC source connector plugin through the declarative Connector CR](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/hybrid/ccloud-connect-confluent-hub) * [CFK managed Replicator cloning topics from a source Confluent Cloud cluster to a destination Confluent Cloud cluster and CFK managed Control Center monitoring the end to end flow](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/hybrid/replicator-cloud2cloud) * [CFK managed Replicator cloning topics from a source Confluent Cloud cluster to a CFK managed destination cluster](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/hybrid/replicator-source-ccloud-destCFK-tls) * [Cluster Linked Confluent Platform to Confluent Cloud](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/hybrid/clusterlink) # Manage Confluent Platform with Confluent for Kubernetes * [Overview](co-manage-overview.md) * [Manage Flink](co-manage-flink.md) * [Manage Kafka Admin REST Class](co-manage-rest-api.md) * [Manage Kafka Topics](co-manage-topics.md) * [Manage Schemas](co-manage-schemas-index.md) * [Manage Connectors](co-manage-connectors.md) * [Scale Clusters](co-scale-cluster.md) * [Scale Storage](co-scale-storage.md) * [Link Kafka Clusters](co-link-clusters.md) * [Manage Security](co-manage-security.md) * [Restart Confluent Components](co-roll-cluster.md) * [Delete Confluent Deployment](co-delete-deployment.md) * [Manage Confluent Cloud](co-manage-ccloud.md) ## Confluent APIs The following links open the associated Confluent API docs. - [Confluent REST Proxy](../kafka-rest/index.md#kafkarest-intro) - [Connect REST API](../connect/references/restapi.md#connect-userguide-rest) - [Flink REST API](../flink/clients-api/rest.md#af-rest-api) - [Schema Registry API](../schema-registry/develop/api.md#schemaregistry-api) - [ksqlDB REST API](../ksqldb/developer-guide/ksqldb-rest-api/overview.md#ksqldb-rest-api) - [Metadata API](../security/authorization/rbac/mds-api.md#mds-api) ## Offset management After the consumer receives its assignment from the coordinator, it must determine the initial position for each assigned partition. When the group is first created, before any messages have been consumed, the position is set according to a configurable offset reset policy (`auto.offset.reset`). Typically, consumption starts either at the earliest offset or the latest offset. As a consumer in the group reads messages from the partitions assigned by the coordinator, it must commit the offsets corresponding to the messages it has read. If the consumer crashes or is shut down, its partitions will be re-assigned to another member, which will begin consumption from the last committed offset of each partition. If the consumer crashes before any offset has been committed, then the consumer which takes over its partitions will use the reset policy. The offset commit policy is crucial to providing the message delivery guarantees needed by your application. 
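To ground the discussion that follows, here is a minimal, hedged sketch of a consumer that subscribes to a topic, sets the offset reset policy, and commits the offsets of the records it reads. The broker address, group ID, and topic name are placeholder assumptions, and auto-commit is disabled so the commit is explicit:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CommittedOffsetsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("group.id", "offset-example");           // placeholder group ID
        props.put("enable.auto.commit", "false");
        // When the group has no committed offset for a partition, start from the
        // earliest available offset instead of the latest.
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic")); // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
                // Commit the offsets of the records returned above; after a crash or
                // rebalance, another group member resumes from the last committed offset.
                consumer.commitSync();
            }
        }
    }
}
```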
By default, the consumer is configured to use an automatic commit policy, which triggers a commit on a periodic interval. The consumer also supports a commit API which can be used for manual offset management. Correct offset management is crucial because it affects [delivery semantics](/platform/current/streams/concepts.html#streams-concepts-processing-guarantees). By default, the consumer is configured to auto-commit offsets. The `auto.commit.interval.ms` property sets the upper time bound of the commit interval. Using auto-commit offsets can give you “at-least-once” delivery, but you must consume all data returned from a `poll(Duration timeout)` call before any subsequent `poll` calls, or before closing the consumer. To explain further: when auto-commit is enabled, every time the `poll` method is called and data is fetched, the consumer is ready to automatically commit the offsets of messages that have been returned by the poll. If the processing of these messages is not completed before the next auto-commit interval, there’s a risk of losing progress on those messages if the consumer crashes or is otherwise restarted. In this case, when the consumer restarts, it will begin consuming from the last committed offset. When this happens, the last committed position can be as old as the auto-commit interval. Any messages that have arrived since the last commit are read again. If you want to reduce the window for duplicates, you can reduce the auto-commit interval, but some users may want even finer control over offsets. The consumer therefore supports a commit API which gives you full control over offsets. Note that when you use the commit API directly, you should first disable auto-commit in the configuration by setting the `enable.auto.commit` property to `false`. Each call to the commit API results in an offset commit request being sent to the broker. Using the synchronous API, the consumer is blocked until that request returns successfully. This may reduce overall throughput since the consumer might otherwise be able to process records while that commit is pending. One way to deal with this is to increase the amount of data that is returned when polling. The consumer has a configuration setting `fetch.min.bytes` which controls how much data is returned in each fetch. The broker will hold on to the fetch until enough data is available (or `fetch.max.wait.ms` expires). The tradeoff, however, is that this also increases the amount of duplicates that have to be dealt with in a worst-case failure. A second option is to use asynchronous commits. With asynchronous commits, instead of waiting for the request to complete, the consumer sends the request and returns immediately. So if it helps performance, why not always use asynchronous commits? The main reason is that the consumer does not retry the request if the commit fails. This is something that committing synchronously gives you for free; it will retry indefinitely until the commit succeeds or an unrecoverable error is encountered. The problem with asynchronous commits is dealing with commit ordering. By the time the consumer finds out that a commit has failed, you may already have processed the next batch of messages and even sent the next commit. In this case, a retry of the old commit could cause duplicate consumption. Instead of complicating the consumer internals to try and handle this problem in a sane way, the API gives you a callback which is invoked when the commit either succeeds or fails. 
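As a minimal sketch of that callback-based asynchronous commit (the broker address, group ID, and topic name are again illustrative, and `process` stands in for your own handling logic):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AsyncCommitExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("enable.auto.commit", "false");   // disable auto-commit before using the commit API
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("example-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record);
                }
                // Asynchronous commit: the poll loop is not blocked. Failed commits are
                // not retried; the callback only reports the outcome.
                consumer.commitAsync((offsets, exception) -> {
                    if (exception != null) {
                        System.err.println("Commit failed for " + offsets + ": " + exception);
                    }
                });
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // Placeholder for application-specific processing.
        System.out.printf("processed %s-%d@%d%n", record.topic(), record.partition(), record.offset());
    }
}
```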
If you like, you can use this callback to retry the commit, but you will have to deal with the same reordering problem. Offset commit failures are merely annoying if the following commits succeed since they won’t actually result in duplicate reads. However, if the last commit fails before a rebalance occurs or before the consumer is shut down, then offsets will be reset to the last commit and you will likely see duplicates. A common pattern is therefore to combine async commits in the poll loop with sync commits on rebalances or shutdown. Committing on close is straightforward, but you need a way to hook into rebalances. Each rebalance has two phases: partition revocation and partition assignment. The revocation method is always called before a rebalance and is the last chance to commit offsets before the partitions are re-assigned. The assignment method is always called after the rebalance and can be used to set the initial position of the assigned partitions. In this case, the revocation hook is used to commit the current offsets synchronously. In general, asynchronous commits should be considered less safe than synchronous commits. Consecutive commit failures before a crash will result in increased duplicate processing. You can mitigate this danger by adding logic to handle commit failures in the callback or by mixing occasional synchronous commits, but you shouldn’t add too much complexity unless testing shows it is necessary. If you need more reliability, synchronous commits are there for you, and you can still scale up by increasing the number of topic partitions and the number of consumers in the group. But if you just want to maximize throughput and you’re willing to accept some increase in the number of duplicates, then asynchronous commits may be a good option. A somewhat obvious point, but one that’s worth making, is that asynchronous commits only make sense for “at least once” message delivery. To get “at most once,” you need to know if the commit succeeded before consuming the message. This implies a synchronous commit unless you have the ability to “unread” a message after you find that the commit failed. The examples show several detailed uses of the commit API and discuss the tradeoffs in terms of performance and reliability. When writing to an external system, the consumer’s position must be coordinated with what is stored as output. That is why the consumer stores its offset in the same place as its output. For example, a connector populates data in HDFS along with the offsets of the data it reads so that it is guaranteed that either data and offsets are both updated, or neither is. A similar pattern is followed for many other data systems that require these stronger semantics, and for which the messages do not have a primary key to allow for deduplication. This is how Kafka supports [exactly-once processing](/platform/current/streams/concepts.html#streams-concepts-processing-guarantees) in Kafka Streams, and the transactional producer or consumer can be used generally to provide exactly-once delivery when transferring and processing data between Kafka topics. Otherwise, Kafka guarantees at-least-once delivery by default, and you can implement at-most-once delivery by disabling retries on the producer and committing offsets in the consumer prior to processing a batch of messages. ## Supported Confluent Platform features for Kafka clients The following table describes the client support for various Confluent Platform features. 
| Feature | C/C++ | Go | Java | .NET | Python |
|-------------------------------------|-------|-----|------|------|--------|
| Admin API | Yes | Yes | Yes | Yes | Yes |
| Control Center metrics integration | Yes | Yes | Yes | Yes | Yes |
| Custom partitioner | Yes | No | Yes | No | No |
| Exactly Once Semantics | Yes | Yes | Yes | Yes | Yes |
| Idempotent Producer | Yes | Yes | Yes | Yes | Yes |
| Kafka Streams | No | No | Yes | No | No |
| Record Headers | Yes | Yes | Yes | Yes | Yes |
| SASL Kerberos/GSSAPI | Yes | Yes | Yes | Yes | Yes |
| SASL PLAIN | Yes | Yes | Yes | Yes | Yes |
| SASL SCRAM | Yes | Yes | Yes | Yes | Yes |
| SASL OAUTHBEARER | Yes | Yes | Yes | Yes | Yes |
| Simplified installation | Yes | Yes | Yes | Yes | Yes |
| Schema Registry | Yes | Yes | Yes | Yes | Yes |
| Topic Metadata API | Yes | Yes | Yes | Yes | Yes |

### Enter connection details In the **Create a new connection** page, enter the details for your Kafka cluster. 1. In the **General** section, enter the name of the connection and select the type of connection. - **Connection Name:** An easy-to-remember name to display in the **Resources** view. - **Connection Type:** In the dropdown, choose one of the following values: - Apache Kafka® - Confluent Cloud - Confluent Platform - WarpStream - Other 2. In the **Kafka Cluster** section, enter the bootstrap server and authentication details. Confluent for VS Code supports authenticating to Kafka with most of the commonly used SASL authentication mechanisms. - **Bootstrap Server(s):** One or more `host:port` pairs to use for establishing the initial connection. For more than one server, use a comma-separated list. - **Authentication Type:** In the dropdown, choose one of the following values: - Username & Password (SASL/PLAIN) - API Credentials (SASL/PLAIN) - SASL/SCRAM (supports both `SCRAM-SHA-256` and `SCRAM-SHA-512`) - SASL/OAUTHBEARER - Kerberos (SASL/GSSAPI) #### NOTE To use Mutual TLS (mTLS) authentication, expand the **TLS Configuration** section and enter **Key Store** and **Trust Store** details. - **Verify Server Hostname:** Enable verification that the Kafka/Schema Registry host name matches the Distinguished Name (DN) in the broker’s certificate. - **Key Store Configuration:** Certificate used by Kafka/Schema Registry to authenticate the client. This is used to configure mutual TLS (mTLS) authentication. - **Path:** The path of the Key Store file. - **Password:** The store password for the Key Store file. Key Store password is not supported for PEM format. - **Key Password:** The password of the private key in the Key Store file. - **Type:** The file format of the Key Store file. Choose from PEM, PKCS12, or JKS. - **Trust Store Configuration:** Certificates for verifying SSL/TLS connections to Kafka/Schema Registry. This is required if Kafka/Schema Registry uses a self-signed or a non-public Certificate Authority (CA). - **Path:** The path of the Trust Store file. - **Password:** The password for the Trust Store file. If a password is not set, the configured Trust Store file is used, but integrity checking of the Trust Store file is disabled. Trust Store password is not supported for PEM format. - **Key Password:** The password of the private key in the Trust Store file. - **Type:** The file format of the Trust Store file. Choose from PEM, PKCS12, or JKS. Confluent Cloud uses TLS certificates from [Let’s Encrypt](https://letsencrypt.org/), a trusted Certificate Authority (CA). 
Confluent Cloud does not support self-managed certificates for TLS encryption. For more information, see [Manage TLS Certificates](/cloud/current/cp-component/clients-cloud-config.html#manage-tls-certificates). 3. In the **Schema Registry** section, enter the URL of the Schema Registry to use for serialization. 4. Click the **Authentication Type** dropdown and choose one of the following values: - Username & Password - API Credentials - OAuth To use mutual TLS (mTLS) authentication, expand the **TLS Configuration** section and enter **Key Store** and **Trust Store** details, as shown in the previous step. #### IMPORTANT **Breaking Change:** The Catalogs API has changed from CMF 2.0 to 2.1, since the SQL support is still in Open Preview. In the new version, catalogs and databases are separate resources. A catalog references a Schema Registry instance, and databases (which reference Kafka clusters) are created as separate resources within a catalog. If you’re migrating from CMF 2.0, see the [previous documentation](https://docs.confluent.io/platform/8.0/flink/overview.html) for reference. Note that when you upgrade to CMF 2.1, CMF will automatically migrate your existing catalog objects to the new catalogs and databases format. A core concept of SQL is the table. Tables store data, represented as rows. Users can query and modify the rows of a table by running SQL queries and Data Definition Language (DDL) statements. Most database systems store, manage, and process table data internally. In contrast, Flink SQL is solely a processing engine and not a data store. Flink accesses external data storage systems to read and write data. Catalogs and databases bridge the gap between the SQL engine and external data storage systems, enabling users to access and manipulate data stored in various formats and locations. Confluent Manager for Apache Flink® features built-in Kafka catalogs to connect to Kafka and Schema Registry. A Kafka Database exposes Kafka topics as tables and derives their schema from Schema Registry. **Catalogs and Databases:** A *catalog* contains one or more *databases*. A catalog references a Schema Registry instance, which is used to derive table schemas from topic schemas. Each catalog can have multiple databases. A *database* references a Kafka cluster and contains tables that correspond to the topics in that cluster. Each topic of a Kafka cluster is represented as a TABLE in the database. **Hierarchy:** - CATALOG → references a Schema Registry instance - DATABASE → references a Kafka cluster (contained within a catalog) - TABLE → corresponds to a Kafka topic (contained within a database) Catalogs are accessible from all CMF environments, but there are ways to restrict access to specific catalogs or databases. ### Management and monitoring features Confluent Platform provides several features to supplement Kafka’s Admin API and built-in JMX monitoring. - [Confluent Control Center](/control-center/current/overview.html), which is a web-based system for managing and monitoring Kafka. It allows you to easily manage Kafka Connect and to create, edit, and manage connections to other systems. Control Center also enables you to monitor data streams from producer to consumer, assuring that every message is delivered, and measuring how long it takes to deliver messages. Using Control Center, you can build a production data pipeline based on Kafka without writing a line of code. 
- [Health+](../health-plus/index.md#health-plus), also a web-based tool to help ensure the health of your clusters and minimize business disruption with intelligent alerts, monitoring, and proactive support. - [Metrics reporter](../monitor/metrics-reporter.md#metrics-reporter) for collecting various metrics from a Kafka cluster. The metrics are produced to a topic in a Kafka cluster. ### ksqlDB Creates the Physical Plan From the logical plan, the ksqlDB engine creates the physical plan, which is a Kafka Streams DSL application with a schema. The generated code is based on the ksqlDB classes, `SchemaKStream` and `SchemaKTable`: - A ksqlDB stream is rendered as a [SchemaKStream](https://github.com/confluentinc/ksql/blob/master/ksqldb-engine/src/main/java/io/confluent/ksql/structured/SchemaKStream.java) instance, which is a [KStream](https://docs.confluent.io/current/streams/javadocs/org/apache/kafka/streams/kstream/KStream.html) with a [Schema](/platform/current/connect/javadocs/javadoc/org/apache/kafka/connect/data/Schema.html). - A ksqlDB table is rendered as a [SchemaKTable](https://github.com/confluentinc/ksql/blob/master/ksqldb-engine/src/main/java/io/confluent/ksql/structured/SchemaKTable.java) instance, which is a [KTable](https://docs.confluent.io/current/streams/javadocs/org/apache/kafka/streams/kstream/KTable.html) with a [Schema](/platform/current/connect/javadocs/javadoc/org/apache/kafka/connect/data/Schema.html). - Schema awareness is provided by the [SchemaRegistryClient](https://github.com/confluentinc/schema-registry/blob/master/client/src/main/java/io/confluent/kafka/schemaregistry/client/SchemaRegistryClient.java) class. The ksqlDB engine traverses the nodes of the logical plan and emits corresponding Kafka Streams API calls: 1. Define the source – a `SchemaKStream` or `SchemaKTable` with info from the ksqlDB metastore 2. Filter – produces another `SchemaKStream` 3. Project – `select()` method 4. Apply aggregation – Multiple steps: `rekey()`, `groupby()`, and `aggregate()` methods. ksqlDB may re-partition data if it’s not keyed with a GROUP BY phrase. 5. Filter – `filter()` method 6. Project – `select()` method for the result ![Diagram showing how the ksqlDB engine creates a physical plan for a SQL statement](ksqldb/images/ksql-statement-physical-plan.gif) If the DML statement is CREATE STREAM AS SELECT or CREATE TABLE AS SELECT, the result from the generated Kafka Streams application is a persistent query that writes continuously to its output topic until the query is terminated. ## Clients This release updates client libraries with new authentication methods, improved error handling, and more flexible APIs. * [KIP-1139 OAuth support enhancements:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1139%3A+Add+support+for+OAuth+jwt-bearer+grant+type) Kafka clients now support the `jwt-bearer` grant type for OAuth, in addition to `client_credentials`. This grant type is supported by many identity providers and avoids the need to store secrets in client configuration files. * [KIP-877 Register metrics for plugins and connectors:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-877%3A+Mechanism+for+plugins+and+connectors+to+register+metrics) With KIP-877, your client-side plugins can implement the `Monitorable` interface to register their own metrics. Tags that identify the plugin are automatically injected and the metrics use the `kafka.CLIENT:type=plugins` naming convention, where `CLIENT` is either `producer`, `consumer`, or `admin`. 
* [KIP-1050 Consistent error handling for transactions:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1050%3A+Consistent+error+handling+for+Transactions) KIP-1050 groups all transactional errors into five distinct types to ensure consistent error handling across all client SDKs and Producer APIs. The five types are as follows: - Retriable: retry only - Retriable: refresh metadata and retry - Abortable - Application-Recoverable - Invalid-Configuration * [KIP-1092 Extend Consumer#close for Kafka Streams:](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=321719077) KIP-1092 adds a new `Consumer.close(CloseOptions)` method. This new method lets Kafka Streams control whether a consumer explicitly leaves its group on shutdown, which gives you finer control over rebalances. The `Consumer.close(Duration)` method is now deprecated. * [KIP-1142 List non-existent groups with dynamic configurations:](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1142%3A+Allow+to+list+non-existent+group+which+has+dynamic+config) KIP-1142 enables the `ListConfigResources` API to retrieve configurations for non-existent consumer groups that have dynamic configurations defined. * **UAMI support**: You can now configure your client to use an Azure User-Assigned Managed Identity (UAMI) to authenticate with an IdP like Microsoft Entra ID. This feature uses Azure’s native identity management to fetch tokens automatically, which eliminates the need to manage static client IDs and secrets. If you are a Java client user, import the latest `kafka-client-plugin` Maven artifact. If you use Confluent-provided non-Java clients, you can use this feature with the latest version of your client. For more information, see [Configure Azure User Assigned Managed Identity OAuth for Confluent Platform](../security/authentication/sasl/oauthbearer/uami.md#oauth-uami-share). #### Next Steps - If you are extending to cloud hybrid with continuous migration from a primary self-managed cluster to the cloud, the migration is complete. - If this is a one-time migration to Confluent Cloud, the next step is to use [Replicator](../../multi-dc-deployments/replicator/replicator-quickstart.md#replicator-quickstart) to migrate your topics to the cloud cluster. - For information on how to manage schemas and storage space in Confluent Cloud through the REST API, see [Manage Schemas in Confluent Cloud](/cloud/current/sr/index.html). - Looking for a guide on how to configure and use Schema Registry in Confluent Cloud? See [Quick Start for Schema Management on Confluent Cloud](/cloud/current/get-started/schema-registry.html) and [Quick Start for Apache Kafka using Confluent Cloud](/cloud/current/get-started/index.html). ### How it Works When a client communicates to the Schema Registry HTTPS endpoint, Schema Registry passes the client credentials to Metadata Service (MDS) for authentication. MDS is a REST layer on the Kafka broker within Confluent Server, and it integrates with LDAP to authenticate end users on behalf of Schema Registry and other Confluent Platform services such as Connect, Confluent Control Center, and ksqlDB. As shown in [Scripted Confluent Platform Demo](../../tutorials/cp-demo/index.md#cp-demo), clients must have predefined LDAP entries. Once a client is authenticated, you must enforce that only authorized entities have access to the permitted resources. 
You can use [ACLs](../../confluent-security-plugins/schema-registry/authorization/sracl_authorizer.md#confluentsecurityplugins-sracl-authorizer), RBAC, or both to do so. While ACLs and RBAC can be used together or independently, RBAC is the preferred solution as it provides finer-grained authorization and a unified method for managing access across Confluent Platform. The combined authentication and authorization workflow for a Kafka client connecting to Schema Registry is shown in the diagram below. ![image](images/sr-rbac-rest-api-request.png) # Use Multi-Protocol Authentication in Confluent Platform Confluent Platform clusters support multi-protocol authentication, allowing two or more authentication protocols to be configured and used simultaneously for Confluent Platform services in your cluster. Kafka supports SASL/PLAIN, SASL/PLAIN with LDAP server, SASL/OAUTHBEARER (OAuth 2.0), SASL/GSSAPI (Kerberos), SASL/SCRAM, and mutual TLS (mTLS). All other Confluent Platform services support only OAuth 2.0, mutual TLS, and HTTP Basic Authentication. Typically, one authentication protocol is used for a given Confluent Platform service in a deployment. However, there are scenarios where supporting multiple protocols simultaneously can be beneficial. You can use Confluent Platform services to support multiple protocols for authenticating incoming client requests in certain scenarios. With multi-protocol authentication, a Confluent Platform service can be configured to authenticate clients that use OAuth 2.0, mTLS certificates, or HTTP Basic Authentication. You are not constrained to use only one protocol. * **Transitioning from HTTP Basic Authentication to OAuth 2.0**: If an application is migrating from HTTP Basic Authentication to OAuth 2.0, supporting both temporarily allows a smooth transition for clients. Existing clients can continue using HTTP Basic Authentication while new clients adopt OAuth 2.0. * **Supporting diverse clients**: Some legacy clients may only support HTTP Basic Authentication, while newer clients can leverage OAuth 2.0. Supporting both ensures the application is accessible to a wider range of clients. * **Providing options**: Offering a choice between simpler HTTP Basic Authentication for internal or trusted clients and more secure OAuth 2.0 for external clients can be appropriate in some architectures. * **Different internal and external communication**: If a Confluent Platform cluster needs to authenticate with mutual TLS for all internal platform communications (for example, Schema Registry to Confluent Server, Connect to Schema Registry, Confluent Server to Schema Registry, or Connect to Confluent Server) and use OAuth 2.0 to authenticate client applications to CP services, you can configure multi-protocol authentication to support your requirements. However, there are important considerations and best practices to keep in mind: * HTTP Basic Authentication should only be used over HTTPS to protect credentials. Because credentials are sent with every request, it is less secure than OAuth 2.0. * OAuth 2.0 is the recommended modern standard for authorization. It provides better security through tokens and allows for [granular access control](../../../_glossary.md#term-granularity). * The implementation must be carefully designed to avoid conflicts between the two authentication methods. The order of authentication filters and proper configuration is critical. 
* Long-term, fully migrating to OAuth 2.0 and phasing out HTTP Basic Authentication lets you reduce complexity and improve overall security. In summary, while supporting multiple authentication methods simultaneously can be justified in some cases, it adds complexity. The recommendation is to prefer OAuth 2.0 as the more modern, secure protocol and only introduce HTTP Basic Authentication support thoughtfully for legacy compatibility when necessary. A clear roadmap to eventually standardize on OAuth 2.0 should be part of the plan. ## Registering clusters You can use either [curl commands](#register-clusters-curl) or the Confluent Platform [CLI](https://docs.confluent.io/confluent-cli/current/command-reference/cluster/confluent_cluster_register.html) to register clusters. When registering a Confluent Platform cluster in the cluster registry, you must specify the following information: Cluster name : The new name of the Confluent Platform cluster to be used in RBAC role bindings and centralized audit logs. Cluster ID : Refer to [View a cluster ID](authorization/rbac/rbac-get-cluster-ids.md#view-cluster-ids) if you need to locate the cluster ID. Host name and port number : The host and ports defined for a cluster should only include ports that support [RBAC token authentication](authentication/sasl/oauthbearer/overview.md#rbac-token-auth). For example, in Confluent Platform clusters, do not use the interbroker port or external Kerberos or mTLS ports. This is most important when using the [Confluent Metadata API Reference for Confluent Platform](authorization/rbac/mds-api.md#mds-api) because it leverages port information when pushing configuration updates out to known Confluent Platform clusters. Protocol used by the hosts and ports : The protocol should be SASL_SSL for Confluent Platform clusters (or SASL_PLAINTEXT for non-production Confluent Platform clusters), and HTTP or HTTPS for Connect, ksqlDB, and Schema Registry clusters. Be sure to grant the appropriate [RBAC roles](authorization/rbac/rbac-predefined-roles.md#rbac-predefined-roles) (ClusterAdmin and SystemAdmin) on newly registered clusters so that users can access and use them in other configurations. Also be sure to grant the AuditAdmin role to principals who will be administering the centralized audit log configuration. For details about granting roles on registered clusters, see [Configuring role bindings for registered clusters](#cluster-registry-rolebinding). # Configure MDS to Manage Centralized Audit Logs You can use [Centralized audit logging](audit-logs-cli-config.md#audit-log-cli-config) to dynamically update an audit log configuration. Changes made through the Confluent CLI are pushed from the MDS (metadata service) out to all [registered clusters](../../cluster-registry.md#cluster-registry), allowing for centralized management of the audit log configuration and assurance that all registered clusters publish their audit log events to the same destination Kafka cluster. The MDS uses an admin client to connect to the destination Kafka cluster and inspect, create, and alter destination topics in response to certain API requests. Before you can use centralized audit logging, you must configure one of your Kafka clusters to run the metadata service (MDS), which provides API endpoints to register a list of the Kafka clusters in your organization and to centrally manage the audit log configurations of those clusters. 
This audit log configuration API pushes out to all registered clusters the rules governing which events are captured and where they are sent. It also creates missing destination topics, and keeps the retention time policies of the destination topics in sync with the audit log configuration policy. Until configured otherwise, the MDS operates on the assumption that it is a lone cluster, using its own internal admin client to configure destination topics on itself, and leaving the bootstrap servers unspecified so that audit log destination topics are on the same cluster. Because the default behavior requires less configuration, it is useful for the initial setup in a development environment. In a production setting, you should have all of the Kafka clusters publish their audit logs to a single, central destination cluster. The configuration file for each managed cluster must include the destination cluster’s connection and credential information. However, it should disable auto-creation of destination topics, and leave `confluent.security.event.router.config` unspecified. The following sections explain how to configure Kafka clusters and the MDS to manage centralized audit logs. ## Security deployment profiles The following table defines the different security options that make up a security deployment profile for Confluent Platform clusters. The security options that make up a deployment profile are: - **Authentication** - **Kafka client**: Options for Kafka clients authenticating to Confluent Server brokers - **Kafka client to non-Kafka component**: Options for Kafka clients authenticating to Schema Registry, REST Proxy, and ksqlDB - **Service-to-service**: Options for authentication between any two Confluent Platform services, for example Schema Registry to Confluent Server, Connect to Schema Registry, and so forth - **User**: Options for users authenticating using Confluent Control Center or Confluent CLI - **Authorization**: Options for controlling access to resources in your Confluent Platform cluster - **Encryption**: Options for encrypting data in motion (or data in transit) - **Identity provider protocols**: Options for integrating with external identity providers
| Profile | Kafka client authentication | Client to non-Kafka component authentication | Service-to-service authentication | User authentication | Authorization | Encryption | Identity provider protocols |
|---|---|---|---|---|---|---|---|
| 1 | mTLS or SASL with one of: PLAIN, GSSAPI, SCRAM | mTLS | mTLS | HTTP Basic Authentication | ACLs | TLS | |
| 2 | mTLS or SASL with one of: GSSAPI, SCRAM | GSSAPI or SCRAM | mTLS | HTTP Basic Authentication | ACLs | TLS | |
| 3 | mTLS or SASL with one of: PLAIN, PLAIN with LDAP server, GSSAPI, SCRAM | HTTP Basic Authentication | OAuthBearer (powered by LDAP and MDS-issued tokens) | HTTP Basic Authentication | RBAC | TLS | LDAP |
| 4 | mTLS or SASL with one of: PLAIN, PLAIN with LDAP server, GSSAPI, SCRAM | HTTP Basic Authentication | OAuthBearer (powered by LDAP and MDS-issued tokens) | OIDC (SSO) | RBAC | TLS | Both OIDC and LDAP |
| 5 | mTLS or SASL with one of: OAUTHBEARER with IdP-issued tokens, PLAIN, PLAIN with LDAP server, GSSAPI, SCRAM | HTTP Basic Authentication | OAuthBearer with IdP-issued tokens | OIDC (SSO) | RBAC | TLS | Both OIDC and LDAP |
| 6 | mTLS or SASL with one of: OAUTHBEARER with IdP-issued tokens, PLAIN, GSSAPI, SCRAM | mTLS | mTLS | HTTP Basic Authentication | ACLs | TLS | OIDC |
| 7 | mTLS or SASL with one of: OAUTHBEARER with IdP-issued tokens, PLAIN, PLAIN with LDAP server, GSSAPI, SCRAM | OAuthBearer with IdP-issued tokens or HTTP Basic Authentication | OAuthBearer with IdP-issued tokens | OIDC (SSO) | RBAC | TLS | Both OIDC and LDAP |
| 8 | mTLS or SASL with one of: OAUTHBEARER with IdP-issued tokens, PLAIN, GSSAPI, SCRAM | mTLS or OAuthBearer with IdP-issued tokens | mTLS | HTTP Basic Authentication | ACLs | TLS | OIDC |
| 9 | mTLS or SASL with one of: OAUTHBEARER with IdP-issued tokens, PLAIN, GSSAPI, SCRAM | OAuthBearer with IdP-issued tokens | OAuth | OIDC (SSO) | RBAC | TLS | OIDC |
| 10 | mTLS or SASL with one of: OAUTHBEARER with IdP-issued tokens, PLAIN, GSSAPI, SCRAM | OAuthBearer with IdP-issued tokens | mTLS | OIDC (SSO) | RBAC | TLS | OIDC |
| 11 | mTLS or SASL with one of: OAUTHBEARER with IdP-issued tokens, PLAIN, GSSAPI, SCRAM | mTLS | mTLS | OIDC (SSO) | RBAC | TLS | OIDC |
| 12 | mTLS or SASL with one of: OAUTHBEARER with IdP-issued tokens, PLAIN, GSSAPI, SCRAM | mTLS | mTLS | Basic username, password (file-based user identity management) | RBAC | TLS | Not applicable |
You can also deploy Apache Flink® within Confluent Platform. Deployments that include Flink use mTLS authentication, with one of the following for client to non-Kafka component authentication: PLAIN, GSSAPI, or SCRAM. Service-to-service and user authentication also use mTLS in Apache Flink® deployments. Authorization uses ACLs together with HTTP Basic Authentication, and TLS is the supported encryption protocol for deployments that use Flink. ## Background and context Here we recap some fundamental building blocks that will be useful for the rest of this section. This is mostly a summary of aspects of the Kafka Streams [architecture](architecture.md#streams-architecture) that impact performance. You might want to refresh your understanding of sizing just for Kafka first, by revisiting notes on [Production Deployment](../kafka/deployment.md#cp-production-recommendations) and how to [choose the number of topics/partitions](https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/). **Kafka Streams uses Kafka’s producer and consumer APIs**: under the hood a Kafka Streams application has Kafka producers and consumers, just like a typical Kafka client. So when it comes time to add more Kafka Streams instances, think of that as adding more producers and consumers to your app. **Unit of parallelism is a task**: In Kafka Streams the basic unit of parallelism is a stream task. Think of a task as consuming from a single Kafka partition per topic and then processing those records through a graph of processor nodes. If the processing is stateful, then the task writes to state stores and produces back to one or more Kafka partitions. To improve the potential parallelism, there is just one tuning knob: choose a higher number of [partitions](https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/) for your topics. That will automatically lead to a proportional increase in the number of tasks. **Task placement matters**: Increasing the number of partitions/tasks increases the potential for parallelism, but we must still decide where to place those tasks physically. There are two options. The first is to scale up by putting all the tasks on a single server. This is useful when the app is CPU bound and one server has a lot of CPUs. You can do this by having an app with lots of threads (the `num.stream.threads` config option, with a default of 1) or, equivalently, by running clones of the app on the same machine, each with one thread. There should not be any performance difference between the two. The second option is to scale out by spreading the tasks across more than one machine. This is useful when the app is network, memory, or disk bound, or if a single server has a limited number of CPU cores. **Load balancing is automatic**: Once you decide how many partitions you need and where to start Kafka Streams instances, the rest is automatic. The load is balanced across the tasks with no user involvement because of the consumer group management [feature](https://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0-9-consumer-client/) that is part of Kafka. Kafka Streams benefits from it because, as mentioned earlier, it is a client of Kafka too in this context. Armed with this information, let’s look at a couple of key scenarios. 
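As a minimal sketch of the scale-up option, the following Kafka Streams snippet raises `num.stream.threads` so that more tasks run in parallel on one machine; the application ID, broker address, and topic names are illustrative, and the topology is deliberately trivial:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class ScaleUpExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "scale-up-example");   // illustrative application ID
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // illustrative broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // Scale up: run more stream threads in this single instance so that more
        // tasks (one per input partition) can be processed in parallel.
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4);

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");        // illustrative topics
        input.to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Scaling out is the same code deployed on additional machines with the same application ID; the consumer group management described above redistributes the tasks automatically.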
## Configuration prerequisites Before you configure a hybrid setup, ensure the following: * Your Confluent Cloud organization has the [Advanced Governance](https://docs.confluent.io/cloud/current/stream-governance/packages.html#stream-gov-packages) package. * You have [registered at least one Confluent Platform cluster with Confluent Cloud](https://docs.confluent.io/cloud/current/usm/register/overview.html) so it is visible in the Confluent Cloud UI. * Your Schema Registry on Confluent Platform has its mode set to `READWRITE` at the global-level `:__GLOBAL:` context. Ensure no mode overrides are set at the subject or custom context level. You can remove any specific overrides by using the `DELETE` mode API against all relevant subjects and custom contexts. * You can perform this configuration using Confluent for Kubernetes, Ansible Playbooks for Confluent Platform, or by making calls directly to the Confluent REST API. The subsequent workflows provide separate instructions for each approach. * You have the necessary credentials and endpoints for both your Confluent Platform and Confluent Cloud Schema Registry instances. You can obtain the private endpoints by logging in to Confluent Cloud Console and navigating to **Environment** > **Schema Registry** > **Endpoints**. * Create an API key that uses a service account assigned the `DataSteward` role. This ensures the service account has the permissions that it needs to manage schemas and enforce governance policies across both Confluent Platform and Confluent Cloud environments. For instructions, see [Add an API key](https://docs.confluent.io/cloud/current/security/authenticate/workload-identities/service-accounts/api-keys/manage-api-keys.html#add-an-api-key). #### NOTE Before exporting schemas from Confluent Platform to Confluent Cloud, you must run the [schema-compatibility-check script](https://github.com/confluentinc/schema-registry/blob/master/bin/schema-compatibility-check) if the target Confluent Cloud environment already contains existing schemas. This script verifies that the schemas in both environments are compatible. This export process is only recommended when your Confluent Cloud and Confluent Platform environments are replicas, or when your Confluent Cloud environment is a direct subset of your Confluent Platform environment. ## Enable RBAC with parallel restart of Confluent Platform 1. Set the following and provide the required properties for RBAC in your hosts inventory file: ```yaml rbac_enabled: true ``` For a list of all the RBAC-related properties, refer to [Role-based access control](ansible-authorize.md#ansible-authz-rbac). 
Below is an example snippet:

```yaml
all:
  vars:
    ssl_enabled: true
    rbac_enabled: true
    mds_ssl_client_authentication: required

    # super user credentials for bootstrapping RBAC within Confluent Platform
    mds_super_user: mds
    mds_super_user_password: password

    # LDAP users for Confluent Platform components
    kafka_broker_ldap_user: kafka_broker
    kafka_broker_ldap_password: password
    schema_registry_ldap_user: schema_registry
    schema_registry_ldap_password: password
    kafka_connect_ldap_user: connect_worker
    kafka_connect_ldap_password: password
    ksql_ldap_user: ksql
    ksql_ldap_password: password
    kafka_rest_ldap_user: rest_proxy
    kafka_rest_ldap_password: password
    control_center_next_gen_ldap_user: control_center
    control_center_next_gen_ldap_password: password

kafka_broker:
  vars:
    kafka_broker_custom_properties:
      ldap.java.naming.factory.initial: com.sun.jndi.ldap.LdapCtxFactory
      ldap.com.sun.jndi.ldap.read.timeout: 3000
      ldap.java.naming.provider.url: ldap://ldap1:389
      ldap.java.naming.security.principal: uid=mds,OU=rbac,DC=example,DC=com
      ldap.java.naming.security.credentials: password
      ldap.java.naming.security.authentication: simple
      ldap.user.search.base: OU=rbac,DC=example,DC=com
      ldap.group.search.base: OU=rbac,DC=example,DC=com
      ldap.user.name.attribute: uid
      ldap.user.memberof.attribute.pattern: CN=(.*),OU=rbac,DC=example,DC=com
      ldap.group.name.attribute: cn
      ldap.group.member.attribute.pattern: CN=(.*),OU=rbac,DC=example,DC=com
      ldap.user.object.class: account
```

2. Run the `confluent.platform.all` playbook against your hosts inventory file (shown here as `hosts.yml`):

```bash
ansible-playbook -i hosts.yml confluent.platform.all \
  --skip-tags package \
  -e deployment_strategy=parallel
```

Include the `--skip-tags package` option to skip the package installation tasks and to ensure no upgrade happens. The option also speeds up the reconfiguration process.

### Produce Records

1. Build the producer and consumer binaries:

```bash
cargo build
```

You should see:

```text
Compiling rust_kafka_client_example v0.1.0 (/path/to/repo/examples/clients/cloud/rust)
Finished dev [unoptimized + debuginfo] target(s) in 2.85s
```

2. Run the producer, passing in arguments for: the local file with configuration parameters to connect to your Kafka cluster, and the topic name.

```bash
./target/debug/producer --config $HOME/.confluent/librdkafka.config --topic test1
```

3. Verify the producer sent all the messages. You should see:

```text
Preparing to produce record: alice 0
Preparing to produce record: alice 1
Preparing to produce record: alice 2
Preparing to produce record: alice 3
Preparing to produce record: alice 4
Preparing to produce record: alice 5
Preparing to produce record: alice 6
Preparing to produce record: alice 7
Preparing to produce record: alice 8
Successfully produced record to topic test1 partition [5] @ offset 117
Successfully produced record to topic test1 partition [5] @ offset 118
Successfully produced record to topic test1 partition [5] @ offset 119
Successfully produced record to topic test1 partition [5] @ offset 120
Successfully produced record to topic test1 partition [5] @ offset 121
Successfully produced record to topic test1 partition [5] @ offset 122
Successfully produced record to topic test1 partition [5] @ offset 123
Successfully produced record to topic test1 partition [5] @ offset 124
Successfully produced record to topic test1 partition [5] @ offset 125
```

4. View the [producer code](https://github.com/confluentinc/examples/tree/latest/clients/cloud/rust/src/producer.rs). 
#### Oracle XStream CDC Source connector The [Source connector service account](#cloud-service-account-source-connectors) section provides basic ACL entries for source connector service accounts. The Oracle XStream CDC Source connector requires additional ACL entries. Add the following ACL entries for the Oracle XStream CDC Source connector: * ACLs to create and write to change event topics prefixed with ``. Use the following commands to set these ACLs: ```none confluent kafka acl create --allow --service-account "" \ --operations create --prefix --topic "" ``` ```none confluent kafka acl create --allow --service-account "" \ --operations write --prefix --topic "" ``` * ACLs to describe configurations at the cluster scope level. Use the following commands to set these ACLs: ```none confluent kafka acl create --allow --service-account "" \ --cluster-scope --operations describe ``` ```none confluent kafka acl create --allow --service-account "" \ --cluster-scope --operations describe_configs ``` * ACLs to read, create, and write to schema history topics prefixed with `__orcl-schema-changes..lcc-`. Use the following commands to set these ACLs: ```none confluent kafka acl create --allow --service-account "" \ --operations read --prefix --topic "__orcl-schema-changes..lcc-" ``` ```none confluent kafka acl create --allow --service-account "" \ --operations create --prefix --topic "__orcl-schema-changes..lcc-" ``` ```none confluent kafka acl create --allow --service-account "" \ --operations write --prefix --topic "__orcl-schema-changes..lcc-" ``` * ACLs to read the schema history consumer group named `-schemahistory`. Use the following command to set this ACL: ```none confluent kafka acl create --allow --service-account "" \ --operations "read" --consumer-group "-schemahistory" ``` The following additional ACL entries are required if heartbeats are enabled for the connector using the `heartbeat.interval.ms` configuration property. * ACLs to read, create, and write to heartbeat topics prefixed with `__orcl-heartbeat.lcc-`. Use the following commands to set these ACLs: ```none confluent kafka acl create --allow --service-account "" \ --operations read --prefix --topic "__orcl-heartbeat.lcc-" ``` ```none confluent kafka acl create --allow --service-account "" \ --operations create --prefix --topic "__orcl-heartbeat.lcc-" ``` ```none confluent kafka acl create --allow --service-account "" \ --operations write --prefix --topic "__orcl-heartbeat.lcc-" ``` The following additional ACL entries are required if signaling using a Kafka topic is enabled and configured for the connector using the `signal.enabled.channels` and `signal.kafka.topic` configuration properties. * ACLs to read from the signaling topic. Use the following command to set this ACL: ```none confluent kafka acl create --allow --service-account "" \ --operations read --topic "" ``` * ACLs to read the Kafka signaling consumer group named `kafka-signal`. Use the following command to set this ACL: ```none confluent kafka acl create --allow --service-account "" \ --operations "read" --consumer-group "kafka-signal" ``` ## Generate the Delta Configurations 1. Run the script, passing in the configuration file `/tmp/myconfig.properties` you defined above. Reminder: you cannot use the `~/.ccloud/config.json` generated by the Confluent Cloud CLI for other Confluent Platform components or clients, which is why you need to manually create your own key=value properties file in the previous section. 
```bash ./ccloud-generate-cp-configs.sh /tmp/myconfig.properties ``` 2. Verify that your output resembles: ```bash Confluent Platform Components: delta_configs/schema-registry-ccloud.delta delta_configs/replicator-to-ccloud-producer.delta delta_configs/ksql-server-ccloud.delta delta_configs/ksql-datagen.delta delta_configs/control-center-ccloud.delta delta_configs/connect-ccloud.delta delta_configs/connector-ccloud.delta delta_configs/ak-tools-ccloud.delta Kafka Clients: delta_configs/java_producer_consumer.delta delta_configs/java_streams.delta delta_configs/python.delta delta_configs/dotnet.delta delta_configs/go.delta delta_configs/node.delta delta_configs/cpp.delta delta_configs/env.delta ``` 3. Add the delta configuration output to the respective component’s properties file. Remember that these are the *delta* configurations, not the complete configurations. ## Step 2: Apply the Deduplicate Topic action In the previous step, you created a Flink table that had duplicate rows. In this step, you apply the Deduplicate Topic action to create an output table that has only unique rows. 1. In the navigation menu, click **Data portal**. 2. In the **Data portal** page, click the **Environment** dropdown menu and select the environment for your workspace. 3. In the **Recently created** section, find your **users** topic and click it to open the details pane. 4. Click **Actions**, and in the Actions list, click **Deduplicate topic** to open the **Deduplicate topic** dialog. 5. In the **Fields to deduplicate** dropdown, select **user_id**. Flink uses the deduplication field as the output message key. This means that the output topic’s row key may be different from the input topic’s row key, because the deduplication statement’s DISTRIBUTED BY clause determines the output topic’s key. For this example, the output message key is the `user_id` field. 6. In the **Compute pool** dropdown, select the compute pool you want to use. 7. (Optional) In the **Runtime configuration** section, select **Run with a service account** to run the deduplicate query with a service account principal. Use this option for production queries. #### NOTE The service account you select must have the DeveloperManage and DeveloperWrite roles to create topics, schemas, and run Flink statements. For more information, see [Grant Role-Based Access](../operate-and-deploy/flink-rbac.md#flink-rbac). 8. Click the **Show SQL** toggle to view the statement that the action will run. For this example, the deduplication query depends on the `registertime` field, so you must modify the generated statement to use the `registertime` field as the field to sort on. 9. Click **Open SQL editor** to modify the statement. A Flink workspace opens with the generated statement in the cell. 10. In the cell, replace `$rowtime` with `registertime` in the `ORDER BY` clause. ```sql CREATE TABLE ``.``.`users_deduplicate` ( PRIMARY KEY (`user_id`) NOT ENFORCED ) DISTRIBUTED BY HASH( `user_id` ) WITH ( 'changelog.mode' = 'upsert', 'value.format'='avro-registry', 'key.format'='avro-registry' ) AS SELECT `user_id`, `registertime`, `gender`, `regionid` FROM ( SELECT *, ROW_NUMBER() OVER (PARTITION BY `user_id` ORDER BY registertime ASC) AS row_num FROM ``.``.`users`) WHERE row_num = 1; ``` 11. Click **Run** to execute the deduplication query. The CREATE TABLE AS SELECT statement creates the `users_deduplicate` table and populates it with rows from the `users` table using a [deduplication query](../reference/queries/deduplication.md#flink-sql-deduplication). 12. 
When the **Statement status** changes to **Running**, you can query the `users_deduplicate` table. ### On-Premises ```none --role string REQUIRED: Role name of the new role binding. --principal string REQUIRED: Principal type and identifier using "Prefix:ID" format. --kafka-cluster string Kafka cluster ID for the role binding. --schema-registry-cluster string Schema Registry cluster ID for the role binding. --ksql-cluster string ksqlDB cluster ID for the role binding. --connect-cluster string Kafka Connect cluster ID for the role binding. --cmf string Confluent Managed Flink (CMF) ID, which specifies the CMF scope. --flink-environment string Flink environment ID, which specifies the Flink environment scope. --cluster-name string Cluster name to uniquely identify the cluster for role binding listings. --context string CLI context name. --resource string Resource type and identifier using "Prefix:ID" format. --prefix Whether the provided resource name is treated as a prefix pattern. --client-cert-path string Path to client cert to be verified by MDS. Include for mTLS authentication. --client-key-path string Path to client private key, include for mTLS authentication. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") ``` ### On-Premises ```none --role string REQUIRED: Role name of the existing role binding. --principal string REQUIRED: Principal type and identifier using "Prefix:ID" format. --force Skip the deletion confirmation prompt. --kafka-cluster string Kafka cluster ID for the role binding. --schema-registry-cluster string Schema Registry cluster ID for the role binding. --ksql-cluster string ksqlDB cluster ID for the role binding. --connect-cluster string Kafka Connect cluster ID for the role binding. --cmf string Confluent Managed Flink (CMF) ID, which specifies the CMF scope. --flink-environment string Flink environment ID, which specifies the Flink environment scope. --cluster-name string Cluster name to uniquely identify the cluster for role binding listings. --context string CLI context name. --resource string Resource type and identifier using "Prefix:ID" format. --prefix Whether the provided resource name is treated as a prefix pattern. --client-cert-path string Path to client cert to be verified by MDS. Include for mTLS authentication. --client-key-path string Path to client private key, include for mTLS authentication. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") ``` ### On-Premises ```none --principal string Principal ID, which limits role bindings to this principal. If unspecified, list all principals and role bindings. --current-user List role bindings assigned to the current user. --role string Predefined role assigned to "--principal". If "--principal" is unspecified, list all principals assigned the role. --kafka-cluster string Kafka cluster ID, which specifies the Kafka cluster scope. --schema-registry-cluster string Schema Registry cluster ID, which specifies the Schema Registry cluster scope. --ksql-cluster string ksqlDB cluster ID, which specifies the ksqlDB cluster scope. --connect-cluster string Kafka Connect cluster ID, which specifies the Connect cluster scope. --cmf string Confluent Managed Flink (CMF) ID, which specifies the CMF scope. --flink-environment string Flink environment ID, which specifies the Flink environment scope. --client-cert-path string Path to client cert to be verified by MDS. Include for mTLS authentication. 
--client-key-path string Path to client private key, include for mTLS authentication. --context string CLI context name. --cluster-name string Cluster name, which specifies the cluster scope. --resource string Resource type and identifier using "Prefix:ID" format. If specified with "--role" and no principals, list all principals and role bindings. --inclusive List role bindings for specified scopes and nested scopes. Otherwise, list role bindings for the specified scopes. If scopes are unspecified, list only organization-scoped role bindings. -o, --output string Specify the output format as "human", "json", or "yaml". (default "human") ``` ### Step 2: Produce and consume with Confluent CLI The following is an example CLI command to produce to `test-topic`: ```text confluent kafka topic produce test-topic \ --protocol SASL_SSL \ --sasl-mechanism PLAIN \ --bootstrap ":19091,:19092" \ --username admin --password secret \ --ca-location scripts/security/snakeoil-ca-1.crt ``` - Specify `--protocol SASL_SSL` for the SASL_SSL/PLAIN authentication. - Specify `--sasl-mechanism PLAIN`, which is the mechanism used with the SASL_SSL protocol. The default is `PLAIN`, so it can be omitted in this scenario. - `--bootstrap` is the list of hosts that the producer/consumer talks to. The list should be the same as what you configured in Step 1. Hosts should be separated by commas. - `--username` and `--password` are the credentials you have set up in the JAAS configuration. They can be passed as flags, or you can wait for the CLI to prompt for them. The second option is more secure. - `--ca-location` is the path to the CA certificate verifying the broker’s key, and it’s required for SSL verification. For more information about setting up this flag, refer to [this document](https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#ssl). ## Configure cluster for client monitoring Use the following configurations to add and update the properties file of every Kafka broker in the cluster. Considerations: - KRaft properties file: `server.properties` - File system location: `/kafka_2.13-3.8.0/config/kraft/server.properties` 1. Add the following configurations to the properties file of every Kafka broker. Telemetry Reporter configurations to add: ```none confluent.telemetry.external.client.metrics.push.enabled=true confluent.telemetry.external.client.metrics.delta.temporality=false confluent.telemetry.external.client.metrics.subscription.interval.ms.list=60000 confluent.telemetry.external.client.metrics.subscription.metrics.list=org.apache.kafka.consumer.fetch.manager.fetch.latency.avg,org.apache.kafka.consumer.connection.creation.total,org.apache.kafka.consumer.fetch.manager.fetch.total,org.apache.kafka.consumer.fetch.manager.bytes.consumed.rate,org.apache.kafka.producer.bufferpool.wait.ratio,org.apache.kafka.producer.record.queue.time.avg,org.apache.kafka.producer.request.latency.avg,org.apache.kafka.producer.produce.throttle.time.avg,org.apache.kafka.producer.connection.creation.total,org.apache.kafka.producer.request.total,org.apache.kafka.producer.topic.byte.rate ``` 2. Update the following configuration in the properties file of every Kafka broker. 
Telemetry Reporter configurations to update: ```none confluent.telemetry.exporter._c3.metrics.include=io.confluent.kafka.server.request.(?!.*delta).*|io.confluent.kafka.server.socket_server.connections|io.confluent.kafka.server.server.broker.state|io.confluent.kafka.server.replica.manager.leader.count|io.confluent.kafka.server.request.queue.size|io.confluent.kafka.server.broker.topic.failed.produce.requests.rate.1.min|io.confluent.kafka.server.tier.archiver.total.lag|io.confluent.kafka.server.request.total.time.ms.p99|io.confluent.kafka.server.broker.topic.failed.fetch.requests.rate.1.min|io.confluent.kafka.server.broker.topic.total.fetch.requests.rate.1.min|io.confluent.kafka.server.partition.caught.up.replicas.count|io.confluent.kafka.server.partition.observer.replicas.count|io.confluent.kafka.server.tier.tasks.num.partitions.in.error|io.confluent.kafka.server.broker.topic.bytes.out.rate.1.min|io.confluent.kafka.server.request.total.time.ms.p95|io.confluent.kafka.server.controller.active.controller.count|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.total|io.confluent.kafka.server.request.total.time.ms.p999|io.confluent.kafka.server.controller.active.broker.count|io.confluent.kafka.server.request.handler.pool.request.handler.avg.idle.percent.rate.1.min|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.rate.1.min|io.confluent.kafka.server.controller.unclean.leader.elections.rate.1.min|io.confluent.kafka.server.replica.manager.partition.count|io.confluent.kafka.server.controller.unclean.leader.elections.total|io.confluent.kafka.server.partition.replicas.count|io.confluent.kafka.server.broker.topic.total.produce.requests.rate.1.min|io.confluent.kafka.server.controller.offline.partitions.count|io.confluent.kafka.server.socket.server.network.processor.avg.idle.percent|io.confluent.kafka.server.partition.under.replicated|io.confluent.kafka.server.log.log.start.offset|io.confluent.kafka.server.log.tier.size|io.confluent.kafka.server.log.size|io.confluent.kafka.server.tier.fetcher.bytes.fetched.total|io.confluent.kafka.server.request.total.time.ms.p50|io.confluent.kafka.server.tenant.consumer.lag.offsets|io.confluent.kafka.server.session.expire.listener.zookeeper.expires.rate.1.min|io.confluent.kafka.server.log.log.end.offset|io.confluent.kafka.server.broker.topic.bytes.in.rate.1.min|io.confluent.kafka.server.partition.under.min.isr|io.confluent.kafka.server.partition.in.sync.replicas.count|io.confluent.telemetry.http.exporter.batches.dropped|io.confluent.telemetry.http.exporter.items.total|io.confluent.telemetry.http.exporter.items.succeeded|io.confluent.telemetry.http.exporter.send.time.total.millis|io.confluent.kafka.server.controller.leader.election.rate.(?!.*delta).*|io.confluent.telemetry.http.exporter.batches.failed|org.apache.kafka.consumer.(fetch.manager.fetch.latency.avg|connection.creation.total|fetch.manager.fetch.total|fetch.manager.bytes.consumed.rate)|org.apache.kafka.producer.(bufferpool.wait.ratio|record.queue.time.avg|request.latency.avg|produce.throttle.time.avg|connection.creation.total|request.total|topic.byte.rate) ``` 3. Restart every Kafka broker. ### Connect Configuration The Connect properties file (`/CONFLUENT_HOME/etc/schema-registry/connect-avro-distributed.properties`) must be configured to use the same security protocol as the Kafka broker. For this example, `SASL_PLAINTEXT` is used for the producer, consumer, the producer monitoring interceptor, and the consumer monitoring interceptor. 
## Quick Start In this quick start, you will configure the Data Diode Connector to replicate records in the topic `diode` to the topic `dest_diode`. Start the services with one command using Confluent CLI. ```bash |confluent_start| ``` Next, create two topics - `diode` and `dest_diode`. ```bash ./bin/kafka-topics --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic diode ./bin/kafka-topics --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic dest_diode ``` Next, start the console producer and import a few records to the `diode` topic. ```bash ./bin/kafka-console-producer --broker-list localhost:9092 --topic diode ``` Then, add records (one per line) in the console producer. ```bash silicon resistor transistor capacitor amplifier ``` This publishes five records to the Kafka topic `diode`. Keep the window open. Next, load the Source connector (a sketch of the two properties files used here appears at the end of this quick start). ```bash ./bin/confluent local load datadiode-source-connector --config ./etc/kafka-connect-datadiode/DataDiodeSourceConnector.properties ``` Your output should resemble the following: ```bash { "name": "datadiode-source-connector", "config": { "connector.class": "io.confluent.connect.diode.source.DataDiodeSourceConnector", "tasks.max": "1", "kafka.topic.prefix": "dest_", "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter", "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter", "header.converter": "org.apache.kafka.connect.converters.ByteArrayConverter", "diode.port": "3456", "diode.encryption.password": "supersecretpassword", "diode.encryption.salt": "secretsalt" }, "tasks": [], "type": null } ``` Next, load the Sink connector. ```bash ./bin/confluent local load datadiode-sink-connector --config ./etc/kafka-connect-datadiode/DataDiodeSinkConnector.properties ``` Your output should resemble the following: ```bash { "name": "datadiode-sink-connector", "config": { "connector.class": "io.confluent.connect.diode.sink.DataDiodeSinkConnector", "tasks.max": "1", "topics": "diode", "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter", "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter", "header.converter": "org.apache.kafka.connect.converters.ByteArrayConverter", "diode.host": "10.12.13.15", "diode.port": "3456", "diode.encryption.password": "supersecretpassword", "diode.encryption.salt": "secretsalt" }, "tasks": [], "type": null } ``` View the Connect worker log and verify that the connectors started successfully. ```bash confluent local services connect log ``` Finally, check that records are now available in the `dest_diode` topic. ```bash ./bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic dest_diode --from-beginning ``` You should see five records in the consumer. If you have the console producer running, you can create additional records. These additional records should be immediately visible in the consumer.
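If you need to inspect or adjust the connector configuration before loading it (for example, to change the diode host, port, or encryption settings), the two properties files referenced above correspond to the configuration echoed in the output. The following is a rough sketch of equivalent files using the placeholder values from that output; it is illustrative only and not the packaged files themselves.

```bash
# Sketch only: recreate the two Data Diode config files with the values shown
# in the example output above. All values are placeholders.
cat > ./etc/kafka-connect-datadiode/DataDiodeSourceConnector.properties <<'EOF'
name=datadiode-source-connector
connector.class=io.confluent.connect.diode.source.DataDiodeSourceConnector
tasks.max=1
kafka.topic.prefix=dest_
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
header.converter=org.apache.kafka.connect.converters.ByteArrayConverter
diode.port=3456
diode.encryption.password=supersecretpassword
diode.encryption.salt=secretsalt
EOF

cat > ./etc/kafka-connect-datadiode/DataDiodeSinkConnector.properties <<'EOF'
name=datadiode-sink-connector
connector.class=io.confluent.connect.diode.sink.DataDiodeSinkConnector
tasks.max=1
topics=diode
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
header.converter=org.apache.kafka.connect.converters.ByteArrayConverter
diode.host=10.12.13.15
diode.port=3456
diode.encryption.password=supersecretpassword
diode.encryption.salt=secretsalt
EOF
```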
### Property-based example 1. Create a `gcs-source-connector.properties` file with the following contents. This file is included with the connector in `etc/kafka-connect-gcs/gcs-source-connector.properties`. This configuration is typically used with [standalone workers](/platform/current/connect/concepts.html#standalone-workers): ```properties name=gcs-source tasks.max=1 connector.class=io.confluent.connect.gcs.GcsSourceConnector # enter the bucket name and GCS credentials here gcs.bucket.name= gcs.credentials.path= format.class=io.confluent.connect.gcs.format.avro.AvroFormat confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 # for production environments, enter the Confluent license here # confluent.license= ``` 2.
Edit the `gcs-source-connector.properties` file to add the following properties: ```properties transforms=AddPrefix transforms.AddPrefix.type=org.apache.kafka.connect.transforms.RegexRouter transforms.AddPrefix.regex=.* transforms.AddPrefix.replacement=copy_of_$0 ``` #### IMPORTANT Adding this renames the output topic of the messages to `copy_of_gcs_topic`. This prevents a continuous feedback loop of messages. 3. Load the Backup and Restore GCS Source connector. ```bash confluent local load gcs-source --config gcs-source-connector.properties ``` #### IMPORTANT Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) in production environments. 4. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status gcs-source ``` 5. Confirm that the messages are being sent to Kafka. ```bash kafka-avro-console-consumer \ --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 \ --topic copy_of_gcs_topic \ --from-beginning | jq '.' ``` 6. The response should be 9 records as follows. ```bash {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} {"f1": "value4"} {"f1": "value5"} {"f1": "value6"} {"f1": "value7"} {"f1": "value8"} {"f1": "value9"} ``` ### Property-based example 1. Create a `gcs-source-connector.properties` file with the following contents. This file is included with the connector in `etc/kafka-connect-gcs/gcs-source-connector.properties`. This configuration is typically used with [standalone workers](/platform/current/connect/concepts.html#standalone-workers): ```json { "name" : "GCSSourceConnector", "config" : { "format.class": "io.confluent.connect.gcs.format.avro.AvroFormat", "connector.class" : "io.confluent.connect.gcs.GcsSourceConnector", "gcs.bucket.name" : "", "gcs.credentials.path" : "", "tasks.max" : "1", "confluent.topic.bootstrap.servers" : "localhost:9092", "confluent.topic.replication.factor" : "1", "confluent.license" : "" } } ``` 2. Edit the `gcs-source-connector.properties` file to add the following properties: ```json { "transforms" : "AddPrefix", "transforms.AddPrefix.type" : "org.apache.kafka.connect.transforms.RegexRouter", "transforms.AddPrefix.regex" : ".*", "transforms.AddPrefix.replacement" : "copy_of_$0" } ``` Adding the previous properties renames the output topic of the messages to `copy_of_gcs_topic`, which prevents a continuous feedback loop of messages. 3. Load the Generalized GCS Source connector. ```bash confluent local load gcs-source --config gcs-source-connector.properties ``` #### IMPORTANT Don’t use the local CLI commands in a production environment. The [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) is intended for production environments. See [confluent local](https://docs.confluent.io/confluent-cli/current/command-reference/local/confluent_local_current.html#confluent-local-current) for more information about local CLI commands. 4. Verify the connector is in a `RUNNING` state. ```bash confluent local status gcs-source ``` 5. Verify messages are being sent to Kafka. ```bash kafka-avro-console-consumer \ --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 \ --topic copy_of_gcs_topic \ --from-beginning | jq '.' ``` 6.
The response should be 9 records as shown in the following example: ```bash {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} {"f1": "value4"} {"f1": "value5"} {"f1": "value6"} {"f1": "value7"} {"f1": "value8"} {"f1": "value9"} ``` ## Solace Quick Start This quick start uses the JMS Sink connector to consume records from Kafka and send them to a Solace PubSub+ broker. 1. Start the [Solace PubSub+ Standard](https://solace.com/software/) broker. ```bash docker run -d --name "solace" \ -p 8080:8080 -p 55555:55555 \ --shm-size=1000000000 \ --ulimit nofile=2448:38048 \ -e username_admin_globalaccesslevel=admin \ -e username_admin_password=admin \ -e system_scaling_maxconnectioncount=100 \ solace/solace-pubsub-standard:9.1.0.77 ``` 2. Create a Solace Queue in the `default` Message VPN. 1. Once the solace docker container has started, navigate to [http://localhost:8080](http://localhost:8080) in your browser and login with `admin`/`admin`. 2. Select the `default` Message VPN on the home screen. 3. Select “Queues” in the left menu to navigate to the Queues page. 4. On the Queues page, select the “+ Queue” button in the upper right and name the Queue `connector-quickstart`. 3. Install the connector through the [Confluent Hub Client](/kafka-connectors/self-managed/confluent-hub/client.html). ```bash # run from your Confluent Platform installation directory confluent connect plugin install confluentinc/kafka-connect-jms-sink:latest ``` 4. [Download the sol-jms jar](https://mvnrepository.com/artifact/com.solacesystems/sol-jms) and copy it into the JMS Sink connector’s plugin folder. This needs to be done on every Connect worker node and the workers must be restarted to pick up the client jar. 5. Start Confluent Platform. ```bash confluent local start ``` 6. [Produce](https://docs.confluent.io/current/cli/command-reference/confluent-produce.html) test data to the `jms-messages` topic in Kafka. ```bash seq 10 | confluent local produce jms-messages ``` 7. Create a `jms-sink.json` file with the following contents: ```json { "name": "JmsSinkConnector", "config": { "connector.class": "io.confluent.connect.jms.JmsSinkConnector", "tasks.max": "1", "topics": "jms-messages", "java.naming.factory.initial": "com.solacesystems.jndi.SolJNDIInitialContextFactory", "java.naming.provider.url": "smf://localhost:55555", "java.naming.security.principal": "admin", "java.naming.security.credentials": "admin", "connection.factory.name": "/jms/cf/default", "Solace_JMS_VPN": "default", "jms.destination.type": "queue", "jms.destination.name": "connector-quickstart", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "org.apache.kafka.connect.storage.StringConverter", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` 8. Load the JMS Sink connector. ```bash confluent local load jms --config jms-sink.json ``` #### IMPORTANT Don’t use the [Confluent CLI](/confluent-cli/current/index.html) in production environments. 9. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status jms ``` 10. Navigate to the [Solace UI](http://localhost:8080) to confirm the messages were delivered to the `connector-quickstart` queue. ### Capturing Redo logs only 1. If not running, start Confluent Platform. ```text confluent local start ``` 2. Create the following connector configuration JSON file and save the file as `config1.json`. 
Note the following [configuration property](configuration-properties.md#connect-oracle-cdc-source-config) entries: * Configure the connector with a new `name`. * Set `table.topic.name.template` to an empty string. * Set `table.inclusion.regex` to capture several tables. * (Optional) Use `redo.log.topic.name` to rename the redo log. * (Optional) Set `redo.log.corruption.topic` to specify the topic where you want to record corrupted records. ```json { "name": "SimpleOracleCDC_1", "config":{ "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector", "name": "SimpleOracleCDC_1", "tasks.max":1, "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers":"localhost:9092", "oracle.server": "", "oracle.port": 1521, "oracle.sid":"", "oracle.pdb.name":"", "oracle.username": "", "oracle.password": "", "start.from":"snapshot", "redo.log.topic.name": "redo-log-topic-1", "table.inclusion.regex":"", "_table.topic.name.template_":"Set to an empty string to disable generating change event records", "table.topic.name.template": "", "connection.pool.max.size": 20, "confluent.topic.replication.factor":1, "topic.creation.groups": "redo", "topic.creation.redo.include": "redo-log-topic", "topic.creation.redo.replication.factor": 3, "topic.creation.redo.partitions": 1, "topic.creation.redo.cleanup.policy": "delete", "topic.creation.redo.retention.ms": 1209600000, "topic.creation.default.replication.factor": 3, "topic.creation.default.partitions": 5, "topic.creation.default.cleanup.policy": "compact" } } ``` 3. Enter the following command to start the connector: ```text curl -s -X POST -H 'Content-Type: application/json' --data @config1.json http://localhost:8083/connectors | jq ``` 4. Enter the following command to get the connector status: ```text curl -s -X GET -H 'Content-Type: application/json' http://localhost:8083/connectors/SimpleOracleCDC_1/status | jq ``` 5. Verify the following connector operations are successful: - The connector is started with one running task (see the following note). - The connector produces records whenever DML events (`INSERT`, `UPDATE`, AND `DELETE`) occur for captured tables. - The connector does not produce records for tables that were not included in regex or were explicitly excluded with `table.exclusion.regex`. - If the `redo.log.corruption.topic` is configured, the connector sends corrupted records to the specified corruption topic. #### NOTE If using the property `"start.from":"snapshot"`, the redo log topic contains only database operations completed after the connector starts. 6. Enter the following command to check Kafka topics: ```text kafka-topics --list --zookeeper localhost:2181 ``` If there are operations on the tables after the connector starts, you should see the topic configured by the `redo-log-topic` property. If no operations have occurred, there should be nothing displayed other than internal topics. 7. Consume records using the Avro console consumer. ```text kafka-avro-console-consumer --topic redo-log-topic-1 \ --partition 0 --offset earliest --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 ``` If there are operations on the tables after the connector starts, you should see records displayed. If no operations have occurred, there should be no records. 8. 
Check for errors in the log: ```text confluent local services connect log | grep "ERROR" ``` 9. After you finish testing, enter the following command to clean up the running configuration: ```text confluent local services destroy ``` ### Capturing Redo logs and Change Event logs 1. If not running, start Confluent Platform. ```text confluent local start ``` 2. Create the following connector configuration JSON file and save the file as `config2.json`. ```json { "name": "SimpleOracleCDC_2", "config":{ "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector", "name": "SimpleOracleCDC_2", "tasks.max":3, "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers":"localhost:9092", "oracle.server": "", "oracle.port": 1521, "oracle.sid":"", "oracle.pdb.name":"", "oracle.username": "", "oracle.password": "", "start.from":"snapshot", "redo.log.topic.name": "redo-log-topic-2", "redo.log.consumer.bootstrap.servers":"localhost:9092", "table.inclusion.regex":"", "_table.topic.name.template_":"Using template vars to set change event topic for each table", "table.topic.name.template": "${databaseName}.${schemaName}.${tableName}", "connection.pool.max.size": 20, "confluent.topic.replication.factor":1, "topic.creation.groups": "redo", "topic.creation.redo.include": "redo-log-topic-2", "topic.creation.redo.replication.factor": 3, "topic.creation.redo.partitions": 1, "topic.creation.redo.cleanup.policy": "delete", "topic.creation.redo.retention.ms": 1209600000, "topic.creation.default.replication.factor": 3, "topic.creation.default.partitions": 5, "topic.creation.default.cleanup.policy": "compact" } } ``` 3. Create `redo-log-topic-2`. Make sure the topic name matches the value you put for `"redo.log.topic.name"`. ```text bin/kafka-topics --create --topic redo-log-topic-2 \ --bootstrap-server broker:9092 --replication-factor 1 \ --partitions 1 --config cleanup.policy=delete \ --config retention.ms=120960000 ``` 4. Enter the following command to start the connector: ```text curl -s -X POST -H 'Content-Type: application/json' --data @config2.json http://localhost:8083/connectors | jq ``` 5. Enter the following command to get the connector status: ```text curl -s -X GET -H 'Content-Type: application/json' http://localhost:8083/connectors/SimpleOracleCDC_2/status | jq ``` 6. Verify the connector starts with three running tasks. 7. Perform `INSERT`, `UPDATE`, and `DELETE` row operations for each table and verify the following expected results: - The connector creates a redo log topic and change event log topics for each captured table. - Redo log events are generated starting from current time. - The change event log for each table contains snapshot events (`op_type=R`) followed by other types of events. 8. Enter the following command to check Kafka topics: ```text kafka-topics --list --zookeeper localhost:2181 ``` You should see the topic configured by the `redo-log-topic` property and topics in the form of `${databaseName}.${schemaName}.${tableName}`. 9. Consume records using the Avro console consumer. 
```text kafka-avro-console-consumer --topic redo-log-topic-2 \ --partition 0 --offset earliest --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 ``` You should see change event records with `op_type=I` (insert), `op_type=U` (update), or `op_type=D` (delete). 10. Check for errors in the log. ```text confluent local services connect log | grep "ERROR" ``` 11. After you finish testing, enter the following command to clean up the running configuration: ```text confluent local services destroy ``` ### Starting from a specific SCN (without snapshot) 1. If not running, start Confluent Platform. ```text confluent local start ``` 2. Create the following connector configuration JSON file. Save the JSON file using the name `config3.json`. You have to choose an Oracle System Change Number (SCN) that exists with (at minimum) a redo log with the SCN or timestamp. The log has to be applicable for one of the included tables. You can use `SELECT CURRENT_SCN FROM v$database;` to query the current SCN of the database. ```json { "name": "SimpleOracleCDC_3", "config":{ "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector", "name": "SimpleOracleCDC_3", "tasks.max":3, "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers":"localhost:9092", "oracle.server": "", "oracle.port": 1521, "oracle.sid":"", "oracle.pdb.name":"", "oracle.username": "", "oracle.password": "", "_start.from_":"Set to a proper scn or timestamp to start without snapshotting tables", "start.from":"", "redo.log.topic.name": "redo-log-topic-3", "redo.log.consumer.bootstrap.servers":"localhost:9092", "table.inclusion.regex":"", "table.topic.name.template": "${databaseName}.${schemaName}.${tableName}", "connection.pool.max.size": 20, "confluent.topic.replication.factor":1, "topic.creation.groups": "redo", "topic.creation.redo.include": "redo-log-topic-3", "topic.creation.redo.replication.factor": 3, "topic.creation.redo.partitions": 1, "topic.creation.redo.cleanup.policy": "delete", "topic.creation.redo.retention.ms": 1209600000, "topic.creation.default.replication.factor": 3, "topic.creation.default.partitions": 5, "topic.creation.default.cleanup.policy": "compact" } } ``` 3. Create `redo-log-topic-3`. Make sure the topic name matches the value you put for `"redo.log.topic.name"`. ```text bin/kafka-topics --create --topic redo-log-topic-3 \ --bootstrap-server broker:9092 --replication-factor 1 \ --partitions 1 --config cleanup.policy=delete \ --config retention.ms=120960000 ``` 4. Enter the following command to start the connector: ```text curl -s -X POST -H 'Content-Type: application/json' --data @config3.json http://localhost:8083/connectors | jq ``` 5. Enter the following command to get the connector status: ```text curl -s -X GET -H 'Content-Type: application/json' http://localhost:8083/connectors/SimpleOracleCDC_3/status | jq ``` Verify that the connector is started with three running tasks. 6. Perform `INSERT`, `UPDATE`, and `DELETE` row operations for each table and verify the following expected results: - The connector creates a redo log topic and change event log topics for each captured table. - Redo log events are generated starting from the current time.
- The change event log for each table starts from the specified `start.from` value and does not contain snapshot events (`op_type=R`). 7. Enter the following command to check Kafka topics: ```text kafka-topics --list --zookeeper localhost:2181 ``` You should see the topic configured by the `redo-log-topic` property and topics in the form of `${databaseName}.${schemaName}.${tableName}`. 8. Consume records using the Avro console consumer. ```text kafka-avro-console-consumer --topic redo-log-topic-3 \ --partition 0 --offset earliest --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 ``` You should see redo log records. Because the connector starts from the specified SCN, the table-specific topics do not contain snapshot records (`op_type=R`). 9. Check for errors in the log. ```text confluent local services connect log | grep "ERROR" ``` 10. After you finish testing, enter the following command to clean up the running configuration: ```text confluent local services destroy ``` #### Procedure 1. If not running, start Confluent Platform. ```text confluent local start ``` 2. Create the following connector configuration JSON file and save it as `config4.json`. You have to choose an Oracle System Change Number (SCN) that exists with (at minimum) a redo log with the SCN or timestamp. The log has to be applicable for one of the included tables. You can use `SELECT CURRENT_SCN FROM v$database;` to query the current SCN of the database. ```json { "name": "SimpleOracleCDC_4", "config":{ "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector", "name": "SimpleOracleCDC_4", "tasks.max":2, "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers":"localhost:9092", "oracle.server": "", "oracle.port": 1521, "oracle.sid":"", "oracle.pdb.name":"", "oracle.username": "", "oracle.password": "", "start.from":"snapshot", "redo.log.topic.name": "redo-log-topic-4", "redo.log.consumer.bootstrap.servers":"localhost:9092", "table.inclusion.regex":"", "table.topic.name.template": "${databaseName}.${schemaName}.${tableName}", "_lob.topic.name.template_": "Using template vars to set lob topic for each table", "lob.topic.name.template": "${tableName}.${columnName}_topic", "connection.pool.max.size": 20, "confluent.topic.replication.factor":1, "topic.creation.groups": "redo", "topic.creation.redo.include": "redo-log-topic-4", "topic.creation.redo.replication.factor": 3, "topic.creation.redo.partitions": 1, "topic.creation.redo.cleanup.policy": "delete", "topic.creation.redo.retention.ms": 1209600000, "topic.creation.default.replication.factor": 3, "topic.creation.default.partitions": 5, "topic.creation.default.cleanup.policy": "compact" } } ``` 3. Create `redo-log-topic-4`. Make sure the topic name matches the value you put for `"redo.log.topic.name"`. ```text bin/kafka-topics --create --topic redo-log-topic-4 \ --bootstrap-server broker:9092 --replication-factor 1 \ --partitions 1 --config cleanup.policy=delete \ --config retention.ms=120960000 ``` 4. Enter the following command to start the connector: ```text curl -s -X POST -H 'Content-Type: application/json' --data @config4.json http://localhost:8083/connectors | jq ``` 5. Enter the following command and verify that the connector is started with two running tasks.
```text curl -s -X GET -H 'Content-Type: application/json' http://localhost:8083/connectors/SimpleOracleCDC_4/status | jq ``` 6. Perform `INSERT`, `UPDATE`, and `DELETE` row operations for each table and verify the following expected results: - The redo log topic is created. - Change event topics are created for each captured table. - LOB topics are created for each LOB column. - The key of the LOB topic records contain the following information: ```text { "table", "dot.separated.fully.qualified.table.name", "column", "column.name.of.LOB.column", "primary_key", "primary.key.of.change.event.topic.after.applying.${key.template}" } ``` - The value of the LOB topic is the LOB value. - When a row is deleted from the table, the corresponding LOB is deleted from the LOB topic. The connector writes a tombstone record (null value) to the LOB topic. 7. After you finish testing, enter the following command to clean up the running configuration: ```text confluent local services destroy ``` ### Capturing Redo logs and Snapshot with Supplemental logging only 1. If not running, start Confluent Platform. ```text confluent local start ``` 2. Create the following connector configuration JSON file and save the file as `config1.json`. Note the following [configuration property](configuration-properties.md#connect-oracle-cdc-source-config) entries: * Configure the connector with a new `name`. * Set `table.inclusion.regex` to capture several tables. * (Optional) Set `redo.log.corruption.topic` to specify the topic where you want to record corrupted records. ```json { "name": "SimpleOracleCDC_8", "config":{ "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector", "name": "SimpleOracleCDC_8", "tasks.max": 3, "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers": "localhost:9092", "oracle.server": "", "oracle.port": 1521, "oracle.sid": "", "oracle.pdb.name": "", "oracle.username": "", "oracle.password": "", "oracle.supplemental.log.level": "msl", "start.from": "snapshot", "redo.log.topic.name": "redo-log-topic-2", "redo.log.consumer.bootstrap.servers": "localhost:9092", "table.inclusion.regex": "", "_table.topic.name.template_": "Using template vars to set change event topic for each table", "table.topic.name.template": "${databaseName}.${schemaName}.${tableName}", "connection.pool.max.size": 20, "confluent.topic.replication.factor": 1, "topic.creation.groups": "redo", "topic.creation.redo.include": "redo-log-topic-8", "topic.creation.redo.replication.factor": 3, "topic.creation.redo.partitions": 1, "topic.creation.redo.cleanup.policy": "delete", "topic.creation.redo.retention.ms": 1209600000, "topic.creation.default.replication.factor": 3, "topic.creation.default.partitions": 5, "topic.creation.default.cleanup.policy": "compact" } } ``` 3. Enter the following command to start the connector: ```text curl -s -X POST -H 'Content-Type: application/json' --data @config1.json http://localhost:8083/connectors | jq ``` 4. Enter the following command to get the connector status: ```text curl -s -X GET -H 'Content-Type: application/json' http://localhost:8083/connectors/SimpleOracleCDC_8/status | jq ``` 5. Verify the following connector operations are successful: - The connector is started with three running tasks (see the following note). 
- The connector produces snapshot records for captured tables. #### NOTE If using the property `"start.from":"snapshot"`, the redo log topic contains only database operations completed after the connector starts. 6. Enter the following command to check Kafka topics: ```text kafka-topics --list --zookeeper localhost:2181 ``` You should see the topic configured by the `redo-log-topic` property and topics in the form of `${databaseName}.${schemaName}.${tableName}`. 7. Consume records using the Avro console consumer. ```text kafka-avro-console-consumer --topic redo-log-topic-8 \ --partition 0 --offset earliest --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 ``` If there are operations on the tables after the connector starts, you should see records displayed. If no operations have occurred, there should be no records. 8. Check for errors in the log: ```text confluent local services connect log | grep "ERROR" ``` 9. After you finish testing, enter the following command to clean up the running configuration: ```text confluent local services destroy ``` ## PostgreSQL Example This section includes an example of how to move records from Oracle Database to PostgreSQL using the Oracle CDC Source and the JDBC Sink connectors. 1. Create an Oracle CDC Source connector. The following configuration will create a snapshot and store new changes (inserts) to a table-specific topic called `ORCLCDB.C__MYUSER.USERS` ```json { "name": "SimpleOracleCDC_DEMO", "config":{ "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector", "name": "SimpleOracleCDC_DEMO", "tasks.max":3, "key.converter":"org.apache.kafka.connect.storage.StringConverter", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers":"localhost:9092", "oracle.server": "localhost", "oracle.port": 1521, "oracle.sid":"ORCLCDB", "oracle.username": "C##MYUSER", "oracle.password": "mypassword", "start.from":"snapshot", "redo.log.topic.name": "redo-log-topic", "redo.log.consumer.bootstrap.servers":"localhost:9092", "table.inclusion.regex": ".*USERS.*", "table.topic.name.template": "${databaseName}.${schemaName}.${tableName}", "connection.pool.max.size": 20, "confluent.topic.replication.factor":1, "lob.topic.name.template":"${databaseName}.${schemaName}.${tableName}.${columnName}", "redo.log.row.fetch.size":1, "numeric.mapping": "best_fit" } } ``` A sample record in `ORCLCDB.C__MYUSER.USERS`: ```json {"ID":241,"FIRST_NAME":{"string":"Lettie"},"LAST_NAME":{"string":"Kaplan"},"EMAIL":{"string":"Lettie.Kaplan@utvel.us"},"GENDER":{"string":"male"},"CLUB_STATUS":{"string":"active"},"COMMENTS":{"string":"Confluent"},"UPDATE_TS":{"long":1623831883974},"table":{"string":"ORCLCDB.C##MYUSER.USERS"},"scn":{"string":"1450183"},"op_type":{"string":"I"},"op_ts":{"string":"1623857084000"},"current_ts":{"string":"1623831886610"},"row_id":{"string":"AAAR9JAAHAAAACFAAg"},"username":{"string":"C##MYUSER"}} ``` 2. Create a JDBC Sink connector. 
You can use a [Single Message Transform (SMT)](/platform/current/connect/concepts.html#transforms) to drop the prefix `ORCLCDB.C__MYUSER.`, enabling the connector to upsert records to a `USERS` table in PostgreSQL as shown in the following example: ```json { "name": "jdbc_sink_postgres_demo", "config": { "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector", "connection.url": "jdbc:postgresql://:5432/", "connection.user": "", "connection.password": "", "tasks.max": "2", "topics": "ORCLCDB.C__MYUSER.USERS", "auto.create": "true", "auto.evolve": "true", "dialect.name": "PostgreSqlDatabaseDialect", "insert.mode": "upsert", "pk.mode": "record_value", "pk.fields":"ID", "batch.size": 1, "key.converter":"org.apache.kafka.connect.storage.StringConverter", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "transforms":"dropPrefix", "transforms.dropPrefix.type":"org.apache.kafka.connect.transforms.RegexRouter", "transforms.dropPrefix.regex":"ORCLCDB.C__MYUSER.(.*)", "transforms.dropPrefix.replacement":"$1", "errors.tolerance":"all", "errors.deadletterqueue.topic.name":"dlq-jdbc-sink", "errors.deadletterqueue.context.headers.enable": "true", "errors.deadletterqueue.topic.replication.factor":"1" } } ``` 3. Check PostgreSQL and verify the data looks similar to the following: ![Oracle CDC connector PostgreSQL example](images/PostgreSQL_example.png) #### NOTE If the redo log topic updates are not propagated to the table topic, check the following: - If security is enabled on the Kafka cluster, you must configure `redo.log.consumer.*` accordingly. - If the topic has at least one record produced by the connector that is currently running, make sure that the database rows have changed on the corresponding table since the initial snapshot for the table was taken. - Ensure the `table.inclusion.regex` configuration property matches the fully qualified table name (for example, `dbo.Users`) and the regular expression in the `table.exclusion.regex` configuration property does not match the fully qualified table name. - When `Supplemental Log` is turned on for a database or multiple tables, it might take time for a connector to catch up on reading a redo log and to find relevant records. Check the current SCN in the database with `SELECT CURRENT_SCN FROM V$DATABASE` and compare it with the last SCN the connector processed or saw in a connect-offsets topic (the topic name could be different depending on the setup) or in TRACE logs. If there is a huge gap, consider increasing `redo.log.row.fetch.size` to 100, 1000, or even a larger number. To enable TRACE logging for the connector, use: ```text curl -s -X PUT -H "Content-Type:application/json" \ http://localhost:8083/admin/loggers/io.confluent.connect.oracle \ -d '{"level": "TRACE"}' \ | jq '.' ``` ## Quick Start The RabbitMQ Sink connector streams records from Kafka topics to a RabbitMQ exchange with high throughput. This quick start shows example data production and consumption setups in detail. 1. Start the [RabbitMQ Server](https://www.rabbitmq.com/download.html) broker, specifying the Docker image based on the required RabbitMQ version. ```bash docker run -it --rm --name rabbitmq \ -p 5672:5672 \ -p 15672:15672 \ rabbitmq:3.8.4-management ``` 2. Create a RabbitMQ exchange. To produce messages from Kafka to RabbitMQ, you also create a queue and binding. 1. Once the RabbitMQ Docker container has started, navigate to [http://localhost:15672](http://localhost:15672) in your browser and log in with `guest`/`guest`. 2.
In the `Exchanges` tab click on `Add a new exchange`. Name it `exchange1` and leave other options as the default settings. 3. In the `Queues` tab click on `Add a new queue`. Name it `queue1` and leave other options as the default settings. 4. In the `Exchanges` tab click on the exchange created `exchange1`. In the `Bindings` section add a binding in the field `To queue` to `queue1` with routing key `rkey1`. 3. Install the connector through the [Confluent Hub Client](/kafka-connectors/self-managed/confluent-hub/client.html). ```bash # run from your Confluent Platform installation directory confluent connect plugin install confluentinc/kafka-connect-rabbitmq-sink:latest ``` 4. Start Confluent Platform. ```bash confluent local start ``` 5. [Produce](https://docs.confluent.io/current/cli/command-reference/confluent-produce.html) test data to a pre-created `rabbitmq-messages` topic in Kafka. ```bash seq 10 | confluent local produce rabbitmq-messages ``` 6. Create a `rabbitmq-sink.json` file with the following contents: ```json { "name": "RabbitMQSinkConnector", "config": { "connector.class": "io.confluent.connect.rabbitmq.sink.RabbitMQSinkConnector", "tasks.max": "1", "topics": "rabbitmq-messages", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "rabbitmq.host": "localhost", "rabbitmq.port": "5672", "rabbitmq.username": "guest", "rabbitmq.password": "guest", "rabbitmq.exchange": "exchange1", "rabbitmq.routing.key": "rkey1", "rabbitmq.delivery.mode": "PERSISTENT" } } ``` 7. Load the RabbitMQ Sink connector. ```bash confluent local load RabbitMQSinkConnector --config rabbitmq-sink.json ``` #### IMPORTANT Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) in production environments. 8. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status RabbitMQSinkConnector ``` 9. Navigate to the [RabbitMQ UI](http://localhost:15672) to confirm the messages were delivered to the `queue1` queue. ## Quick Start 1. Install [Redis](https://redis.io/topics/quickstart). 2. Start the Redis server so it can start listening for Redis connections. This starts Redis using the default port 6379 and no password (for testing purposes only). > ```bash > redis-server > ``` 3. Use the Redis CLI (`redis-cli`) to view any insertions being made. You can use the `MONITOR` command if the instance is being used only for this quick start test (see the note below). ```text redis-cli MONITOR ``` #### IMPORTANT The `MONITOR` CLI command is a debugging command that streams back every command processed by the Redis server. It assists you in understanding what is happening to the database. However, using it comes at a performance cost. **Do not use this in production environments.** 4. Install the connector. See [installation instructions](#redis-sink-connector-install) for details. 5. Start the Confluent Platform. ```bash confluent local start ``` #### IMPORTANT Ensure your start the Confluent Platform after installing the connector. If not, you must restart the Connect workers to register the installation and to add the new connector location to the path. 6. Ensure the installed connector has been identified by the Confluent Platform. ```bash confluent local services connect plugin list ``` 7. 
[Produce](https://docs.confluent.io/current/cli/command-reference/confluent-produce.html) test data to the `users` topic in Kafka. ```bash echo key1,value1 | confluent local produce users --property parse.key=true --property key.separator=, echo key2,value2 | confluent local produce users --property parse.key=true --property key.separator=, echo key3,value3 | confluent local produce users --property parse.key=true --property key.separator=, ``` #### IMPORTANT This connector expects non-null keys. The `parse.key` and `key.separator` properties ensure the exported records have explicit keys and values 8. Create a `redis-sink.properties` file with the following properties: ```text name=kafka-connect-redis topics=users tasks.max=1 connector.class=com.github.jcustenborder.kafka.connect.redis.RedisSinkConnector key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.storage.StringConverter ``` 9. Start the connector. ```bash confluent local load kafka-connect-redis --config redis-sink.properties ``` 10. Ensure the connector status is `RUNNING`. ```bash confluent local status kafka-connect-redis ``` 11. Observe that data is flowing and the keys and values being inserted into Kafka are going to the desired Redis instance. 12. Shut down Confluent Platform. ```bash confluent local destroy ``` 13. Stop the `redis-server` and `redis-cli` (use Ctrl+C). ## Quick start This Quick start uses the Splunk S2S Source connector to receive data from the Splunk UF and ingests it into Kafka. 1. Install the connector using the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). ```text # run from your CP installation directory confluent connect plugin install confluentinc/kafka-connect-splunk-s2s:latest ``` 2. Start the Confluent Platform. ```bash confluent local start ``` 3. Create a `splunk-s2s-source.properties` file with the following contents: ```text name=splunk-s2s-source tasks.max=1 connector.class=io.confluent.connect.splunk.s2s.SplunkS2SSourceConnector splunk.s2s.port=9997 kafka.topic=splunk-s2s-events key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.json.JsonConverter key.converter.schemas.enable=false value.converter.schemas.enable=false confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 ``` 4. Load the Splunk S2S Source connector. ```bash confluent local load splunk-s2s-source --config splunk-s2s-source.properties ``` Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) in production environments. 5. Confirm the connector is in a `RUNNING` state. ```bash confluent local status splunk-s2s-source ``` 6. Start a Splunk UF by running the Splunk UF Docker container. ```bash docker run -d -p 9998:9997 -e "SPLUNK_START_ARGS=--accept-license" -e "SPLUNK_PASSWORD=password" --name splunk-uf splunk/universalforwarder:9.0.0 ``` 7. Create a `splunk-s2s-test.log` file with the following sample log events: ```text log event 1 log event 2 log event 3 ``` 8. Copy the `splunk-s2s-test.log` file to the Splunk UF Docker container using the following command: ```bash docker cp splunk-s2s-test.log splunk-uf:/opt/splunkforwarder/splunk-s2s-test.log ``` 9. Configure the UF to monitor the `splunk-s2s-test.log` file: ```bash docker exec -it splunk-uf sudo ./bin/splunk add monitor -source /opt/splunkforwarder/splunk-s2s-test.log -auth admin:password ``` 10. 
Configure the UF to connect to the Splunk S2S Source connector: - **For Mac/Windows systems**: ```bash docker exec -it splunk-uf sudo ./bin/splunk add forward-server host.docker.internal:9997 ``` - **For Linux systems**: ```bash docker exec -it splunk-uf sudo ./bin/splunk add forward-server 172.17.0.1:9997 ``` 11. Verify the data was ingested into the Kafka topic. To look for events from a monitored file (`splunk-s2s-test.log`) in the Kafka topic, run the following command: ```text kafka-console-consumer --bootstrap-server localhost:9092 --topic splunk-s2s-events --from-beginning | grep 'log event' ``` #### NOTE When you use the previous command without `grep`, you will see many Splunk internal events ingested into the Kafka topic because the Splunk UF sends internal Splunk log events to the connector by default. 12. Shut down Confluent Platform. ```bash confluent local destroy ``` 13. Shut down the Docker container. ```bash docker stop splunk-uf docker rm splunk-uf ``` #### IMPORTANT The default port used by a Splunk HEC is `8088`. However, the ksqlDB component of Confluent Platform also uses that port. For this quick start, since both Splunk and Confluent Platform will be running, we configure the HEC to use port `8889`. If that port is in use by another process, change `8889` to a different, open port. 1. Start a Splunk Enterprise instance by running the Splunk Docker container. ```bash docker run -d -p 8000:8000 -p 8889:8889 -e "SPLUNK_START_ARGS=--accept-license" -e "SPLUNK_PASSWORD=password" --name splunk splunk/splunk:7.3.0 ``` 2. Open [http://localhost:8000](http://localhost:8000) to access Splunk Web. Log in with username `admin` and password `password`. 3. Configure a Splunk HEC using Splunk Web. - Click **Settings** > **Data Inputs**. - Click **HTTP Event Collector**. - Click **Global Settings**. - In the All Tokens toggle button, select **Enabled**. - Ensure **SSL disabled** is checked. - Change the HTTP Port Number to **8889**. - Click **Save**. - Click **New Token**. - In the **Name** field, enter a name for the token: `kafka` - Click **Next**. - Click **Review**. - Click **Submit**. #### IMPORTANT Note the token value on the **Token has been created successfully** page. This token value is needed for the connector configuration later. 4. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). ```bash # run from your Confluent Platform installation directory confluent connect plugin install splunk/kafka-connect-splunk:latest ``` 5. Start Confluent Platform. ```bash confluent local start ``` 6. [Produce](https://docs.confluent.io/current/cli/command-reference/confluent-produce.html) test data to the `splunk-qs` topic in Kafka. ```bash echo event 1 | confluent local produce splunk-qs echo event 2 | confluent local produce splunk-qs ``` 7. Create a `splunk-sink.properties` file with the properties below. Substitute `` with the Splunk HEC token created earlier. ```properties name=SplunkSink topics=splunk-qs tasks.max=1 connector.class=com.splunk.kafka.connect.SplunkSinkConnector splunk.indexes=main splunk.hec.uri=http://localhost:8889 splunk.hec.token= splunk.sourcetypes=my_sourcetype confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 value.converter=org.apache.kafka.connect.storage.StringConverter ``` 8. Start the connector. ```bash confluent local load splunk --config splunk-sink.properties ``` 9.
In the Splunk user interface, verify that data is flowing into your Splunk platform instance by searching using the search parameter `source="http:kafka"`. 10. Shut down Confluent Platform. ```bash confluent local destroy ``` 11. Shut down the Docker container. ```bash docker stop splunk docker rm splunk ``` ## Configure KRaft controllers To create a KRaft controller, create and configure a KRaftController CR. The following shows the key CR settings: ```yaml kind: KRaftController metadata: name: --- [1] namespace: --- [2] spec: replicas: --- [3] listeners: controller: --- [4] authentication: --- [5] tls: enabled: --- [6] externalAccess: --- [7] type: --- [8] controllerQuorumVoters: --- [9] configOverrides: --- [10] server: - default.replication.factor= --- [11] - listener --- [12] dependencies: schemaRegistry: --- [13] ``` * [1] Required. The name of this KRaft controller. * [2] The namespace of this KRaft controller. * [3] The desired number of replicas. Must be an odd number that is 3 or higher. A change to this setting will roll the cluster. * [4] Required. Communication to and among the KRaft controller nodes happens over this controller listener. * [5] See [Authentication for Kafka and KRaft](co-authenticate-kafka.md#co-authenticate-kafka) for configuring authentication. * [6] Set to `true` to enable TLS. See [Network encryption](co-network-encryption.md#co-network-encryption) for configuring TLS certificates. * [7] defines the external access configuration for the Kafka cluster. * [8] Required if `externalAccess` ([7]) is specified. Set to the Kubernetes service for external access. Valid options are `loadBalancer`, `nodePort`, `route`, `staticForPortBasedRouting`, and `staticForHostBasedRouting`. For details on external access configuration, see [Network encryption](co-networking-overview.md#co-networking-overview). * [9] Required for multi-region deployment. Follow the further configuration steps in [Set up MRC with KRaft](co-multi-region.md#co-mrc-kraft). * [10] Required. Use the `configOverrides` to set the matching properties as set in the Kafka CR. The following properties are required: [11], [12] * [11] Required. The default replication factor is the number of Kafka replicas. This parameter is needed for the KRaft controller to interact with the Kafka brokers for some features, such as Self-Balancing and metrics reporter. You must explicitly set it for KRaft to match the number of Kafka replicas (`spec.replicas` in the Kafka CR). Use `configOverrides` to set the property because the property is not directly supported in the KRaft CR. For example: ```yaml spec: configOverrides: server: - default.replication.factor=3 ``` * [12] Required for when authentication is enabled for Kafka replication listeners. Set the same properties set in the Kafka CR, under `spec.listeners.replication.authentication`. 
The following are sample configurations for mTLS authentication among replication listeners: ```yaml spec: configOverrides: server: - listener.name.replication.ssl.client.auth=required - listener.name.replication.ssl.key.password=${file:/vault/secrets/kafka-tls/jksPassword.txt:jksPassword} - listener.name.replication.ssl.keystore.location=/vault/secrets/kafka-tls/keystore.jks - listener.name.replication.ssl.keystore.password=${file:/vault/secrets/kafka-tls/jksPassword.txt:jksPassword} - listener.name.replication.ssl.principal.mapping.rules=RULE:.*CN[\s]?=[\s]?([a-zA-Z0-9._]*)?.*/$1/ - listener.name.replication.ssl.truststore.location=/vault/secrets/kafka-tls/truststore.jks - listener.name.replication.ssl.truststore.password=${file:/vault/secrets/kafka-tls/jksPassword.txt:jksPassword} - listener.security.protocol.map=CONTROLLER:SSL,REPLICATION:SSL ``` * [13] Required when the Kafka CR has a dependency on Schema Registry in the `spec.dependencies.schemaRegistry` section. Set the same Schema Registry dependency settings you set in the Kafka CR here. An example KRaftController CR: ```yaml kind: KRaftController metadata: name: kcontroller namespace: operator spec: replicas: 3 listeners: controller: authentication: type: plain jaasConfig: secretRef: kraft-secret tls: enabled: true dependencies: schemaRegistry: authentication: basic: secretRef: kafka-sr-credential type: basic tls: enabled: true configOverrides: server: - default.replication.factor=3 ``` ## Options ```none --all gather confluent-platform information (default true) --exclude-kubectl-misc exclude kubectl misc information. --exclude-logs exclude all pod logs. --exclude-pdb exclude pdb information. --exclude-pv-pvc exclude pv and pvc information. --follow-logs-duration int Follow pod logs similar to kubectl logs -f for a given time in second -h, --help help for support-bundle --include-kernel-params gather information about the kernel params. --include-namespace gather information about the namespace. --include-nodes include node information. --only-application-resources gather confluent-platform application resources information --only-cluster-resources gather confluent-platform cluster resources information --only-clusterlink gather only cluster link information. --only-confluentrolebinding gather only confluent role binding information. --only-connect gather only connect clusters information. --only-connector gather only connector information. --only-controlcenter gather only controlcenter clusters information. --only-flink gather only flink information. --only-gateway gather only gateway information. --only-kafka gather only kafka clusters information. --only-kafkarestclass gather only kafka rest class information. --only-kafkarestproxy gather only kafka rest proxy cluster's information. --only-kafkatopic gather only kafka topic information. --only-kraftcontroller gather only kraft controller cluster's information. --only-kraftmigrationjob gather only kraft migration job information. --only-ksqldb gather only ksqldb clusters information. --only-schema gather only schema information. --only-schemaexporter gather only schema exporter information. --only-schemaregistry gather only schemaregistry clusters information. --only-usmagent gather only USM agent information. --only-zk gather only zookeeper clusters information. --out-dir string directory where the support-bundle will be created; defaults to user's current directory if not configured. ``` ### Step 4: Install Confluent Platform 1. 
Deploy the KRaft controller and the Kafka brokers: ```bash kubectl apply -f $TUTORIAL_HOME/confluent-platform-c3++.yaml ``` 2. Install the sample producer app and topic: ```bash kubectl apply -f $TUTORIAL_HOME/producer-app-data.yaml ``` 3. Wait until all the Confluent Platform components are deployed and running: ```bash kubectl get pods ``` In this tutorial, the following components are being deployed: KRaft controller, Kafka, Connect, Schema Registry, ksqlDB, REST Proxy, Control Center. ### Step 4: Install Confluent Platform 1. Install all Confluent Platform components: ```bash kubectl apply -f $TUTORIAL_HOME/confluent-platform.yaml ``` 2. Install the sample producer app and topic: ```bash kubectl apply -f $TUTORIAL_HOME/producer-app-data.yaml ``` 3. Wait until all the Confluent Platform pods are deployed and running: ```bash kubectl get pods ``` In this tutorial, the following components are being deployed: ZooKeeper, Kafka, Connect, Schema Registry, ksqlDB, REST Proxy, Confluent Control Center (Legacy). ### Required Configuration Properties * `bootstrap.servers` - A list of host/port pairs to use for establishing the initial connection to your Apache Kafka® cluster (of the form: `host1:port1,host2:port2,....`). Note that the client will make use of all servers in your cluster irrespective of which servers are specified via this property for bootstrapping. You may want to specify more than one in case one of the servers in your list is down at the time of initialization. * `client.id` - Under the hood, the Confluent JMS Client makes use of one or more Kafka clients for communication with your Kafka cluster. The `client.id` of these clients is set to the value of this configuration property appended with a globally unique id (guid). The `client.id` string is passed to the server when making requests and is useful for debugging purposes. * `confluent.license` - A license key string provided to you by Confluent under the terms of a Confluent Enterprise subscription agreement. If not specified, you may use the client for a trial period of 30 days after which it will stop working. * `confluent.topic` - Name of the Kafka topic used for Confluent configuration, including licensing information. The default name for this topic is `_confluent-command`. To learn more, see [License topic configuration](/platform/current/connect/license.html#license-topic-configuration) and [License topic ACLs](/platform/current/connect/license.html#license-topic-acls). * `confluent.topic.replication.factor` - The replication factor for the Kafka topic used for Confluent configuration, including licensing information. This is used only if the topic does not already exist, and the default of three is appropriate for production use. If you are using a development environment with less than three brokers, you must set this to the number of brokers (e.g. 1). Configuration properties are set in the same way as any other Kafka client: ```java Properties props = new Properties(); props.put("bootstrap.servers", "localhost:9092"); props.put("confluent.topic", "foo_confluent-command"); props.put("confluent.topic.replication.factor", "3"); props.put("client.id", "my-jms-client"); ``` ### Optional Configuration Properties * `allow.out.of.order.acknowledge` - If true, does not throw an exception if a message is acknowledged out of order (which implicitly acknowledges any messages before it). Default value is `false`.
* `jms.fallback.message.type` - If the JMS Message type header is not associated with a message, fall back to this message type. * `consumer.group.id` - A string that uniquely identifies the group of consumer processes to which this client belongs. If not specified, this defaults to `confluent-jms` in the case of queues and `confluent-jms-{uuid}` in the case of topics, where {uuid} is a unique value for each consumer. This naming strategy provides load balancer semantics in the case of queues and publish-subscribe semantics in the case of topics, as required by the JMS Specification. * `jms.consumer.poll.timeout.ms` - The maximum length of time Kafka consumers should block when retrieving records from Kafka. You should not need to adjust this value. * `jms.consumer.close.timeout.ms` - The maximum number of milliseconds to wait for a clean shutdown when closing a `MessageConsumer`. * `message.listener.null.wait.ms` - The number of milliseconds to wait before polling Kafka for new messages if no messages were retrieved in a message listener poll loop. Reducing this value will improve consume latency in low throughput scenarios at the expense of higher network/CPU overhead. * `connection.stop.timeout.ms` - The maximum number of milliseconds to wait for the message listener threads to shut down cleanly when `connection.stop()` has been called. * `jms.create.connection.ignore.authenticate` - If true, connection creation methods on `ConnectionFactory` that have username and password parameters will fall through to the corresponding methods that do not have these parameters (the parameters will be ignored). If false, use of these methods will result in a JMSException being thrown. * `message.listener.max.redeliveries` - The maximum number of times a message will be redelivered to a `MessageConsumer` listener when the session is in AUTO_ACKNOWLEDGE mode. Default value is 10. ### Standard Kafka Configuration Properties (Optional) All of the configuration properties of the underlying Java Kafka client library may be specified. Simply prefix the desired property with `producer.` or `consumer.` as appropriate. For example: ```java props.put("producer.linger.ms", "1"); props.put("consumer.heartbeat.interval.ms", "1000"); ``` ### Enabling TLS Encryption (Optional) Security settings match those of the native Kafka Java producer and consumer. Security settings are applied to both production and consumption of messages (you do not need to prefix security settings with `consumer.` or `producer.`). If client authentication is not required in the broker, then the following is a minimal configuration example: ```java props.put("security.protocol", "SSL"); props.put("ssl.truststore.location", "/var/private/ssl/kafka.client.truststore.jks"); props.put("ssl.truststore.password", "test1234"); ``` If client authentication is required, then a keystore must be created as in step 1, and the following must also be configured: ```java props.put("ssl.keystore.location", "/var/private/ssl/kafka.client.keystore.jks"); props.put("ssl.keystore.password", "test1234"); props.put("ssl.key.password", "test1234"); ``` ### Broker removal cannot complete due to offline partitions Broker removal can also fail in cases where taking a broker down will result in having fewer online brokers than the number of replicas required in your configurations.
The broker status (available with [kafka-remove-brokers](configuration-options.md#sbc-command-remove-brokers) `--describe`) will remain as follows, until you restart one or more of the offline brokers: ```bash [2020-09-17 23:40:53,743] WARN [AdminClient clientId=adminclient-1] Connection to node -5 (localhost/127.0.0.1:9096) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient) Broker 1 removal status: Partition Reassignment: IN_PROGRESS Broker Shutdown: COMPLETE ``` A short-hand way of troubleshooting this is to ask “how many brokers are down?” and “how many replicas/replication factors must the cluster support?” Partition reassignment (the [last phase in broker removal](configuration-options.md#sbc-broker-removal-phases)) will fail to complete in any case where you have `n` brokers down, and your configuration requires `n + 1` or more replicas. Alternatively, you can consider how many online brokers you need to support the required number of replicas. If you have `n` brokers online, these can support at most a total of `n` replicas. **Solution:** The solution is to restart the down brokers, and perhaps modify the cluster configuration as a whole. This might include both adding brokers and modifying replicas/replication factors (see example below). Scenarios that lead to this problem can be a combination of under-replicated topics and topics with too many replicas for the number of online brokers. Having a topic with a replication factor of 1 does not necessarily lead to a problem in and of itself. A quick way to get an overview of configured replicas on a running cluster is to use `kafka-topics --describe` on a specified topic, or on the whole cluster (with no topic specified). For system topics, you can scan the replication factors and replicas configured by the system properties (which generate the system topics). The [Tutorial: Add and Remove Brokers with Self-Balancing in Confluent Platform](sbc-tutorial.md#sbc-tutorial) covers these commands, replicas/replication factors, and the impact of these configurations. ### Sink tasks The previous section described how to implement a simple `SourceTask`. Unlike `SourceConnector` and `SinkConnector`, `SourceTask` and `SinkTask` have very different interfaces because `SourceTask` uses a pull interface and `SinkTask` uses a push interface. Both share the common lifecycle methods, but the `SinkTask` interface is quite different: ```java public abstract class SinkTask implements Task { ... [ lifecycle methods omitted ] ... public void initialize(SinkTaskContext context) { this.context = context; } public abstract void put(Collection<SinkRecord> records); public abstract void flush(Map<TopicPartition, OffsetAndMetadata> offsets); public void open(Collection<TopicPartition> partitions) {} public void close(Collection<TopicPartition> partitions) {} } ``` The [SinkTask documentation](/platform/current/connect/javadocs/javadoc/org/apache/kafka/connect/sink/SinkTask.html) contains full details, but this interface is nearly as simple as the `SourceTask`. The `put()` method should contain most of the implementation, accepting sets of `SinkRecords`, performing any required translation, and storing them in the destination system. This process does not need to ensure the data has been fully written to the destination system before returning. In fact, in many cases some internal buffering will be useful so an entire batch of records can be sent at once (much like Kafka’s producer), reducing the overhead of inserting events into the downstream data store.
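To make this concrete, the following is a minimal sketch (not taken from the Connect Javadoc) of a `SinkTask` whose `put()` simply buffers incoming records and whose `flush()` is where the buffered batch would be written out, as described next. The class name and comments are illustrative only; the no-op bodies stand in for whatever client library your destination system requires.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;

public class BufferingSinkTask extends SinkTask {   // illustrative class name

  private final List<SinkRecord> buffer = new ArrayList<>();

  @Override
  public void start(Map<String, String> props) {
    // Open a connection to the destination system here, using the task configuration.
  }

  @Override
  public void put(Collection<SinkRecord> records) {
    // Buffer records; the data does not need to be fully written before returning.
    buffer.addAll(records);
  }

  @Override
  public void flush(Map<TopicPartition, OffsetAndMetadata> offsets) {
    // Push the buffered batch to the destination system and block until it is
    // acknowledged, so the framework can safely commit the corresponding offsets.
    buffer.clear();
  }

  @Override
  public void stop() {
    // Close the connection to the destination system.
  }

  @Override
  public String version() {
    return "0.0.1";
  }
}
```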
The `SinkRecords` contain essentially the same information as `SourceRecords`: Kafka topic, partition, and offset, plus the event key and value. The `flush()` method is used during the offset commit process, which allows tasks to recover from failures and resume from a safe point such that no events will be missed. The method should push any outstanding data to the destination system and then block until the write has been acknowledged. The `offsets` parameter can often be ignored, but is useful in some cases where implementations want to store offset information in the destination store to provide exactly-once delivery. For example, an HDFS connector could do this and use atomic move operations to make sure the `flush()` operation atomically commits the data and offsets to a final location in HDFS. Internally, `SinkTask` uses a Kafka consumer to poll data. The consumer instances used in tasks for a connector belong to the same consumer group. Task reconfiguration or failures will trigger a rebalance of the consumer group. During a rebalance, the topic partitions will be reassigned to the new set of tasks. For more detail on Kafka consumer rebalancing, see the [Consumer](../clients/consumer.md#kafka-consumer) section. Note that the consumer is single threaded, so you should make sure that `put()` or `flush()` does not take longer than the consumer session timeout. Otherwise, the consumer will be kicked out of the group, which triggers a rebalancing of partitions that stops all other tasks from making progress until the rebalance completes. To ensure that the resources are properly released and allocated during rebalance, `SinkTask` provides two additional methods: `close()` and `open()`, which are tied to the underlying rebalance callbacks of the `KafkaConsumer` that is driving the `SinkTask`. The `close()` method is used to close writers for partitions assigned to the `SinkTask`. This method will be called before a consumer rebalance operation starts and after the `SinkTask` stops fetching data. After `close()` is called, Connect will not write any records to the task until a new set of partitions has been opened. The `close()` method has access to all topic partitions assigned to the `SinkTask` before the rebalance starts. In general, Confluent recommends closing writers for all topic partitions and ensuring that the state for all topic partitions is properly maintained. However, you can choose to close writers for a subset of topic partitions in your implementation. In this case, you need to carefully reason about the state before and after the rebalance in order to achieve the desired delivery guarantee. The `open()` method is used to create writers for newly assigned partitions in case of a consumer rebalance. This method will be called after partition re-assignment completes and before the `SinkTask` starts fetching data. Note that any errors raised from `close()` or `open()` will cause the task to stop and report a failure status, and the corresponding consumer instance to close. This consumer shutdown triggers a rebalance, and topic partitions for this task will be reassigned to other tasks of this connector. ## Separate principals Within the Connect worker configuration, all properties having a prefix of `producer.` and `consumer.` are applied to all source and sink connectors created in the worker. The `admin.` prefix is used for error reporting in sink connectors. The following describes how these prefixes are used: * The `consumer.` prefix controls consumer behavior for sink connectors.
* The `producer.` prefix controls producer behavior for source connectors. * Both the `producer.` and `admin.` prefixes control producer and client behavior for sink connector error reporting. You can override these properties for individual connectors using the `producer.override.`, `consumer.override.`, and `admin.override.` prefixes. This includes overriding the worker service principal configuration to create separate service principals for each connector. Overrides are disabled by default. They are enabled using the `connector.client.config.override.policy` worker property. This property sets the per-connector overrides the worker permits. The out-of-the-box (OOTB) options for the override policy are: * `connector.client.config.override.policy=None` : Default. Does not allow any configuration overrides. * `connector.client.config.override.policy=Principal` : Allows overrides for the `security.protocol`, `sasl.jaas.config`, and `sasl.mechanism` configuration properties, using the `producer.override.`, `consumer.override.`, and `admin.override.` prefixes. * `connector.client.config.override.policy=All` : Allows overrides for all configuration properties using the `producer.override.`, `consumer.override.`, and `admin.override.` prefixes. If your Kafka broker supports client authentication over SSL, you can configure a separate principal for the worker and the connectors. In this case, you need to [generate a separate certificate](../security/security_tutorial.md#generating-keys-certs) for each of them and install them in separate keystores. The key Connect configuration differences are the keystore location, keystore password, and key password, which are unique to each principal; the following sketch shows the relevant properties (paths and passwords are illustrative placeholders):
```properties
# Worker principal, set in the worker configuration file
security.protocol=SSL
ssl.keystore.location=/var/private/ssl/kafka.worker.keystore.jks
ssl.keystore.password=worker-keystore-password
ssl.key.password=worker-key-password

# Connector principal, set in each connector configuration
# (requires connector.client.config.override.policy=All on the worker)
producer.override.ssl.keystore.location=/var/private/ssl/kafka.connector.keystore.jks
producer.override.ssl.keystore.password=connector-keystore-password
producer.override.ssl.key.password=connector-key-password
```
### Worker configuration properties file Regardless of the mode used, Kafka Connect workers are configured by passing a worker configuration properties file as the first parameter. For example: ```bash bin/connect-distributed worker.properties ``` Sample worker configuration properties files are included with Confluent Platform to help you get started. The following list shows the location for Avro sample files: * `etc/schema-registry/connect-avro-distributed.properties` * `etc/schema-registry/connect-avro-standalone.properties` Use one of these files as a starting point. These files contain the necessary configuration properties to use the Avro converters that integrate with Schema Registry. They are configured to work well with Kafka and Schema Registry services running locally. They do not require running more than a single broker, making it easy for you to test Kafka Connect locally. The example configuration files can also be modified for production deployments by using the correct hostnames for Kafka and Schema Registry and acceptable (or default) values for the internal topic replication factor. For a list of worker configuration properties, see [Kafka Connect Worker Configuration Properties](/platform/current/connect/references/allconfigs.html). ### Producer and consumer overrides You may need to override default settings, other than those described in the previous section. The following two examples show when this might be required. **Worker override example** Consider a standalone process that runs a log file connector. For the logs being collected, you might prefer low-latency, best-effort delivery. That is, when there are connectivity issues, minimal data loss may be acceptable for your application in order to avoid data buffering on the client. This keeps log collection as lightweight as possible.
To override [producer configuration properties](/platform/current/installation/configuration/producer-configs.html) and [consumer configuration properties](/platform/current/installation/configuration/consumer-configs.html) for all connectors controlled by the worker, you prefix worker configuration properties with `producer.` or `consumer.` as shown in the following example: ```properties producer.retries=1 consumer.max.partition.fetch.bytes=10485760 ``` The previous example overrides the default producer `retries` property to retry sending messages only one time. The consumer override increases the default amount of data fetched from a partition per request to 10 MB. These configuration changes are applied to all connectors controlled by the worker. Be careful making any changes to these settings when running distributed mode workers. **Per-connector override example** By default, the producers and consumers used for connectors are created using the same properties that Connect uses for its own internal topics. This means that the same Kafka principal must be able to read and write to all the internal topics and all of the topics used by the connectors. You may want the producers and consumers used for connectors to use a different Kafka principal. It is possible for connector configurations to override worker properties used to create producers and consumers. These are prefixed with `producer.override.` and `consumer.override.`. For more information about per-connector overrides, see [Override the Worker Configuration](/platform/current/connect/references/allconfigs.html#override-the-worker-configuration). For detailed information about producers and consumers, see [Kafka Producer](/platform/current/clients/producer.html) and [Kafka Consumer](/platform/current/clients/consumer.html). For a list of configuration properties, see [producer configuration properties](/platform/current/installation/configuration/producer-configs.html) and [consumer configuration properties](/platform/current/installation/configuration/consumer-configs.html). ### How to run it 1. Start ZooKeeper. ```none sudo zookeeper-server-start ${CONFLUENT_HOME}/etc/kafka/zookeeper.properties ``` 2. Start Kafka with the `etc/kafka/server.properties` you just configured. ```none kafka-server-start ${CONFLUENT_HOME}/etc/kafka/server.properties ``` To learn more, see [Start Confluent Platform](../../installation/installing_cp/zip-tar.md#start-cp-command-line) and [how to install and run Confluent Platform](../../installation/overview.md#installation). You can configure clients like Schema Registry, Control Center, ksqlDB, and Connect to talk to Kafka and MDS over HTTPS in their respective properties files. ![image](images/security-rbac-mtls.png) ### Rolling restart If you need to do software upgrades, broker configuration updates, or cluster maintenance, then you will need to restart all the brokers in your Kafka cluster. To do this, you can do a rolling restart by restarting one broker at a time. Restarting the brokers one at a time provides high availability by avoiding downtime for end users. Some considerations to avoid downtime include: * Use [Confluent Control Center](https://docs.confluent.io/control-center/current/overview.html) to monitor broker status during the rolling restart. * Because one replica is unavailable while a broker is restarting, clients will not experience downtime if the number of remaining in sync replicas is greater than the configured `min.insync.replicas`.
* Run brokers with `controlled.shutdown.enable=true` to migrate topic partition leadership before the broker is stopped. * The active controller should be the last broker you restart. This is to ensure that the active controller is not moved on each broker restart, which would slow down the restart. Before starting a rolling restart: 1. Verify your cluster is healthy and there are no under replicated partitions. In Control Center, navigate to **Overview** of the cluster, and observe the **Under replicated partitions** value. If there are under replicated partitions, investigate why before doing a rolling restart. 2. Identify which Kafka broker in the cluster is the active controller. The active controller will report `1` for the metric `kafka.controller:type=KafkaController,name=ActiveControllerCount` and the remaining brokers will report `0`. Use the following workflow for the rolling restart: 1. Connect to one broker, being sure to leave the active controller for last, and stop the broker process gracefully. Do not send a `kill -9` command. Wait until the broker has completely shut down. ```none bin/kafka-server-stop ``` 2. If you are performing a [software upgrade](../installation/upgrade.md#upgrade) or making any system configuration changes, follow those steps on this broker. (If you are just changing broker properties, you could optionally do this before you stop the broker.) 3. Start the broker back up, passing in the broker properties file. ```none bin/kafka-server-start etc/kafka/broker.properties ``` 4. Wait until that broker completely restarts and is caught up before proceeding to restart the next broker in your cluster. Waiting is important to ensure that leader failover happens as cleanly as possible. To know when the broker is caught up, in Control Center, navigate to **Overview** of the cluster, and observe the **Under replicated partitions** value. During broker restart, this number increases because data will not be replicated to topic partitions that reside on the restarting broker. ![image](kafka/underreplicated-down.png) After a broker restarts and is caught up, this number goes back to its original value before restart, which should be `0` in a healthy cluster. ![image](kafka/underreplicated-recovered.png) 5. Repeat the above steps on each broker until you have restarted all brokers but the active controller. Now you can restart the active controller. ### Limiting bandwidth usage during data migration Kafka lets you apply a throttle to replication traffic, setting an upper bound on the bandwidth used to move replicas from machine to machine. This is useful when rebalancing a cluster, bootstrapping a new broker, or adding or removing brokers, as it limits the impact these data-intensive operations will have on users. There are three interfaces that can be used to engage a throttle. The simplest, and safest, is to apply a throttle when invoking [confluent-rebalancer](../clusters/rebalancer/quickstart.md#rebalancer) or `kafka-reassign-partitions`, but `kafka-configs` can also be used to view and alter the throttle values directly. So, for example, if you were to execute a rebalance with the below command, it would move partitions at no more than 50 MBps. ```none bin/kafka-reassign-partitions --bootstrap-server myhost:9092 --execute --reassignment-json-file bigger-cluster.json --throttle 50000000 ``` When you execute this script you will see the throttle engage: ```none … The throttle limit was set to 50000000 B/s Successfully started reassignment of partitions.
``` Should you wish to alter the throttle during a rebalance, say to increase the throughput so that it completes more quickly, you can do this by re-running the execute command, passing the same `reassignment-json-file`: ```none bin/kafka-reassign-partitions --bootstrap-server localhost:9092 --execute --reassignment-json-file bigger-cluster.json --throttle 700000000 There is an existing assignment running. The throttle limit was set to 700000000 B/s ``` After the rebalance completes the administrator can check the status of the rebalance using the `--verify` option. If the rebalance has completed, and `--verify` is run, the throttle will be removed. It is important that administrators remove the throttle in a timely manner after rebalancing completes by running the command with the `--verify` option. Failure to do so could cause regular replication traffic to be throttled. When the `--verify` option is executed, and the reassignment has completed, the script will confirm that the throttle was removed: ```none bin/kafka-reassign-partitions --bootstrap-server localhost:9092 --verify --reassignment-json-file bigger-cluster.json Status of partition reassignment: Reassignment of partition [my-topic,1] completed successfully Reassignment of partition [my-topic,0] completed successfully Throttle was removed. ``` The administrator can also validate the assigned configs using `kafka-configs`. There are two pairs of throttle configurations used to manage the throttling process. The first pair is the throttle value itself, which is configured at the broker level using the dynamic properties: ```none leader.replication.throttled.rate follower.replication.throttled.rate ``` The second pair is an enumerated set of throttled replicas: ```none leader.replication.throttled.replicas follower.replication.throttled.replicas ``` which are configured per topic. All four config values are automatically assigned by `kafka-reassign-partitions` (discussed below). The throttle mechanism works by measuring the received and transmitted rates, for partitions in the `replication.throttled.replicas` lists, on each broker. These rates are compared to the `replication.throttled.rate` config to determine if a throttle should be applied. The rate of throttled replication (used by the throttle mechanism) is recorded in the below JMX metrics, so they can be externally monitored. ```none MBean:kafka.server:type=LeaderReplication,name=byte-rate MBean:kafka.server:type=FollowerReplication,name=byte-rate ``` To view the throttle limit configuration: ```none bin/kafka-configs --describe --bootstrap-server localhost:9092 --entity-type brokers Configs for brokers '2' are leader.replication.throttled.rate=1000000,follower.replication.throttled.rate=1000000 Configs for brokers '1' are leader.replication.throttled.rate=1000000,follower.replication.throttled.rate=1000000 ``` This shows the throttle applied to both the leader and follower side of the replication protocol. By default both sides are assigned the same throttled throughput value. To view the list of throttled replicas: ```none bin/kafka-configs --describe --bootstrap-server localhost:9092 --entity-type topics Configs for topic 'my-topic' are leader.replication.throttled.replicas=1:102,0:101,follower.replication.throttled.replicas=1:101,0:102 ``` Here we see the leader throttle is applied to partition 1 on broker 102 and partition 0 on broker 101. Likewise the follower throttle is applied to partition 1 on broker 101 and partition 0 on broker 102.
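If a throttle ever needs to be set or adjusted outside of a reassignment, the broker-level rate can also be changed directly with `kafka-configs --alter`, as noted below; a minimal sketch, in which the broker ID `101` and the 10 MB/s rate are illustrative values:

```none
bin/kafka-configs --bootstrap-server localhost:9092 --alter \
  --entity-type brokers --entity-name 101 \
  --add-config leader.replication.throttled.rate=10000000,follower.replication.throttled.rate=10000000
```

The same tool's `--delete-config` option can remove these overrides if a throttle is ever left in place after a rebalance.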
By default `kafka-reassign-partitions` will apply the leader throttle to all replicas that exist before the rebalance, any one of which might be leader. It will apply the follower throttle to all move destinations. So if there is a partition with replicas on brokers `101,102`, being reassigned to `102,103`, a leader throttle, for that partition, would be applied to `101,102` (possible leaders during rebalance) and a follower throttle would be applied to `103` only (the move destination). If required, you can also use the `--alter` switch on `kafka-configs` to alter the throttle configurations manually. Some care should be taken when using throttled replication. In particular: 1. Throttle Removal: The throttle should be removed in a timely manner after reassignment completes (by running `confluent-rebalancer finish` or `kafka-reassign-partitions --verify`). 1. Ensuring Progress: If the throttle is set too low, in comparison to the incoming write rate, it is possible for replication to not make progress. This occurs when: ```none max(BytesInPerSec) > throttle ``` Where BytesInPerSec is the metric that monitors the write throughput of producers into each broker. The administrator can monitor whether replication is making progress, during the rebalance, using the metric: ```none kafka.server:type=FetcherLagMetrics,name=ConsumerLag,clientId=([-.\w]+),topic=([-.\w]+),partition=([0-9]+) ``` The lag should constantly decrease during replication. If the metric does not decrease, the administrator should increase the throttle throughput as described above. 1. Avoiding long delays during replication: The throttled throughput should be large enough that replicas cannot be starved for extended periods. A good, conservative rule of thumb is to keep the throttle above `#brokers MB/s`, where `#brokers` is the number of brokers in your cluster. Administrators wishing to use lower throttle values can tune the response size used for replication based on the relation: ```none Worst-Case-Delay = replica.fetch.response.max.bytes x #brokers / throttle ``` Here, the admin should tune the throttle and/or `replica.fetch.response.max.bytes` appropriately to ensure the delay is never larger than `replica.lag.time.max.ms` (as it is possible for some partitions, particularly smaller ones, to enter the ISR before the rebalance completes) or the outer throttle window: `(replication.quota.window.size.seconds x replication.quota.window.num)` or the connection timeout `replica.socket.timeout.ms`. As the default for `replica.fetch.response.max.bytes` is 10MB and the delay should be less than 10s (`replica.lag.time.max.ms`), this leads to the rule of thumb that throttles should never be less than `#brokers` MBps. To better understand the relation, let’s consider an example. Say we have a 5-node cluster, with default settings. We set a throttle of 10 MBps, cluster-wide, and add a new broker. The bootstrapping broker would replicate from the other 5 brokers with requests of size 10MB (default `replica.fetch.response.max.bytes`). The worst case payload, arriving at the same time on the bootstrapping broker, is 50MB. In this case the follower throttle, on the bootstrapping broker, would delay subsequent replication requests for (50MB / 10 MBps) = 5s, which is acceptable. However, if we set the throttle to 1 MBps, the worst-case delay would be 50s, which is not acceptable. # Quick Start for Confluent REST Proxy for Kafka Use the following Quick Start instructions to get up and running with Confluent REST Proxy for Apache Kafka®.
Prerequisites : - [Confluent Platform](../installation/index.md#installation-overview) You should configure and start a KRaft controller and a Kafka broker before you start REST Proxy. For detailed instructions on how to configure and run Confluent Platform, see [Tutorial: Set Up a Multi-Broker Kafka Cluster](../get-started/tutorial-multi-broker.md#basics-multi-broker-setup). You will only need to run one Kafka broker and one KRaft controller for this quick start. To start REST Proxy with the Confluent CLI, run: ```bash confluent local services kafka-rest start ``` To manually start each service in its own terminal, run instead: ```bash bin/kafka-server-start ./etc/kafka/controller.properties bin/kafka-server-start ./etc/kafka/broker.properties bin/kafka-rest-start ./etc/kafka-rest/kafka-rest.properties ``` ## Add the uberjar to ksqlDB server In order for ksqlDB to be able to load your UDFs, they need to be compiled from classes into an uberjar. Run the following command to build an uberjar: ```bash gradle shadowJar ``` You should now have a directory, `extensions`, with a file named `example-udfs-0.0.1.jar` in it. In order to use the uberjar, you need to make it available to the ksqlDB server. Create the following `docker-compose.yml` file: ```yaml version: '2' services: broker: image: confluentinc/cp-enterprise-kafka:8.1.0 hostname: broker container_name: broker ports: - "29092:29092" environment: KAFKA_BROKER_ID: 1 KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092 KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0 KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1 KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1 schema-registry: image: confluentinc/cp-schema-registry:8.1.0 hostname: schema-registry container_name: schema-registry depends_on: - broker ports: - "8081:8081" environment: SCHEMA_REGISTRY_HOST_NAME: schema-registry ksqldb-server: image: confluentinc/ksqldb-server:8.1.0 hostname: ksqldb-server container_name: ksqldb-server depends_on: - broker - schema-registry ports: - "8088:8088" volumes: - "./extensions/:/opt/ksqldb-udfs" environment: KSQL_LISTENERS: "http://0.0.0.0:8088" KSQL_BOOTSTRAP_SERVERS: "broker:9092" KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081" KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true" KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true" # Configuration for UDFs KSQL_KSQL_EXTENSION_DIR: "/opt/ksqldb-udfs" KSQL_KSQL_FUNCTIONS_FORMULA_BASE_VALUE: 5 ksqldb-cli: image: confluentinc/ksqldb-cli:8.1.0 container_name: ksqldb-cli depends_on: - broker - ksqldb-server entrypoint: /bin/sh tty: true ``` Notice that: - A volume is mounted from the local `extensions` directory (containing your uberjar) to the container `/opt/ksqldb-udfs` directory. The latter can be any directory that you like. This effectively puts the uberjar on the ksqlDB server’s file system. - The environment variable `KSQL_KSQL_EXTENSION_DIR` is configured to the same path that was set for the container in the volume mount. This is the path where ksqlDB looks for UDFs. - The environment variable `KSQL_KSQL_FUNCTIONS_FORMULA_BASE_VALUE` is set to `5`. Recall that in the UDF example, the function loads an external parameter named `ksql.functions.formula.base.value`. All `KSQL_` environment variables are converted automatically to server configuration properties, which is where UDF parameters are looked up.
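With the uberjar mounted, you can bring the stack up and confirm that the ksqlDB server picked the function up from the extension directory; a quick check, assuming the `docker-compose.yml` above and that the example UDF is registered under the name `FORMULA` (as the `ksql.functions.formula.base.value` parameter suggests):

```bash
# Start the containers defined in docker-compose.yml
docker-compose up -d

# Open the ksqlDB CLI against the server container
docker exec -it ksqldb-cli ksql http://ksqldb-server:8088
```

From the CLI prompt, `SHOW FUNCTIONS;` should list the UDF, and `DESCRIBE FUNCTION FORMULA;` displays its signature and parameters.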
### Configuring Kafka Encrypted Communication This configuration enables ksqlDB to connect to a Kafka cluster over SSL, with a user supplied trust store: ```properties security.protocol=SSL ssl.truststore.location=/etc/kafka/secrets/kafka.client.truststore.jks ssl.truststore.password=confluent ``` The exact settings will vary depending on the security settings of the Kafka brokers, and how your SSL certificates are signed. For full details, and instructions on how to create suitable trust stores, please refer to the [Security Guide](../../../security/overview.md#security). To use separate trust stores for encrypted communication with Kafka and external communication with ksqlDB clients, prefix the SSL truststore configs with `ksql.streams.`: ```properties security.protocol=SSL ksql.streams.ssl.truststore.location=/etc/kafka/secrets/kafka.client.truststore.jks ksql.streams.ssl.truststore.password=confluent ``` ### Configure Kafka Authentication This configuration enables ksqlDB to connect to a secure Kafka cluster using PLAIN SASL, where the SSL certificates have been signed by a CA trusted by the default JVM trust store. ```properties security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=\ org.apache.kafka.common.security.plain.PlainLoginModule required \ username="" \ password=""; ``` The exact settings will vary depending on what SASL mechanism your Kafka cluster is using and how your SSL certificates are signed. For more information, see the [Security Guide](../../../security/overview.md#security). ### Replicated topic with Avro schema causes errors? Confluent Replicator renames topics during replication, and if there are associated Avro schemas, they aren’t automatically matched with the renamed topics. In the ksqlDB CLI, the `PRINT` statement for a replicated topic works, which shows that the Avro schema ID exists in Schema Registry, and ksqlDB can deserialize the Avro message. But `CREATE STREAM` fails with a deserialization error: ```bash CREATE STREAM pageviews_original (viewtime bigint, userid varchar, pageid varchar) WITH (kafka_topic='pageviews.replica', value_format='AVRO'); [2018-06-21 19:12:08,135] WARN task [1_6] Skipping record due to deserialization error. topic=[pageviews.replica] partition=[6] offset=[1663] (org.apache.kafka.streams.processor.internals.RecordDeserializer:86) org.apache.kafka.connect.errors.DataException: pageviews.replica at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:97) at io.confluent.ksql.serde.connect.KsqlConnectDeserializer.deserialize(KsqlConnectDeserializer.java:48) at io.confluent.ksql.serde.connect.KsqlConnectDeserializer.deserialize(KsqlConnectDeserializer.java:27) ``` The solution is to register schemas manually against the replicated subject name for the topic. ### DELIMITED | Feature | Supported | |---------------------------------------------------------------------------------------------|-------------| | As value format | Yes | | As key format | Yes | | Multi-Column Keys | Yes | | [Schema Registry required](../operate-and-deploy/installation/server-config/avro-schema.md) | No | | [Schema inference](/reference/server-configuration#ksqlpersistencedefaultformatkey) | No | | [Single field wrapping](#ksqldb-serialization-formats-single-field-unwrapping) | No | | [Single field unwrapping](#ksqldb-serialization-formats-single-field-unwrapping) | Yes | The `DELIMITED` format supports comma-separated values.
You can use other delimiter characters by specifying the `KEY_DELIMITER` and/or `VALUE_DELIMITER` properties when you use `FORMAT='DELIMITED'` in a WITH clause. Only a single character is valid as a delimiter. The default is the comma character. For space- and tab-delimited values, use the special values `SPACE` or `TAB`, not an actual space or tab character. The delimiter is a Unicode character, as defined in `java.lang.Character`. For example, the smiley-face character works: ```sql CREATE STREAM delim_stream (f1 STRING, f2 STRING) WITH (KAFKA_TOPIC='delim', FORMAT='DELIMITED', VALUE_DELIMITER='☺', ...); ``` The serialized object should be a Kafka-serialized string, which will be split into columns. For example, given a SQL statement such as: ```sql CREATE STREAM x (ORGID BIGINT KEY, ID BIGINT KEY, NAME STRING, AGE INT) WITH (FORMAT='DELIMITED', ...); ``` ksqlDB splits a key of `120,21` and a value of `bob,49` into the four fields (two keys and two values) with `ORGID KEY` of `120`, `ID KEY` of `21`, `NAME` of `bob` and `AGE` of `49`. This data format supports all SQL [data types](sql/data-types.md#ksqldb-reference-data-types) except `ARRAY`, `MAP` and `STRUCT`. - `TIMESTAMP` typed data is serialized as a `long` value indicating the Unix epoch time in milliseconds. - `TIME` typed data is serialized as an `int` value indicating the number of milliseconds since the beginning of the day. - `DATE` typed data is serialized as an `int` value indicating the number of days since the Unix epoch. - `BYTES` typed data is serialized as a Base64-encoded string value. ### KAFKA | Feature | Supported | |---------------------------------------------------------------------------------------------|-------------| | As value format | Yes | | As key format | Yes | | Multi-Column Keys | No | | [Schema Registry required](../operate-and-deploy/installation/server-config/avro-schema.md) | No | | [Schema inference](/reference/server-configuration#ksqlpersistencedefaultformatkey) | No | | [Single field wrapping](#ksqldb-serialization-formats-single-field-unwrapping) | No | | [Single field unwrapping](#ksqldb-serialization-formats-single-field-unwrapping) | Yes | The `KAFKA` format supports `INT`, `BIGINT`, `DOUBLE` and `STRING` primitives that have been serialized using Kafka’s standard set of serializers. The format is designed primarily to support primitive message keys. It can be used as a value format, though certain operations aren’t supported when this is the case. Unlike some other formats, the `KAFKA` format does not perform any type coercion, so it’s important to correctly match the field type to the underlying serialized form to avoid deserialization errors. The table below details the SQL types the format supports, including details of the associated Kafka Java Serializer, Deserializer, and Connect Converter classes you would need to use to write the key to Kafka, read the key from Kafka, or configure Kafka Connect to work with the `KAFKA` format, respectively.
| SQL field type | Kafka type | Kafka serializer | Kafka deserializer | Connect converter | |------------------|--------------------------------|-----------------------------------------------------------|-------------------------------------------------------------|-------------------------------------------------------| | INT / INTEGER | A 32-bit signed integer | `org.apache.kafka.common.serialization.IntegerSerializer` | `org.apache.kafka.common.serialization.IntegerDeserializer` | `org.apache.kafka.connect.converters.IntegerConverter` | | BIGINT | A 64-bit signed integer | `org.apache.kafka.common.serialization.LongSerializer` | `org.apache.kafka.common.serialization.LongDeserializer` | `org.apache.kafka.connect.converters.LongConverter` | | DOUBLE | A 64-bit floating point number | `org.apache.kafka.common.serialization.DoubleSerializer` | `org.apache.kafka.common.serialization.DoubleDeserializer` | `org.apache.kafka.connect.converters.DoubleConverter` | | STRING / VARCHAR | A UTF-8 encoded text string | `org.apache.kafka.common.serialization.StringSerializer` | `org.apache.kafka.common.serialization.StringDeserializer` | `org.apache.kafka.connect.storage.StringConverter` | Because the format supports only primitive types, you can only use it when the schema contains a single field. For example, if your Kafka messages have a `long` key, you can make them available to ksqlDB by using a statement like: ```sql CREATE STREAM USERS (ID BIGINT KEY, NAME STRING) WITH (VALUE_FORMAT='JSON', ...); ``` If you integrate ksqlDB with [Confluent Schema Registry](../../schema-registry/index.md#schemaregistry-intro), and your ksqlDB application uses a compatible value format (Avro, JSON_SR, or Protobuf), you can just supply the key column, and ksqlDB loads the value columns from Schema Registry: ```sql CREATE STREAM USERS (ID BIGINT KEY) WITH (VALUE_FORMAT='JSON_SR', ...); ``` The key column must be supplied, because ksqlDB supports only keys in `KAFKA` format. ### Protobuf | Feature | Supported | |-----------------------------------------------------------------------------------------------------------------------------------|--------------------------------------| | As value format | Yes | | As key format | Yes | | Multi-Column Keys | Yes | | [Schema Registry required](../operate-and-deploy/installation/avro-schema.md#ksqldb-installation-configure-serialization-formats) | `PROTOBUF`: Yes, `PROTOBUF_NOSR`: No | | [Schema inference](server-configuration.md#ksqldb-reference-server-configuration-persistence-default-format-key) | `PROTOBUF`: Yes, `PROTOBUF_NOSR`: No | | [Single field wrapping](#ksqldb-serialization-formats-single-field-unwrapping) | Yes | | [Single field unwrapping](#ksqldb-serialization-formats-single-field-unwrapping) | No | Protobuf handles `null` values differently than AVRO and JSON. Protobuf doesn’t have the concept of a `null` value, so the conversion between PROTOBUF and Java (Kafka Connect) objects is undefined. Usually, Protobuf resolves a “missing field” to the default value of its type. - **String:** the default value is the empty string. - **Byte:** the default value is empty bytes. - **Bool:** the default value is `false`. - **Numeric type:** the default value is zero. - **Enum:** the default value is the first defined enum value, which must be zero. - **Message field:** the field is not set. Its exact value is language-dependent. See the generated code guide for details.
To enable alternative representations for `null` values in protobuf, protobuf-specific properties can be passed to `CREATE` statements. For example, the following `CREATE` statement will create a protobuf schema that wraps all primitive types into the corresponding standard wrappers (e.g. `google.protobuf.StringValue` for `string`). ```sql CREATE STREAM USERS (ID STRING KEY, i INTEGER, s STRING) WITH (VALUE_FORMAT='PROTOBUF', VALUE_PROTOBUF_NULLABLE_REPRESENTATION='WRAPPER'); ``` This way, `null` can be distinguished from default values. Similarly, when `VALUE_PROTOBUF_NULLABLE_REPRESENTATION` is set to `OPTIONAL`, all fields in protobuf will be declared optional, also allowing `null` primitive fields to be distinguished from default values. The same property values can be used with the `KEY_PROTOBUF_NULLABLE_REPRESENTATION` property to customize the protobuf serialization of the key. ## Replicated topic with Avro schema causes errors The Confluent Replicator renames topics during replication. If there are associated Avro schemas, they are not automatically matched with the renamed topics after replication completes. Using the `PRINT` statement for a replicated topic shows that the Avro schema ID exists in the Schema Registry. ksqlDB can deserialize the Avro message, but the `CREATE STREAM` statement fails with a deserialization error. For example: ```sql CREATE STREAM pageviews_original (viewtime bigint, userid varchar, pageid varchar) WITH (kafka_topic='pageviews.replica', value_format='AVRO'); ``` Example output with a deserialization error: ```none [2018-06-21 19:12:08,135] WARN task [1_6] Skipping record due to deserialization error. topic=[pageviews.replica] partition=[6] offset=[1663] (org.apache.kafka.streams.processor.internals.RecordDeserializer:86) org.apache.kafka.connect.errors.DataException: pageviews.replica at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:97) at io.confluent.ksql.serde.connect.KsqlConnectDeserializer.deserialize(KsqlConnectDeserializer.java:48) at io.confluent.ksql.serde.connect.KsqlConnectDeserializer.deserialize(KsqlConnectDeserializer.java:27) ``` The solution is to register Avro schemas manually against the replicated subject name for the topic. ### Creating a mirror topic A mirror topic is a read-only topic that reflects all the data and metadata in another topic. Creating a mirror topic with the CLI uses the `kafka-mirrors` tool. Once a mirror topic is created, the mirror automatically begins fetching data from the source topic. For more information, see [Mirror Topics](https://docs.confluent.io/platform/current/multi-dc-deployments/cluster-linking/mirror-topics-cp.html). **Example Command** ```bash kafka-mirrors --create --mirror-topic example-topic \ --link demo-link \ --bootstrap-server localhost:9093 ``` **Example Output** ```bash Created topic example-topic. ``` To create a mirror topic, use `kafka-mirrors --create` along with [bootstrap-server](#bootstrap-cluster-links) and the following flags. `--mirror-topic` : (Required) The name of the mirror topic to create. This must match exactly the name of the source topic to mirror over the cluster link. * Type: string `--link` : (Required) The name of the cluster link used to pull data from the source topic. * Type: string `--command-config` : Property file containing configurations to be passed to the [AdminClient](../../installation/configuration/admin-configs.md#cp-config-admin).
For example, with security credentials for authorization and authentication. The following are optional configurations when creating a mirror topic: `--config` : A comma-separated list of configs to override when creating the mirror topic. Each config to override should be specified as `name=value`. For more information about which configurations can be set on a mirror topic, see [Configurations](https://docs.confluent.io/platform/current/multi-dc-deployments/cluster-linking/mirror-topics-cp.html#configurations) in Mirror Topics. * Type: string `--replication-factor` : The replication factor of the mirror topic being created. If not supplied, *defaults to the destination cluster’s default*, not the source topic’s replication factor. * Type: string `--source-topic` : The name of the source topic to mirror. Required if the cluster link has a prefix configured. To learn more, see [Prefixing Mirror Topics and Consumer Group Names](https://docs.confluent.io/platform/current/multi-dc-deployments/cluster-linking/mirror-topics-cp.html#prefixing-mirror-topics-and-consumer-group-names). * Type: string You must have `ALTER CLUSTER` authorization to create a mirror topic. #### IMPORTANT Changing the configuration of a topic does not change the replica assignment for the topic partition. Changing the replica placement of a topic configuration must be followed by a partition reassignment. The [confluent-rebalancer](../clusters/rebalancer/configuration-options.md#rebalancer-config-options) command line tool supports reassignment that also accounts for replica placement constraints. To learn more, see [Quick Start for Auto Data Balancing in Confluent Platform](../clusters/rebalancer/quickstart.md#rebalancer). For example, run the commands below to start a reassignment that matches the topic’s replica placement constraints. Note you should use Confluent Platform 5.5 or newer, which now includes `--topics` and `--exclude-internal-topics` flags to limit the set of topics that are eligible for reassignment. This will decrease the overall rebalance scope and therefore time. `--replica-placement-only` can be used to perform reassignment only on partitions that do not satisfy the replica placement constraints. ```none confluent-rebalancer execute --bootstrap-server kafka-west-1:9092 --replica-placement-only --throttle 10000000 --verbose ``` Run this command to monitor the status for the reassignment: ```none confluent-rebalancer status --bootstrap-server kafka-west-1:9092 ``` Run this command to finish the reassignment: ```none confluent-rebalancer finish --bootstrap-server kafka-west-1:9092 ``` For more information and examples, see [Quick Start for Auto Data Balancing in Confluent Platform](../clusters/rebalancer/quickstart.md#rebalancer). ### Consumer lag Replicator has an embedded consumer that reads data from the origin cluster, and it commits its offsets only after the connect worker’s producer has committed the data to the destination cluster (configure the frequency of commits with the parameter `offset.flush.interval.ms`). You can monitor the consumer lag of Replicator’s embedded consumer in the origin cluster (for Replicator instances that copy data from `dc1` to `dc2`, the origin cluster is `dc1`). The ability to monitor Replicator’s consumer lag is enabled when it is configured with `offset.topic.commit=true` (`true` by default), which allows Replicator to commit its own consumer offsets to the origin cluster `dc1` after the messages have been written to the destination cluster. 1. 
For Replicator copying from `dc1` to `dc2`: Select `dc1` (origin cluster) from the menu on the left and then select **Consumers**. Verify that there are two consumer groups, one for each Replicator instance running from `dc1` to `dc2`: `replicator-dc1-to-dc2-topic1` and `replicator-dc1-to-dc2-topic2`. Replicator’s consumer lag information is available in Control Center and `kafka-consumer-groups`, but it is not available via JMX. 1. Click on `replicator-dc1-to-dc2-topic1` to view Replicator’s consumer lag in reading `topic1` and `_schemas`. This view is equivalent to: ```text docker-compose exec broker-dc1 kafka-consumer-groups --bootstrap-server broker-dc1:29091 --describe --group replicator-dc1-to-dc2-topic1 ``` ![image](images/c3-consumer-lag-dc1-topic1.png) 2. Click on `replicator-dc1-to-dc2-topic2` to view Replicator’s consumer lag in reading `topic2` (equivalent to `docker-compose exec broker-dc1 kafka-consumer-groups --bootstrap-server broker-dc1:29091 --describe --group replicator-dc1-to-dc2-topic2`) ![image](images/c3-consumer-lag-dc1-topic2.png) 2. For Replicator copying from `dc1` to `dc2`: do not mistakenly try to monitor Replicator consumer lag in the destination cluster `dc2`. Control Center also shows the Replicator consumer lag for topics in `dc2` (i.e., `topic1`, `_schemas`, `topic2.replica`) but this does not mean that Replicator is consuming from them. The reason you see this consumer lag in `dc2` is that, by default, Replicator is configured with `offset.timestamps.commit=true`, which means Replicator commits the offset timestamps of its consumer group to the `__consumer_offsets` topic in the destination cluster `dc2`. In case of disaster recovery, this enables Replicator to resume where it left off when switching to the secondary cluster. 3. Do not confuse consumer lag with an MBean attribute called `records-lag` associated with Replicator’s embedded consumer. That attribute reflects whether Replicator’s embedded consumer can keep up with the original data production rate, but it does not take into account the replication lag incurred when producing the messages to the destination cluster. `records-lag` is measured in real time, and it is normal for this value to be `0.0`. ```text docker-compose exec connect-dc2 \ kafka-run-class kafka.tools.JmxTool \ --object-name "kafka.consumer:type=consumer-fetch-manager-metrics,partition=0,topic=topic1,client-id=replicator-dc1-to-dc2-topic1-0" \ --attributes "records-lag" \ --jmx-url service:jmx:rmi:///jndi/rmi://connect-dc2:9892/jmxrmi ``` ## Use Control Center to monitor replicators You can use Control Center to monitor the replicators in your current deployment: 1. Stop Replicator and brokers on both the origin and destination clusters. Press `Ctl-C` in each command window to stop the processes, but keep the windows open to make it easy to restart each one. 2. Activate the monitoring extension for Replicator by doing the following, as fully described in [Replicator monitoring extension](replicator-monitoring.md#replicator-monitoring-extension). - Add the full path to `replicator-rest-extension-.jar` to your CLASSPATH. - Add `rest.extension.classes=io.confluent.connect.replicator.monitoring.ReplicatorMonitoringExtension` to `my-examples/replication.properties`. 3. Uncomment or add the following lines to the Kafka configuration files for both the destination and origin, `my-examples/server_destination.properties` and `my-examples/server_origin.properties`, respectively.
The configuration for `confluent.metrics.reporter.bootstrap.servers` must point to `localhost` on port `9092` in both files, so you may need to edit one or both of these port numbers. (Searching on `confluent.metrics` will take you to these lines in the files.) ```none confluent.metrics.reporter.topic.replicas=1 metric.reporters=io.confluent.metrics.reporter.ConfluentMetricsReporter confluent.metrics.reporter.bootstrap.servers=localhost:9092 ``` - The first line indicates to Control Center that your deployment is in development mode, using a replication factor of `1`. - The other two lines enable metrics reporting on Control Center, and provide access to the Confluent internal topic that collects and stores the monitoring data. 4. Edit `etc/confluent-control-center/control-center-dev.properties` to add the following two lines that specify origin and destination bootstrap servers for Control Center, as is required for monitoring multiple clusters. (A convenient place to add these lines is near the top of the file under “Control Center Settings”, immediately after the line that specifies `confluent.controlcenter.id`.) ```bash # multi-cluster monitoring confluent.controlcenter.kafka.origin.bootstrap.servers=localhost:9082 confluent.controlcenter.kafka.destination.bootstrap.servers=localhost:9092 ``` 5. Restart the brokers on the destination and origin clusters with the same commands used above, for example: ```none ./bin/kafka-server-start my-examples/server_destination.properties ``` ```none ./bin/kafka-server-start my-examples/server_origin.properties ``` 6. Restart Replicator and the Connect worker with the same command as above. For example: ```none ./bin/replicator --cluster.id replicator --consumer.config my-examples/consumer.properties --producer.config my-examples/producer.properties --replication.config my-examples/replication.properties --whitelist 'test-topic' ``` 7. Launch Control Center with the following command. ```none ./bin/control-center-start etc/confluent-control-center/control-center-dev.properties ``` If no port is defined in `control-center-dev.properties`, Control Center runs by default on port `9021`, as described in [Control Center for Confluent Platform](https://docs.confluent.io/control-center/current/overview.html). This is the desired config for this deployment. 8. Open Control Center at [http://localhost:9021/](http://localhost:9021/) in your web browser. The clusters are rendered on Control Center with auto-generated names, based on your configuration. ![image](images/c3-replicators-multi-cluster.png) 9. (Optional) On Control Center, edit the cluster names to suit your use case, as described in [Origin and Destination clusters](https://docs.confluent.io/control-center/current/replicators.html#origin-and-destination-clusters) in “Replicators” in the Control Center User Guide. 10. On Control Center, select the destination cluster, click **Replicators** on the navigation panel, and use Control Center to monitor replication performance and drill down on source and replicated topics. ![image](images/c3-replicators-all.png) To see messages produced to both the original and replicated topic on Control Center, try out `kafka-producer-perf-test` in its own command window to auto-generate test data to `test-topic`.
```none kafka-producer-perf-test \ --producer-props bootstrap.servers=localhost:9082 \ --topic test-topic \ --record-size 1000 \ --throughput 1000 \ --num-records 3600000 ``` The command provides status output on messages sent, as shown: ```none 4999 records sent, 999.8 records/sec (0.95 MB/sec), 1.1 ms avg latency, 240.0 ms max latency. 5003 records sent, 1000.2 records/sec (0.95 MB/sec), 0.5 ms avg latency, 4.0 ms max latency. 5003 records sent, 1000.2 records/sec (0.95 MB/sec), 0.6 ms avg latency, 5.0 ms max latency. 5001 records sent, 1000.2 records/sec (0.95 MB/sec), 0.3 ms avg latency, 3.0 ms max latency. 5001 records sent, 1000.0 records/sec (0.95 MB/sec), 0.3 ms avg latency, 4.0 ms max latency. 5000 records sent, 1000.0 records/sec (0.95 MB/sec), 0.8 ms avg latency, 24.0 ms max latency. 5001 records sent, 1000.2 records/sec (0.95 MB/sec), 0.6 ms avg latency, 3.0 ms max latency. ... ``` Like before, you can consume these messages from the command line, using `kafka-console-consumer` to verify that the replica topic is receiving them: ```none ./bin/kafka-console-consumer --from-beginning --topic test-topic.replica --bootstrap-server localhost:9092 ``` You can also verify this on Control Center. Navigate to `test-topic` on the origin cluster to view messages on the original topic, and to `test-topic.replica` on the destination to view messages on the replicated topic. ![image](images/c3-replicator-topic-drilldown-messages.png) 11. To learn more about monitoring Replicators in Control Center, see [“Replicators” in Control Center User Guide](https://docs.confluent.io/control-center/current/replicators.html). 12. When you have completed your experiments with the tutorial, be sure to perform clean up as follows: - Stop any producers and consumers using `Ctl-C` in each command window. - Use `Ctl-C` in each command window to stop each service in the reverse order from which you started them (stop Control Center first, then Replicator, and finally the Kafka brokers). #### Test your Replicator Following is a generic Replicator testing scenario. A similar testing strategy is covered with more context as a part of the Replicator tutorial in the section [Configure and run Replicator](replicator-quickstart.md#config-and-run-replicator). 1. Create a test topic. If you haven’t already, create a topic named `test-topic` in the source cluster with the following command. ```none ./bin/kafka-topics --create --topic test-topic --replication-factor \ 1 --partitions 4 --bootstrap-server localhost:9082 ./bin/kafka-topics --describe --topic test-topic.replica --bootstrap-server localhost:9092 ``` The `kafka-topics --describe --topic` step in the above command checks whether `test-topic.replica` exists. After verifying that the topic exists, confirm that four partitions were created. In general, the Replicator makes sure that the destination topic has at least as many partitions as the source topic. It is fine if it has more, but because the Replicator preserves the partition assignment of the source data, any additional partitions will not be utilized. 2. Send data to the source cluster. At any time after you’ve created the topic in the source cluster, you can begin sending data to it using a Kafka producer to write to `test-topic` in the source cluster. You can then confirm that the data has been replicated by consuming from `test-topic.replica` in the destination cluster. For example, to send a sequence of numbers using Kafka’s console producer, you can use the following command.
```none seq 10000 | ./bin/kafka-console-producer --topic test-topic --broker-list localhost:9082 ``` 3. Run a consumer to confirm that the destination cluster got the data. You can then confirm delivery in the destination cluster using the console consumer. ```none ./bin/kafka-console-consumer --from-beginning --topic test-topic.replica \ --bootstrap-server localhost:9092 ``` ### Handling differences between preregistered and client-derived schemas The following properties can be configured in any client using a Schema Registry serializer (producers, streams, Connect). These are described specifically for connectors in [Kafka Connect converters](/platform/current/connect/concepts.html#converters), including full reference documentation in the section, [Configuration Options](/platform/current/schema-registry/connect.html#configuration-options). - `auto.register.schemas` - Specify if the serializer should attempt to register the schema with Schema Registry. - `use.latest.version` - Only applies when `auto.register.schemas` is set to `false`. If `auto.register.schemas` is set to `false` and `use.latest.version` is set to `true`, then instead of deriving a schema for the object passed to the client for serialization, Schema Registry will use the latest version of the schema in the subject for serialization. - `latest.compatibility.strict` - The default is `true`, but this only applies when `use.latest.version=true`. If both properties are `true`, a check is performed during serialization to verify that the latest subject version is backward compatible with the schema of the object being serialized. If the check fails, an error is thrown. If `latest.compatibility.strict` is `false`, then the latest subject version is used for serialization, without any compatibility check. Relaxing the compatibility requirement (by setting `latest.compatibility.strict` to `false`) may be useful, for example, when using [schema references](#referenced-schemas). The following table summarizes serializer behaviors based on the configurations of these three properties. | auto.register.schemas | use.latest.version | latest.compatibility.strict | Behavior | |-------------------------|----------------------|-------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | **true** | **(true or false)** | **(true or false)** | The serializer will attempt to register the schema with Schema Registry by deriving a schema for the object passed to the client for serialization. When `auto.register.schemas` is set to `true`, `use.latest.version` and `latest.compatibility.strict` are ignored, so it doesn’t matter how those are set; `auto.register.schemas` overrides them. | | **false** | **true** | **false** | Schema Registry will use the latest version of the schema in the subject for serialization. | | **false** | **true** | **true** | The serializer performs a check to verify that the latest subject version is backward compatible with the schema of the object being serialized. If the check fails, the serializer throws an error. 
| Here are two scenarios where you may want to disable schema auto-registration, and enable `use.latest.version`: - **Using schema references to combine multiple events in the same topic** - You can use [Schema references](#referenced-schemas) as a way to combine multiple events in the same topic. Disabling schema auto-registration is integral to this configuration for Avro and JSON Schema serializers. Examples of configuring serializers to use the latest schema version instead of auto-registering schemas are provided in the sections on [combining multiple event types in the same topic (Avro)](serdes-avro.md#multiple-event-types-same-topic-avro) and [combining multiple event types in the same topic (JSON)](serdes-json.md#multiple-event-types-same-topic-json). - **Ramping up production efficiency by disabling schema auto-registration and avoiding “Schema not found” exceptions** - Sometimes subtle (but not semantically significant) differences can exist between a pre-registered schema and the schema used by the client when using code-generated classes from the pre-registered schema with a Schema Registry aware serializer. An example of this is with Protobuf, where a fully-qualified type name such as `google.protobuf.Timestamp` may code-generate a descriptor with the type name `.google.protobuf.Timestamp`. Schema Registry considers these two variations of the same type name to be different. With auto-registration enabled, this would result in auto-registering two essentially identical schemas. With auto-registration disabled, this can cause a “Schema not found”. To configure the serializer to not register new schemas and ignore minor differences between client and registered schemas which could cause unexpected “Schema not found” exceptions, set these properties in your serializer configuration: ```properties auto.register.schemas=false use.latest.version=true latest.compatibility.strict=false ``` The `use.latest.version` sets the serializer to retrieve the latest schema version for the subject, and use that for validation and serialization, ignoring the client’s schema. The assumption is that if there are any differences between client and latest registered schema, they are minor and backward compatible. ### Adding security credentials The [test drive](#sr-test-drive-avro) examples show how to use the producer and consumer console clients as serializers and deserializers by passing Schema Registry properties on the command line and in config files. 
In addition to examples given in the “Test Drives”, you can pass truststore and keystore credentials for the Schema Registry, as described in [Additional configurations for HTTPS](/platform/current/schema-registry/security/index.html#additional-configurations-for-https). Here is an example for the producer on Confluent Platform: ```bash kafka-avro-console-producer --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 --topic transactions-avro \ --property value.schema='{"type":"record","name":"Transaction","fields":[{"name":"id","type":"string"},{"name": "amount", "type": "double"}]}' \ --property schema.registry.ssl.truststore.location=/etc/kafka/security/schema.registry.client.truststore.jks \ --property schema.registry.ssl.truststore.password=myTrustStorePassword ``` ### Adding security credentials The [test drive](#sr-test-drive-json-schema) examples show how to use the producer and consumer console clients as serializers and deserializers by passing Schema Registry properties on the command line and in config files. In addition to examples given in the “Test Drives”, you can pass truststore and keystore credentials for the Schema Registry, as described in [Additional configurations for HTTPS](/platform/current/schema-registry/security/index.html#additional-configurations-for-https). Here is an example for the producer on Confluent Platform: ```bash kafka-json-schema-console-producer --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 --topic transactions-json \ --property value.schema='{"type":"object", "properties":{"id":{"type":"string"}, "amount":{"type":"number"} }, "additionalProperties": false}' \ --property schema.registry.ssl.truststore.location=/etc/kafka/security/schema.registry.client.truststore.jks \ --property schema.registry.ssl.truststore.password=myTrustStorePassword ``` ### Adding security credentials The [test drive](#sr-test-drive-protobuf) examples show how to use the producer and consumer console clients as serializers and deserializers by passing Schema Registry properties on the command line and in config files. In addition to examples given in the “Test Drives”, you can pass truststore and keystore credentials for the Schema Registry, as described in [Additional configurations for HTTPS](/platform/current/schema-registry/security/index.html#additional-configurations-for-https). Here is an example for the producer on Confluent Platform: ```bash kafka-protobuf-console-producer --broker-list localhost:9093 --topic myTopic \ --producer.config ~/etc/kafka/producer.properties \ --property value.schema='syntax = "proto3"; message MyRecord {string id = 1; float amount = 2; string customer_id=3;}' \ --property schema.registry.url=https://localhost:8081 \ --property schema.registry.ssl.truststore.location=/etc/kafka/security/schema.registry.client.truststore.jks \ --property schema.registry.ssl.truststore.password=myTrustStorePassword ``` #### Configure the Confluent Control Center properties files In the Control Center properties file, you will use the default ports for `bootstrap.servers` and `zookeeper.connect`, but modify and add several other configurations. 1. Copy the default Control Center properties file to use as a basis for a specialized Control Center properties file for this tutorial: ```bash cp $CONFLUENT_HOME/etc/confluent-control-center/control-center-dev.properties $CONFLUENT_HOME/etc/confluent-control-center/control-center-multi-sr.properties ``` 2. 
Append the following lines to the end of the file. These update some defaults and add new configurations to match the server and Schema Registry setups in previous steps: ```bash echo "confluent.controlcenter.kafka.AK1.bootstrap.servers=localhost:9093" >> $CONFLUENT_HOME/etc/confluent-control-center/control-center-multi-sr.properties ``` ```bash echo "confluent.controlcenter.streams.cprest.url=http://0.0.0.0:8090" >> $CONFLUENT_HOME/etc/confluent-control-center/control-center-multi-sr.properties ``` ```bash echo "confluent.controlcenter.kafka.AK1.cprest.url=http://0.0.0.0:8091" >> $CONFLUENT_HOME/etc/confluent-control-center/control-center-multi-sr.properties ``` ```bash echo "confluent.controlcenter.schema.registry.SR-AK1.url=http://localhost:8082" >> $CONFLUENT_HOME/etc/confluent-control-center/control-center-multi-sr.properties ``` #### Example Producer Code When constructing the producer, configure the message value class to use the application’s code-generated `Payment` class. For example: ```java ... import io.confluent.kafka.serializers.KafkaAvroSerializer; ... props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class); props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class); ... KafkaProducer<String, Payment> producer = new KafkaProducer<>(props); final Payment payment = new Payment(orderId, 1000.00d); final ProducerRecord<String, Payment> record = new ProducerRecord<>(TOPIC, payment.getId().toString(), payment); producer.send(record); ... ``` Because the `pom.xml` includes `avro-maven-plugin`, the `Payment` class is automatically generated during compilation. In this example, the connection information to the Kafka brokers and Schema Registry is provided by the configuration file that is passed into the code, but if you want to specify the connection information directly in the client application, see [this java template](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs/java_producer_consumer.delta). For a full Java producer example, refer to [the producer example](https://github.com/confluentinc/examples/tree/latest/clients/avro/src/main/java/io/confluent/examples/clients/basicavro/ProducerExample.java). ## Configure the Kafka broker to connect to Schema Registry In cases where [broker-side schema validation](../schema-validation.md#schema-validation) is enabled on topics, the Kafka Broker attempts to connect to Schema Registry. Provide the following configurations in the broker properties file to allow the broker to connect to Schema Registry for validation. For example, if using [KRaft](../../kafka-metadata/kraft.md#kraft-overview), you would configure this in one of `$CONFLUENT_HOME/etc/kafka/broker.properties`, `controller.properties`, or `server.properties`, depending on your [KRaft setup](../../kafka-metadata/config-kraft.md#kraft-config-options). If role-based access control (RBAC) is enabled, the principal defined here should have appropriate permissions. ```bash confluent.schema.registry.url=http://localhost:8081 confluent.basic.auth.credentials.source=USER_INFO confluent.basic.auth.user.info=<sr-username>:<sr-password> ``` ### Configure Schema Registry to communicate with RBAC services The next set of examples shows how to connect a local Schema Registry to a remote Metadata Service (MDS) running RBAC. The `schema-registry.properties` file configurations reflect a remote Metadata Service (MDS) URL, location, and Kafka cluster ID. Also, the examples assume you are using credentials you got from your Security administrator for a pre-configured schema registry principal user (“service principal”), as mentioned in the prerequisites. 
Define these settings in `CONFLUENT_HOME/etc/schema-registry/schema-registry.properties`: 1. Configure Schema Registry authorization for communicating with the RBAC Kafka cluster. The `username` and `password` are RBAC credentials for the Schema Registry service principal, and `metadataServerUrls` is the location of your RBAC Kafka cluster (for example, a URL to an EC2 server). ```bash # Authorize Schema Registry to talk to Kafka (security protocol may also be SASL_SSL if using TLS/SSL) kafkastore.security.protocol=SASL_PLAINTEXT kafkastore.sasl.mechanism=OAUTHBEARER kafkastore.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler kafkastore.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \ username="<username>" \ password="<password>" \ metadataServerUrls="<protocol>://<mds-host>:<mds-port>"; ``` 2. Configure RBAC authorization, and bearer/basic authentication, for the Schema Registry resource. These settings can be used as-is; JETTY_AUTH is the recommended authentication mechanism. ```bash # These properties install the Schema Registry security plugin, and configure it to use RBAC for # authorization and OAuth for authentication resource.extension.class=io.confluent.kafka.schemaregistry.security.SchemaRegistrySecurityResourceExtension confluent.schema.registry.authorizer.class=io.confluent.kafka.schemaregistry.security.authorizer.rbac.RbacAuthorizer rest.servlet.initializor.classes=io.confluent.common.security.jetty.initializer.InstallBearerOrBasicSecurityHandler confluent.schema.registry.auth.mechanism=JETTY_AUTH ``` 3. Tell Schema Registry how to communicate with the Kafka cluster running the Metadata Service (MDS) and how to authenticate requests using a public key. - The value for `confluent.metadata.bootstrap.server.urls` can be the same as `metadataServerUrls`, depending on your environment. - In this step, you need a public key file to use to verify requests with token-based authorization, as mentioned in the prerequisites. ```bash # The location of the metadata service confluent.metadata.bootstrap.server.urls=<protocol>://<mds-host>:<mds-port> # Credentials to use with the MDS, these should usually match those used for talking to Kafka confluent.metadata.basic.auth.user.info=<username>:<password> confluent.metadata.http.auth.credentials.provider=BASIC # The path to public keys that should be used to verify json web tokens during authentication public.key.path=<path-to-public-key.pem> ``` For additional configurations available to any client communicating with MDS, see also [REST client configurations](../../kafka/configure-mds/mds-configuration.md#rest-client-mds-config) in the Confluent Platform Security documentation. 4. Specify the `kafkastore.bootstrap.servers` you want to use. The default is a commented-out line for a local server. If you do not change this or uncomment it, the default will be used. ```bash #kafkastore.bootstrap.servers=PLAINTEXT://localhost:9092 ``` Uncomment this line and set it to the address of your bootstrap server. This may be different from the MDS server URL. The standard port for the Kafka bootstrap server is `9092`. ```bash kafkastore.bootstrap.servers=<host>:9092 ``` 5. (Optional) Specify a custom `schema.registry.group.id` (to serve as Schema Registry cluster ID) which is different from the default, **schema-registry**. In the example, `schema.registry.group.id` is set to “schema-registry-cool-cluster”. 
```bash # Schema Registry group id, which is the cluster id # The default for the Schema Registry cluster ID is **schema-registry** schema.registry.group.id=schema-registry-cool-cluster ``` 6. (Optional) Specify a custom name for the Schema Registry default topic. (The default is **\_schemas**.) In the example, `kafkastore.topic` is set to `_jax-schemas-topic`. ```bash # The name of the topic to store schemas in # The default schemas topic is **_schemas** kafkastore.topic=_jax-schemas-topic ``` 7. (Optional) Enable anonymous access to requests that occur without authentication. Any requests that occur without authentication are automatically granted the principal `User:ANONYMOUS`. ```bash # This enables anonymous access with a principal of User:ANONYMOUS confluent.schema.registry.anonymous.principal=true authentication.skip.paths=/* ``` If you get the following error about not having authorization when you run the `curl` command to list subjects as described in [Start Schema Registry and test it](#rbac-start-and-test-sr), you can enable anonymous requests to bypass the authentication temporarily while you troubleshoot credentials. ```bash
curl localhost:8081/subjects
Error 401 Unauthorized
HTTP ERROR 401
Problem accessing /subjects. Reason:
    Unauthorized
Powered by Jetty:// 9.4.18.v20190429
``` ## Confluent Replicator Confluent Replicator is a type of Kafka source connector that replicates data from a source to destination Kafka cluster. An embedded consumer inside Replicator consumes data from the source cluster, and an embedded producer inside the Kafka Connect worker produces data to the destination cluster. Replicator version 4.0 and earlier requires a connection to ZooKeeper in the origin and destination Kafka clusters. If ZooKeeper is configured for authentication, the client configures the ZooKeeper security credentials via the global JAAS configuration setting `-Djava.security.auth.login.config` on the Connect workers, and the ZooKeeper security credentials in the origin and destination clusters must be the same. To configure Confluent Replicator security, you must configure the Replicator connector as shown below and additionally you must configure: * [Kafka Connect](#authentication-ssl-connect) To add TLS to the Confluent Replicator embedded consumer, modify the Replicator JSON properties file. This example is a subset of configuration properties to add for TLS encryption and authentication. The assumption here is that client authentication is required by the brokers. ```bash { "name":"replicator", "config":{ .... "src.kafka.ssl.truststore.location":"/etc/kafka/secrets/kafka.connect.truststore.jks", "src.kafka.ssl.truststore.password":"confluent", "src.kafka.ssl.keystore.location":"/etc/kafka/secrets/kafka.connect.keystore.jks", "src.kafka.ssl.keystore.password":"confluent", "src.kafka.ssl.key.password":"confluent", "src.kafka.security.protocol":"SSL" .... } } } ``` ## Schema Registry Schema Registry uses Kafka to persist schemas, and so it acts as a client to write data to the Kafka cluster. Therefore, if the Kafka brokers are configured for security, you should also configure Schema Registry to use security. You may also refer to the complete list of [Schema Registry configuration options](../../../schema-registry/installation/config.md#schemaregistry-config). The following is an example subset of `schema-registry.properties` configuration parameters to add for TLS encryption and authentication. The assumption here is that client authentication is required by the brokers. ```bash kafkastore.bootstrap.servers=SSL://kafka1:9093 kafkastore.security.protocol=SSL kafkastore.ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks kafkastore.ssl.truststore.password=test1234 kafkastore.ssl.keystore.location=/var/private/ssl/kafka.client.keystore.jks kafkastore.ssl.keystore.password=test1234 kafkastore.ssl.key.password=test1234 ``` ## Configure Confluent Replicator Confluent Replicator is a type of Kafka source connector that replicates data from a source to destination Kafka cluster. An embedded consumer inside Replicator consumes data from the source cluster, and an embedded producer inside the Kafka Connect worker produces data to the destination cluster. Replicator version 4.0 and earlier requires a connection to ZooKeeper in the origin and destination Kafka clusters. If ZooKeeper is configured for authentication, the client configures the ZooKeeper security credentials via the global JAAS configuration setting `-Djava.security.auth.login.config` on the Connect workers, and the ZooKeeper security credentials in the origin and destination clusters must be the same. 
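For example, assuming the ZooKeeper client credentials live in a JAAS file at a hypothetical path such as `/etc/kafka/zookeeper_client_jaas.conf`, a minimal sketch of passing that global JAAS setting to the Connect worker JVM is to export it through `KAFKA_OPTS` before starting the worker:

```bash
# Hypothetical JAAS file containing the ZooKeeper client credentials;
# the same credentials must be valid for the origin and destination clusters.
export KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/zookeeper_client_jaas.conf"

# Start the Connect worker that runs Replicator with this JVM option applied.
./bin/connect-distributed ./etc/kafka/connect-distributed.properties
```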
To configure Confluent Replicator security, you must configure the Replicator connector as shown below and additionally you must configure: * [Kafka Connect](#sasl-plain-connect-workers) Configure Confluent Replicator to use SASL/PLAIN by adding these properties in the Replicator’s JSON configuration file. The JAAS configuration property defines `username` and `password` used by Replicator to configure the user for connections. In this example, Replicator connects to the broker as user `replicator`. ```bash { "name":"replicator", "config":{ .... "src.kafka.security.protocol" : "SASL_SSL", "src.kafka.sasl.mechanism" : "PLAIN", "src.kafka.sasl.jaas.config" : "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"replicator\" password=\"replicator-secret\";", .... } } ``` ## Configure Schema Registry Schema Registry uses Kafka to persist schemas, and so it acts as a client to write data to the Kafka cluster. Therefore, if the Kafka brokers are configured for security, you should also configure Schema Registry to use security. You may also refer to the complete list of [Schema Registry configuration options](../../../../schema-registry/installation/config.md#schemaregistry-config). 1. Here is an example subset of `schema-registry.properties` configuration parameters to add for SASL authentication: ```bash kafkastore.bootstrap.servers=kafka1:9093 kafkastore.security.protocol=SASL_SSL kafkastore.sasl.mechanism=PLAIN kafkastore.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="<username>" \ password="<password>"; ``` ## Configure Kafka Connect The following configurations are required for Kafka Connect worker operations, like group coordination and internal topic management, when Confluent Server brokers require client authentication. Replace the placeholders with your actual values. ```properties bootstrap.servers=<host>:9096 security.protocol=SSL ssl.truststore.location=/path/to/truststore.jks ssl.truststore.password=<truststore-password> ssl.keystore.location=/path/to/keystore.jks ssl.keystore.password=<keystore-password> ssl.key.password=<key-password> producer.security.protocol=SSL producer.ssl.truststore.location=/path/to/truststore.jks producer.ssl.truststore.password=<truststore-password> producer.ssl.keystore.location=/path/to/keystore.jks producer.ssl.keystore.password=<keystore-password> producer.ssl.key.password=<key-password> consumer.security.protocol=SSL consumer.ssl.truststore.location=/path/to/truststore.jks consumer.ssl.truststore.password=<truststore-password> consumer.ssl.keystore.location=/path/to/keystore.jks consumer.ssl.keystore.password=<keystore-password> consumer.ssl.key.password=<key-password> listeners.https.ssl.client.auth=required listeners.https.ssl.truststore.location=/path/to/truststore.jks listeners.https.ssl.truststore.password=<truststore-password> listeners.https.ssl.keystore.location=/path/to/keystore.jks listeners.https.ssl.keystore.password=<keystore-password> listeners.https.ssl.key.password=<key-password> ``` To allow for request forwarding from follower to leader on mTLS, Kafka Connect workers need to be configured on the secure impersonation super user list on MDS. ### Run example 1. Clone the [confluentinc/examples](https://github.com/confluentinc/examples) GitHub repository, and check out the `8.1.0-post` branch. ```bash git clone https://github.com/confluentinc/examples.git cd examples git checkout 8.1.0-post ``` 2. Navigate to the `security/rbac/scripts` directory. ```bash cd security/rbac/scripts ``` 3. You have two options to run the example. - Option 1: run the example end-to-end for all services ```bash ./run.sh ``` - Option 2: step through it one service at a time ```bash ./init.sh ./enable-rbac-broker.sh ./enable-rbac-schema-registry.sh ./enable-rbac-connect.sh ./enable-rbac-rest-proxy.sh ./enable-rbac-ksqldb-server.sh ./enable-rbac-control-center.sh ``` 4. 
After you run the example, view the configuration files: ```bash # The original configuration bundled with Confluent Platform ls /tmp/original_configs/ ``` ```bash # Configurations added to each service's properties file ls ../delta_configs/ ``` ```bash # The modified configuration = original + delta ls /tmp/rbac_configs/ ``` 5. After you run the example, view the log files for each of the services. All logs are saved in the temporary directory `/tmp/rbac_logs/`. In that directory, you can step through the configuration properties for each of the services: ```bash connect control-center kafka kafka-rest ksql-server schema-registry ``` 6. In this example, the metadata service (MDS) logs are saved under your Confluent Platform installation directory. ```bash cat $CONFLUENT_HOME/logs/metadata-service.log ``` #### Broker - Additional RBAC configurations required for [server.properties](https://github.com/confluentinc/examples/tree/latest/security/rbac/delta_configs/server.properties.delta) ```none # Confluent Authorizer Settings # Semi-colon separated list of super users in the format : # For example super.users=User:admin;User:mds super.users=User:ANONYMOUS;User:mds # MDS Server Settings confluent.metadata.topic.replication.factor=1 # MDS Token Service Settings confluent.metadata.server.token.key.path=/tmp/tokenKeypair.pem # Configure the RBAC Metadata Service authorizer authorizer.class.name=io.confluent.kafka.security.authorizer.ConfluentServerAuthorizer confluent.authorizer.access.rule.providers=CONFLUENT,ZK_ACL # Bind Metadata Service HTTP service to port 8090 confluent.metadata.server.listeners=http://0.0.0.0:8090 # Configure HTTP service advertised hostname. Set this to http://127.0.0.1:8090 if running locally. confluent.metadata.server.advertised.listeners=http://127.0.0.1:8090 # HashLoginService Initializer confluent.metadata.server.authentication.method=BEARER confluent.metadata.server.user.store=FILE confluent.metadata.server.user.store.file.path=/tmp/login.properties # Add named listener TOKEN to existing listeners and advertised.listeners listeners=TOKEN://:9092,PLAINTEXT://:9093 advertised.listeners=TOKEN://localhost:9092,PLAINTEXT://localhost:9093 # Add protocol mapping for newly added named listener TOKEN listener.security.protocol.map=PLAINTEXT:PLAINTEXT,TOKEN:SASL_PLAINTEXT listener.name.token.sasl.enabled.mechanisms=OAUTHBEARER # Configure the public key used to verify tokens # Note: username, password and metadataServerUrls must be set if used for inter-broker communication listener.name.token.oauthbearer.sasl.jaas.config= \ org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \ publicKeyPath="/tmp/tokenPublicKey.pem"; # Set SASL callback handler for verifying authentication token signatures listener.name.token.oauthbearer.sasl.server.callback.handler.class=io.confluent.kafka.server.plugins.auth.token.TokenBearerValidatorCallbackHandler # Set SASL callback handler for handling tokens on login. This is essentially a noop if not used for inter-broker communication. 
listener.name.token.oauthbearer.sasl.login.callback.handler.class=io.confluent.kafka.server.plugins.auth.token.TokenBearerServerLoginCallbackHandler # Settings for Self-Balancing Clusters confluent.balancer.topic.replication.factor=1 # Settings for Audit Logging confluent.security.event.logger.exporter.kafka.topic.replicas=1 ``` - Role bindings: ```bash # Broker Admin confluent iam rbac role-binding create --principal User:$USER_ADMIN_SYSTEM --role SystemAdmin --kafka-cluster $KAFKA_CLUSTER_ID # Producer/Consumer confluent iam rbac role-binding create --principal User:$USER_CLIENT_A --role ResourceOwner --resource Topic:$TOPIC1 --kafka-cluster $KAFKA_CLUSTER_ID confluent iam rbac role-binding create --principal User:$USER_CLIENT_A --role DeveloperRead --resource Group:console-consumer- --prefix --kafka-cluster $KAFKA_CLUSTER_ID ``` ```bash # These credentials authorize ksqlDB Server to access the Kafka cluster. sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \ metadataServerUrls="http://<mds-host>:<mds-port>" \ username="<ksqldb-user>" \ password="<password>"; ``` Save the file and restart ksqlDB Server. Log in to the server by using the ksqlDB CLI. ```bash ksql --config-file <path-to-config-file> https://<ksqldb-server-host>:<port> --user <username> --password <password> ``` RBAC for ksqlDB depends on the Confluent Platform [Metadata Service (MDS)](overview.md#metadata-service) and the Confluent Server Authorizer. The `confluent.metadata` settings configure the Metadata Service. The `ksql.security.extension.class` setting configures ksqlDB for the Confluent Server Authorizer. For more information, see [Configure Confluent Server Authorizer in Confluent Platform](../../csa-introduction.md#confluent-server-authorizer). Use the ksqlDB service principal credentials for the following settings. - `sasl.jaas.config` for authorizing to the Kafka cluster with Confluent Server Authorizer - `confluent.metadata.basic.auth.user.info` for authorizing to MDS - `ksql.schema.registry.basic.auth.user.info` for authorizing to Schema Registry ### POST /security/1.0/principals/{principal}/roles/{roleName}/resources **Look up the rolebindings for the principal at the given scope/cluster using the given role.** Callable by Admins. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. * **roleName** (*string*) – The name of the role. **Example request:** ```http POST /security/1.0/principals/{principal}/roles/{roleName}/resources HTTP/1.1 Host: example.com Content-Type: application/json { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – Granted **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json [ { "resourceType": "Topic", "name": "clicksTopic1", "patternType": "LITERAL" }, { "resourceType": "Topic", "name": "orders-2019", "patternType": "PREFIXED" } ] ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/lookup/principal/{principal}/resources **Look up the resource bindings for the principal at the given scope/cluster.** Includes bindings from groups that the user belongs to. Callable by Admins+User. 
* **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. **Example request:** ```http POST /security/1.0/lookup/principal/{principal}/resources HTTP/1.1 Host: example.com Content-Type: application/json { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – Nested map of principal-to-role-to-resources. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "User:alice": { "DeveloperRead": [ { "resourceType": "Topic", "name": "billing-invoices", "patternType": "LITERAL" } ] }, "Group:Investors": { "DeveloperRead": [ { "resourceType": "Topic", "name": "investing-", "patternType": "PREFIXED" } ] } } ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/lookup/principal/{principal}/resource/{resourceType}/operation/{operation} **Summarizes what resources and rolebindings this principal is allowed to create.** Callable by Admins+User. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. * **resourceType** (*string*) – The type of resource to create or the type of resource to specify when creating a new rolebinding. * **operation** (*string*) – “Create” for creating an actual resource, “AlterAccess” for creating a rolebinding for a user. **Example request:** ```http POST /security/1.0/lookup/principal/{principal}/resource/{resourceType}/operation/{operation} HTTP/1.1 Host: example.com Content-Type: application/json { "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – A deduped and squashed view of the user’s rolebindings for creating resources or rolebindings. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "result": "SOME", "resourcePatterns": [ { "resourceType": "Topic", "name": "billing-invoices", "patternType": "LITERAL" }, { "resourceType": "Topic", "name": "investing-", "patternType": "PREFIXED" } ] } ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` #### Configure audit log destination The `destinations` option identifies the audit log cluster, which is provided by the bootstrap server. Use this setting to identify the communication channel between your audit log cluster and Kafka. You can use the `bootstrap_server` setting to deliver audit log messages to a specific cluster set aside for the sole purpose of retaining them. This ensures that no one can access or tamper with your organization’s audit logs, and enables you to selectively conduct more in-depth auditing of sensitive data, while keeping log volumes down for less sensitive data. 
If you deliver audit logs to another cluster, you must configure the connection to that cluster. Configure this connection as you would any producer writing to the cluster, using the prefix `confluent.security.event.logger.exporter.kafka` for the producer configuration keys, including the appropriate authentication information. For example, if you have a Kafka cluster listening on port 9092 of the host `audit.example.com`, and that cluster accepts SCRAM-SHA-256 authentication and has a principal named `confluent-audit` that is allowed to connect and produce to the audit log topics, the configuration would look like the following: ```json confluent.security.event.router.config=\ { \ "destinations": { \ "bootstrap_servers": ["audit.example.com:9092"], \ "topics": { \ "confluent-audit-log-events": { \ "retention_ms": 7776000000 \ } \ } \ }, \ "default_topics": { \ "allowed": "confluent-audit-log-events", \ "denied": "confluent-audit-log-events" \ } \ } confluent.security.event.logger.exporter.kafka.sasl.mechanism=SCRAM-SHA-256 confluent.security.event.logger.exporter.kafka.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \ username="confluent-audit" \ password="secretP@ssword123"; ``` Bootstrap servers may be provided in either the router configuration JSON or a producer configuration property; if they appear in both places, the router configuration takes precedence. #### Configure audit log topic management on the destination cluster MDS manages the audit log topics on the destination cluster, creating missing topics, and keeping the retention time policies of those topics in sync with the audit log configuration policy. For MDS to do this, you must configure the admin client used by MDS to connect to the destination cluster. Use the `confluent.security.event.logger.destination.admin.` prefix when configuring the admin client in the MDS cluster’s `server.properties` file. Other than the prefix requirement, this configuration is similar to other admin client configurations. This connection must be consistent with the producer configuration on this and all of the managed clusters. For details about the properties specified here, refer to [Kafka AdminClient Configurations for Confluent Platform](../../../installation/configuration/admin-configs.md#cp-config-admin) and Kafka [AdminClient](/platform/current/clients/javadocs/javadoc/org/apache/kafka/clients/admin/AdminClient.html). **SASL_SSL Configuration** ```none confluent.security.event.logger.destination.admin.bootstrap.servers= confluent.security.event.logger.destination.admin.security.protocol=SASL_SSL confluent.security.event.logger.destination.admin.sasl.mechanism=PLAIN confluent.security.event.logger.destination.admin.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="" \ password=""; confluent.security.event.logger.destination.admin.ssl.truststore.location= confluent.security.event.logger.destination.admin.ssl.truststore.password= ``` ## Quick start Prerequisite : * The [Confluent Platform must be installed](../../../installation/overview.md#installation). * The [Confluent CLI](https://docs.confluent.io/confluent-cli/current/installing.html) must be installed. 1. Create a directory for storing the `security.properties` file. For example: ```text mkdir /usr/secrets/ ``` 2. Generate the master encryption key based on a passphrase. 
Typically, a passphrase is much longer than a password and is easily remembered as a string of words (for example,\`\`Data in motion\`\`). You can specify the passphrase either in clear text on the command line, or store it in a file. A best practice is to enter this passphrase into a file and then pass it to the CLI (specified as `--passphrase @`). By using a file, you can avoid the logging history, which shows the passphrase in plain text. Choose a location for the secrets file on your local host (not a location where Confluent Platform services run). The secrets file contains encrypted secrets for the master encryption key, data encryption key, and configuration parameters, along with metadata, such as which cipher was used for encryption. ```text confluent secret master-key generate \ --local-secrets-file /usr/secrets/security.properties \ --passphrase @ ``` Your output should resemble: ```text Save the master key. It cannot be retrieved later. +------------+----------------------------------------------+ | Master Key | abC12DE+3fG45Hi67J8KlmnOpQr9s0Tuv+w1x2y3zab= | +------------+----------------------------------------------+ ``` 3. Save the master key because *it cannot be retrieved later*. 4. Export the master key in the environment variable, or add the master key to a bash script. #### IMPORTANT The subsequent [confluent secret](https://docs.confluent.io/confluent-cli/current/command-reference/secret/index.html) commands will fail if the environment variable is not set. ```text export CONFLUENT_SECURITY_MASTER_KEY=abC12DE+3fG45Hi67J8KlmnOpQr9s0Tuv+w1x2y3zab= ``` 5. Encrypt the specified configuration parameters. This step encrypts the properties specified by `--config` in the configuration file specified by `--config-file`. The property values are read from the configuration file, encrypted, and written to the local secrets file specified by `--local-secrets-file`. In place of the property values, instructions that are written into the configuration file allow the configuration resolution system to retrieve the secret values at runtime. The file path you specify in `--remote-secrets-file` is written into the configuration instructions and identifies where the resolution system can locate the secrets file at runtime. If you are running the secrets command centrally and distributing the secrets file to each node, then specify the eventual path of the secrets file in `--remote-secrets-file`. If you plan to run the secrets command on each node, then the `remote-secrets-file` should match the location specified by `--local-secrets-file`. #### NOTE Updates specified with `--local-secrets-file` flag modify the `security.properties` file. For every broker where you specify `--local-secrets-file`, you can store the `security.properties` file in a different location, which you specify using the `--remote-secrets-file`. For example, when encrypting a broker: - In `--local-secrets-file`, specify the file where the Confluent CLI will add and/or modify encrypted parameters. This modifies the `security.properties` file. - In `--remote-secrets-file`, specify the location of `security.properties` file that the broker will reference. If the `--config` flag is not specified, any property that contains the string `password` is encrypted in the configuration key. When running `encrypt` use a comma to specify multiple keys, for example: `--config "config.storage.replication.factor,config.storage.topic"`. This option is not available when using the `add` or `update` commands. 
Use the following example command to encrypt the `config.storage.replication.factor` and `config.storage.topic` parameters: ```text confluent secret file encrypt --config-file /etc/kafka/connect-distributed.properties \ --local-secrets-file /usr/secrets/security.properties \ --remote-secrets-file /usr/secrets/security.properties \ --config "config.storage.replication.factor,config.storage.topic" ``` You should see a similar entry in your `security.properties` file. This example shows the encrypted `config.storage.replication.factor` parameter. ```text config.storage.replication.factor = ${securepass:/usr/secrets/security.properties:connect-distributed.properties/config.storage.replication.factor} ``` 6. Decrypt the encrypted configuration parameter. ```text confluent secret file decrypt \ --local-secrets-file /usr/secrets/security.properties \ --config-file /etc/kafka/connect-distributed.properties \ --output-file decrypt.txt ``` You should see the decrypted parameter. This example shows the decrypted `config.storage.replication.factor` parameter. ```text config.storage.replication.factor=1 ``` ## Configure TLS encryption for Replicator Confluent Replicator is a type of Kafka source connector that replicates data from a source to destination Kafka cluster. An embedded consumer inside Replicator consumes data from the source cluster, and an embedded producer inside the Kafka Connect worker produces data to the destination cluster. Replicator version 4.0 and earlier requires a connection to ZooKeeper in the origin and destination Kafka clusters. If ZooKeeper is configured for authentication, the client configures the ZooKeeper security credentials via the global JAAS configuration setting `-Djava.security.auth.login.config` on the Connect workers, and the ZooKeeper security credentials in the origin and destination clusters must be the same. To configure Confluent Replicator security, you must configure the Replicator connector as shown below and additionally you must configure: * [Kafka Connect](#encryption-ssl-connect) To add TLS encryption to the Confluent Replicator embedded consumer, modify the Replicator JSON properties file. Here is an example subset of configuration properties to add for TLS encryption: ```bash { "name":"replicator", "config":{ .... "src.kafka.ssl.truststore.location":"/etc/kafka/secrets/kafka.connect.truststore.jks", "src.kafka.ssl.truststore.password":"confluent", "src.kafka.security.protocol":"SSL" .... } } } ``` ## Configure Kafka Connect From the perspective of the Confluent Server brokers, Kafka Connect is another Kafka client, and this tutorial configures Kafka Connect for TLS/SSL encryption and SASL/PLAIN authentication. Enabling Connect for security is simply a matter of passing the security configurations to the Connect workers, the producers used by source connectors, and the consumers used by sink connectors. 
Take the basic client security configuration: ```bash security.protocol=SASL_SSL ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks ssl.truststore.password=test1234 sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="client" \ password="client-secret"; ``` And configure Kafka Connect for the following: * Top-level for Connect workers, with no additional configuration prefix * Embedded producer for source connectors, with an additional configuration prefix `producer.` * Embedded consumers for sink connectors, with an additional configuration prefix `consumer.` Combining these configurations, a Kafka Connect worker configuration for TLS/SSL encryption and SASL/PLAIN authentication is the following. You may configure these settings in the `connect-distributed.properties` file. ```bash security.protocol=SASL_SSL ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks ssl.truststore.password=test1234 sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="client" \ password="client-secret"; producer.security.protocol=SASL_SSL producer.ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks producer.ssl.truststore.password=test1234 producer.sasl.mechanism=PLAIN producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="client" \ password="client-secret"; consumer.security.protocol=SASL_SSL consumer.ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks consumer.ssl.truststore.password=test1234 consumer.sasl.mechanism=PLAIN consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="client" \ password="client-secret"; ``` ## Example 6: JDBC source connector with Avro to ksqlDB -> Key:Long and Value:Avro - [Kafka Connect JDBC source connector](https://github.com/confluentinc/examples/tree/latest/connect-streams-pipeline/jdbcavroksql-connector.json) produces Avro values, and null keys, to a Kafka topic. ```none { "name": "test-source-sqlite-jdbc-autoincrement-jdbcavroksql", "config": { "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector", "tasks.max": "1", "key.converter": "org.apache.kafka.connect.json.JsonConverter", "key.converter.schemas.enable": "false", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://schema-registry:8081", "value.converter.schemas.enable": "true", "connection.url": "jdbc:sqlite:/usr/local/lib/retail.db", "mode": "incrementing", "incrementing.column.name": "id", "topic.prefix": "jdbcavroksql-", "table.whitelist": "locations" } } ``` - [ksqlDB](https://github.com/confluentinc/examples/tree/latest/connect-streams-pipeline/jdbcavro_statements.sql) reads from the Kafka topic and then uses `PARTITION BY` to create a new stream of messages with `BIGINT` keys. ![image](streams/images/example_6.jpg) ### Configure consumer connection 1. Define the configuration for the consumer listener on the source cluster. The following is an example with TLS and SASL/PLAIN enabled: ```yaml kafka_connect_replicator_consumer_listener: ssl_enabled: true sasl_protocol: plain ``` 2. Define the basic configuration for the consumer client connection: ```yaml kafka_connect_replicator_consumer_bootstrap_servers: ``` 3. Define the security configuration for the consumer client connection: ```yaml kafka_connect_replicator_consumer_ssl_ca_cert_path: kafka_connect_replicator_consumer_ssl_cert_path: kafka_connect_replicator_consumer_ssl_key_path: kafka_connect_replicator_consumer_ssl_key_password: ``` 4. Define custom properties for each client connection: ```yaml kafka_connect_replicator_consumer_custom_properties: ``` 5. For RBAC-enabled deployment, define the additional client custom properties. Specify either the Kafka cluster id (`kafka_connect_replicator_consumer_kafka_cluster_id`) or the cluster name (`kafka_connect_replicator_consumer_kafka_cluster_name`). 
```yaml kafka_connect_replicator_consumer_erp_tls_enabled: kafka_connect_replicator_consumer_erp_host: kafka_connect_replicator_consumer_erp_admin_user: kafka_connect_replicator_consumer_erp_admin_password: kafka_connect_replicator_consumer_kafka_cluster_id: kafka_connect_replicator_consumer_kafka_cluster_name: kafka_connect_replicator_consumer_erp_pem_file: ``` ### Configure OAuth authentication using client credentials To enable credential-based OAuth on all Confluent Platform components, where clients authenticate with the server using a client ID and a password, set the following variables: ```yaml all: vars: auth_mode: oauth oauth_superuser_client_id: oauth_superuser_client_password: oauth_sub_claim: client_id oauth_groups_claim: groups oauth_token_uri: oauth_issuer_url: oauth_jwks_uri: oauth_expected_audience: Confluent,account,api://default schema_registry_oauth_user: schema_registry_oauth_password: kafka_rest_oauth_user: kafka_rest_oauth_password: kafka_connect_oauth_user: kafka_connect_oauth_password: ksql_oauth_user: ksql_oauth_password: control_center_next_gen_oauth_user: control_center_next_gen_oauth_password: # Only needed when OAuth IdP server has TLS enabled with custom certificate. oauth_idp_cert_path: ``` For an example inventory file for a greenfield credential-based OAuth configuration, see the sample inventory file at: ```html https://github.com/confluentinc/cp-ansible/blob/8.1.0-post/docs/sample_inventories/oauth_greenfield.yml ``` ### Required settings for RBAC with centralized MDS To enable and configure RBAC with the centralized MDS, add the following mandatory variables to your inventory file. **Enable RBAC centralized MDS with Ansible** ```none all: vars: external_mds_enabled: true ``` **Provide the centralized MDS bootstrap URLs** Specify the URL for the MDS REST API on the Kafka cluster hosting MDS: ```none all: vars: mds_bootstrap_server_urls: ``` For example: ```none all: vars: mds_bootstrap_server_urls: https://ip-172-31-34-246.us-east-1.compute.internal:8090,https://ip-172-31-34-246.us-east-2.compute.internal:8090 ``` **Provide the centralized MDS bootstrap servers** Specify a list of the hostnames and ports for the listeners hosting the MDS that you wish to connect to: `<host1>:<port>,<host2>:<port>` ```none all: vars: mds_broker_bootstrap_servers: ``` For example: ```none all: vars: mds_broker_bootstrap_servers: ip-172-31-43-14.us-west-1.compute.internal:9093,ip-172-31-43-14.us-west-2.compute.internal:9093 ``` **Provide the centralized MDS broker listener security configuration** Specify the security settings of the remote Kafka broker that the centralized MDS runs on (`mds_broker_bootstrap_servers`): ```none all: vars: mds_broker_listener: ssl_enabled: --- [1] ssl_client_authentication: --- [2] ssl_mutual_auth_enabled: --- [3] sasl_protocol: --- [4] ``` * [1] Set `ssl_enabled` to `true` if the remote MDS uses TLS. * [2] Set `ssl_client_authentication` to `required` if the remote MDS uses mTLS. * [3] Set `ssl_mutual_auth_enabled` to `true` if the remote MDS uses mTLS. * [4] Set `sasl_protocol` to the SASL protocol for the remote MDS. Options are: `none`, `kerberos`, `sasl_plain`, `sasl_scram` The MDS listener must have an authentication mode: mTLS, Kerberos, SASL/PLAIN, or SASL/SCRAM. You can set `sasl_protocol` to `none` only if `ssl_enabled` ([1]) is set to `true` and `ssl_client_authentication` ([2]) is set to `required`, thereby specifying mTLS authentication mode for the listener. 
The following example is for mTLS on the centralized MDS brokers: ```none all: vars: mds_broker_listener: ssl_enabled: true ssl_mutual_auth_enabled: true sasl_protocol: none ``` **Provide the paths to the centralized MDS server certificates and key pair for OAuth** ```none all: vars: create_mds_certs: false token_services_public_pem_file: token_services_private_pem_file: ``` ## Enable RBAC with ACL authorizer This section describes the workflow to enable RBAC in non-RBAC or ACL-based Confluent Platform deployments using an ACL authorizer. You can reference the [sample inventory files](https://github.com/confluentinc/cp-ansible/tree/master/docs/sample_inventories) for example non-RBAC to RBAC migration setups. If your clusters have an authorizer and all required ACLs, you can start at Step 4. For clusters without an authorizer, Steps 1-3 are needed to first enable authorization. Then you can migrate to RBAC. 1. Configure ACL authorizer and add super users for broker principals. If an ACL authorizer was already configured, you do not need to do a rolling restart. For KRaft-based clusters, an authorizer must be added in both the KRaft controller and broker. ```yaml kafka_broker_custom_properties: authorizer.class.name: org.apache.kafka.metadata.authorizer.StandardAuthorizer allow.everyone.if.no.acl.found: "true" super.users: "User:admin" kafka_controller_custom_properties: authorizer.class.name: org.apache.kafka.metadata.authorizer.StandardAuthorizer allow.everyone.if.no.acl.found: "true" super.users: "User:admin" ``` * `allow.everyone.if.no.acl.found=true` is set for zero downtime after enabling the authorizer. * You need to add broker’s principal in `super.users`. The `admin` user is used as a broker principal example in the above snippet. 2. Perform a rolling restart of the KRaft controllers and Kafka brokers. ```bash ansible-playbook -i confluent.platform.all \ --skip-tags package \ -e deployment_strategy=rolling \ --tags kafka_controller,kafka_broker ``` You can skip this step if ACLs are already enabled in the cluster. 3. Create ACLs for broker principals and user principals of all applications, including Confluent Platform components. When a new ACL is added, all the users who previously had access will lose access to that resource since it was previously set to allow all before the new ACL is added. There might be downtime for clients here between adding an authorizer and adding ACLs. 4. Add the custom broker listener and update all Confluent Platform components to communicate on that listener. An example snippet: ```yaml kafka_broker_custom_listeners: internal_client_listener: name: CUSTOM_LISTENER port: 9095 ssl_enabled: false sasl_protocol: plain schema_registry_kafka_listener_name: internal_client_listener kafka_connect_kafka_listener_name: internal_client_listener kafka_rest_kafka_listener_name: internal_client_listener ksql_kafka_listener_name: internal_client_listener control_center_next_gen_kafka_listener_name: internal_client_listener ``` 5. Run the following command to update the listener used for Kafka to Confluent Platform communication: ```bash ansible-playbook -i confluent.platform.all \ --skip-tags package \ -e deployment_strategy=rolling ``` 6. Enable RBAC and Metadata Service (MDS) on Kafka brokers. 1. Remove the simple authorizer properties added in Step 1. 2. Comment out the `*_kafka_listener_name` variables set in step 4. This will ensure Kafka to Confluent Platform communication via OAuthbearer on the internal listener once RBAC is enabled. 3. 
Add the variables to enable RBAC. Example snippet of RBAC with OAuth: ```yaml rbac_enabled: true auth_mode: oauth oauth_superuser_client_id: superuser oauth_superuser_client_password: my-secret oauth_sub_claim: client_id oauth_groups_claim: groups oauth_token_uri: https://oauth1:8443/realms/cp-ansible-realm/protocol/openid-connect/token oauth_issuer_url: https://oauth1:8443/realms/cp-ansible-realm oauth_jwks_uri: https://oauth1:8443/realms/cp-ansible-realm/protocol/openid-connect/certs oauth_expected_audience: Confluent,account,api://default schema_registry_oauth_user: schema_registry schema_registry_oauth_password: my-secret kafka_rest_oauth_user: kafka_rest kafka_rest_oauth_password: my-secret ``` 4. Run the command to enable RBAC in all Confluent Platform components. ```bash ansible-playbook -i confluent.platform.all \ --skip-tags package \ -e deployment_strategy=rolling ``` 7. Configure RBAC role bindings for resources of other components. This includes all the external Kafka clients and the clients of Confluent Platform components. # Configure and Deploy Unified Stream Manager Using Ansible Playbooks for Confluent Platform Confluent Unified Stream Manager connects customer-managed on-premises clusters with Confluent Cloud to enable Confluent Cloud features for Confluent Platform clusters. The Unified Stream Manager Agent acts as a centralized proxy/gateway for Kafka, and Ansible Playbooks for Confluent Platform (Confluent Ansible) acts as a tool to deploy the Unified Stream Manager Agent in a virtual environment. The Ansible roles and playbooks automate Unified Stream Manager Agent deployment, configuration, TLS setup, authentication, credential handling, health checks, and integration with other Confluent Platform components (Kafka, KRaft, and Connect). This topic presents the steps and guidance for deploying Unified Stream Manager with Confluent Ansible. It is part of the [Registering your Confluent Platform Kafka cluster in Confluent Cloud](http://docs.confluent.io/platform/current/usm/get-started.html#registration-process-overview) process. Review the steps described in the registration topic before proceeding with Unified Stream Manager deployment. The high-level workflow to deploy Unified Stream Manager with Confluent Ansible is as follows: 1. [Review the prerequisites and considerations](#ansible-usm-requirements). 2. [Register your Confluent Platform cluster in Confluent Cloud](http://docs.confluent.io/platform/current/usm/get-started.html#registration-process-overview). 3. [Configure and deploy Unified Stream Manager Agent](#ansible-usm-configure). 4. [Complete the registration process for the Unified Stream Manager Agent](#ansible-usm-registration). ## Configure and deploy Unified Stream Manager Agent You can use the sample inventory files for the Unified Stream Manager Agent in the following GitHub repository as references. ```bash https://github.com/confluentinc/cp-ansible/blob/8.1.0-post/docs/sample_inventory/usm/ ``` To configure and deploy Unified Stream Manager Agent: 1. Update your existing inventory file to add the Unified Stream Manager Agent host. For example: ```yaml usm_agent: hosts: usm-agent1.confluent.io: ``` 2. Configure the Unified Stream Manager Agent Confluent Cloud settings in the inventory file. ```yaml all: vars: ccloud_endpoint: --- [1] ccloud_environment_id: --- [2] ccloud_credential: --- [3] username: password: ``` The values are available in the output file generated when you perform the first step in the registration process. 
See [Generate the configuration file](https://docs.confluent.io/cloud/current/usm/register/deploy-agent.html#generate-and-download-the-configuration-file). * [1] Specify the Confluent Cloud endpoint. Use the `FRONTDOOR_URL` value in the generated output file, and append the port, for example, `https://api.us-west-2.aws.confluent.cloud:443`. * [2] Specify the Confluent Cloud Environment ID. * [3] Specify the Confluent Cloud credentials. 3. [Configure the Unified Stream Manager Agent authentication settings](#ansible-usm-authentication). 4. [Configure other settings for the Unified Stream Manager Agent in the inventory file](#ansible-usm-configurations). 5. Validate Unified Stream Manager Agent configuration. ```bash ansible-playbook -i hosts.yml confluent.platform.all --tags=validate_usm_agent_configs ``` 6. Deploy the Unified Stream Manager Agent. Once the inventory file is updated with the Unified Stream Manager Agent host and the required configurations, use the Confluent Ansible playbook to deploy the Unified Stream Manager Agent. For metadata and metrics to flow from Kafka, KRaft, and Connect clusters to the Unified Stream Manager Agent, the following Confluent Platform components also need to be redeployed: - KRaft controller - Kafka brokers - Connect Deploy the Unified Stream Manager Agent and the above-mentioned Confluent Platform components together using one of the two methods below. The playbooks ensure that the Unified Stream Manager Agent is deployed first, followed by the other components in the correct order, to prevent “connection refused” errors. To install all components together: ```bash ansible-playbook -i hosts.yml confluent.platform.all ``` To install only Unified Stream Manager Agent and related Confluent Platform components together: ```bash ansible-playbook -i hosts.yml confluent.platform.all --tags=usm_agent,kafka_controller,kafka_broker,kafka_connect ``` ## Annual commitments Confluent Cloud offers the ability to make a commitment to a minimum amount of spend over a specified time period. This commitment gives you access to discounts and provides the flexibility to use this commitment across the entire Confluent Cloud stack, including any [Kafka cluster type](../clusters/cluster-types.md#cloud-cluster-types), [ksqlDB on Confluent Cloud](../ksqldb/overview.md#cloud-ksqldb-create-stream-processing-apps), [Connectors](../connectors/overview.md#kafka-connect-cloud), and [Support](https://www.confluent.io/confluent-cloud/support/). With annual commitments, you can view the total amount of accrued usage during the commitment term and the amount of time left on your commitment. If you use more than your committed amount, you can continue using Confluent Cloud without interruption. You will be charged at your discounted rate for usage beyond the committed amount until the end of your commitment term. Commitments are minimums, and there is no negative impact to exceeding your committed usage. If you exceed this minimum, overage charges will be billed to the payment method set for your organization. 
[Contact Confluent](https://confluent.io/contact) to learn more about annual commitments, or review these topics: - [Get Started with Confluent Cloud on the AWS Marketplace with Commitments](ccloud-aws-ubb.md#ccloud-aws-market-ubb) - [Get Started with Confluent Cloud on the Azure Marketplace with Commitments](ccloud-azure-ubb.md#ccloud-azure-market-ubb) - [Get Started with Confluent Cloud on the Google Cloud Marketplace with Commitments](ccloud-gcp-ubb.md#ccloud-gcp-market-ubb) # Architectural considerations for streaming applications on Confluent Cloud This guide covers key architectural considerations when building streaming applications on Confluent Cloud, including cluster planning, event-driven patterns, and real-time integration strategies. Understanding these concepts helps you design scalable, resilient streaming applications that leverage Confluent Cloud’s comprehensive event streaming platform capabilities. Key topics covered: - **Cluster configuration planning** - Critical decisions that impact your streaming application. - **Data schema architecture and governance** - Schema Registry, data contracts, and governance patterns for data quality and compatibility. - **Stream processing integration** - Real-time data processing with Apache Flink®. - **Serverless architectures** - Event-driven, elastic streaming patterns. - **Stateless microservices** - Distributed, event-driven service architectures. - **Cloud-native streaming** - Applications designed for cloud-native event streaming. - **Network and security architecture** - Networking patterns and access control strategies. - **Multi-tenancy and resource management** - Shared cluster patterns and quota management. - **Observability and monitoring patterns** - Operational monitoring and compliance strategies. ## Benchmarking Benchmark testing is important because there is no one-size-fits-all recommendation for the configuration parameters you need to develop Kafka applications to Confluent Cloud. Proper configuration always depends on the use case, other features you have enabled, the data profile, and more. You should run benchmark tests if you plan to tune Kafka clients beyond the defaults. Regardless of your service goals, you should understand what the performance profile of your application is—it is especially important when you want to optimize for throughput or latency. Your benchmark tests can also feed into the calculations for determining the correct number of partitions and the number of producer and consumer processes. First, measure your bandwidth using the Kafka tools `kafka-producer-perf-test` and `kafka-consumer-perf-test`. For non-JVM clients that wrap [librdkafka](https://github.com/edenhill/librdkafka), you can use the [rdkafka_performance](https://github.com/edenhill/librdkafka/blob/master/examples/rdkafka_performance.c) interface. This first round of results provides a baseline performance to your Confluent Cloud instance, taking application logic out of the equation. Note that these perf tools do not support Schema Registry. Then test your application, starting with the default Kafka configuration parameters, and familiarize yourself with the default values. Determine the baseline input performance profile for a given producer by removing dependencies on anything upstream from the producer. Rather than receiving data from upstream sources, modify your producer to generate its own mock data at high output rates, such that the data generation is not a bottleneck. 
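For illustration only, the following sketch shows one way to build such a mock-data producer with the Confluent JavaScript client described elsewhere in this documentation; the topic name `perf-test`, the 1 KiB payload, the record count, and the connection settings are placeholder assumptions, and the official perf tools above remain the recommended starting point for bandwidth measurements.

```javascript
const { Kafka } = require('@confluentinc/kafka-javascript');

// Placeholder connection settings for the cluster under test.
const kafka = new Kafka({
  kafkaJS: {
    brokers: ['your-bootstrap-server:9092'],
    ssl: true,
    sasl: { mechanism: 'plain', username: 'your-api-key', password: 'your-api-secret' },
  },
});

// Generate the payload locally so that data generation is never the bottleneck.
const payload = 'x'.repeat(1024); // 1 KiB value; adjust to match your production record size.

async function runBaseline() {
  const producer = kafka.producer();
  await producer.connect();

  const total = 100000;   // number of mock records to send
  const batchSize = 500;  // records per send() call
  const start = Date.now();

  for (let sent = 0; sent < total; sent += batchSize) {
    const messages = Array.from({ length: batchSize }, (_, i) => ({
      key: String(sent + i), // vary the key so records spread across partitions
      value: payload,
    }));
    await producer.send({ topic: 'perf-test', messages });
  }

  const seconds = (Date.now() - start) / 1000;
  console.log(`Produced ${total} records in ${seconds.toFixed(1)}s (${Math.round(total / seconds)} records/s)`);

  await producer.disconnect();
}

runBaseline().catch(console.error);
```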
Ensure the mock data reflects the type of data used in production to produce results that more accurately reflect performance in production. Or, instead of using mock data, consider using copies of production data or cleansed production data in your benchmarking. If you test with compression, be aware of how the [mock data](https://www.confluent.io/blog/easy-ways-generate-test-data-kafka/) is generated. Sometimes mock data is unrealistic, containing repeated substrings or being padded with zeros, which may result in a better compression performance than what would be seen in production. 1. Run a single producer client on a single server and measure the resulting throughput using the available JMX metrics for the Kafka producer. Repeat the producer benchmarking test, increasing the number of producer processes on the server in each iteration to determine the number of producer processes per server to achieve the highest throughput. 2. Determine the baseline output performance profile for a given consumer in a similar way. Run a single consumer client on a single server and repeat this test, increasing the number of consumer processes on the server in each iteration to determine the number of consumer processes per server to achieve the highest throughput. 3. Run benchmark tests for different permutations of configuration parameters that reflect your service goals. Focus on a subset of configuration parameters, and avoid the temptation to discover and change other parameters from their default values without understanding exactly how they impact the entire system. Tune the settings on each iteration, run a test, observe the results, tune again, and so on, until you identify settings that work for your throughput and latency requirements. [Refer to this blog post](https://www.confluent.io/blog/apache-kafka-supports-200k-partitions-per-cluster) when considering partition count in your benchmark tests. # Kafka Producer for Confluent Cloud An Apache Kafka® Producer is a client application that publishes (writes) events to a Kafka cluster. This section gives an overview of the Kafka producer and an introduction to the configuration settings for tuning. The Kafka producer is conceptually much simpler than the consumer since it does not need group coordination. A producer **partitioner** maps each message to a topic partition, and the producer sends a produce request to the leader of that partition. The partitioners shipped with Kafka guarantee that all messages with the same non-empty key will be sent to the same partition. 
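As a minimal illustration of that guarantee, the following sketch uses the Confluent JavaScript client shown elsewhere in this documentation to send several records that share a key; the topic name `orders`, the key, and the connection settings are placeholder assumptions.

```javascript
const { Kafka } = require('@confluentinc/kafka-javascript');

// Placeholder connection settings: replace with your cluster endpoint and API key/secret.
const kafka = new Kafka({
  kafkaJS: {
    brokers: ['your-bootstrap-server:9092'],
    ssl: true,
    sasl: { mechanism: 'plain', username: 'your-api-key', password: 'your-api-secret' },
  },
});

async function produceKeyed() {
  const producer = kafka.producer();
  await producer.connect();

  // All three records share the non-empty key "customer-42", so the default
  // partitioner maps them to the same partition of the "orders" topic.
  const metadata = await producer.send({
    topic: 'orders',
    messages: [
      { key: 'customer-42', value: 'order created' },
      { key: 'customer-42', value: 'order paid' },
      { key: 'customer-42', value: 'order shipped' },
    ],
  });

  // The delivery metadata reports which partition each record landed on;
  // for records that share a key, it is the same partition every time.
  console.log(JSON.stringify(metadata, null, 2));

  await producer.disconnect();
}

produceKeyed().catch(console.error);
```

Because the records share a key, they land on one partition and keep their relative order for downstream consumers.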
### Run Replicator as a connector The connector JSON should look like this: ```json { "name": "replicate-topic", "config": { "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector", "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter", "value.converter": "io.confluent.connect.replicator.util.ByteArrayConverter", "src.kafka.ssl.endpoint.identification.algorithm":"https", "src.kafka.sasl.mechanism":"PLAIN", "src.kafka.request.timeout.ms":"20000", "src.kafka.bootstrap.servers":"", "src.kafka.retry.backoff.ms":"500", "src.kafka.sasl.jaas.config":"org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "src.kafka.security.protocol":"SASL_SSL", "dest.kafka.ssl.endpoint.identification.algorithm":"https", "dest.kafka.sasl.mechanism":"PLAIN", "dest.kafka.request.timeout.ms":"20000", "dest.kafka.bootstrap.servers":"", "dest.kafka.retry.backoff.ms":"500", "dest.kafka.sasl.jaas.config":"org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";", "dest.kafka.security.protocol":"SASL_SSL", "dest.topic.replication.factor":"3", "topic.regex":".*" } } ``` If you have not already done so, configure the distributed Connect cluster correctly as shown here. ```none ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN bootstrap.servers= sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; security.protocol=SASL_SSL producer.ssl.endpoint.identification.algorithm=https producer.sasl.mechanism=PLAIN producer.bootstrap.servers= producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; producer.security.protocol=SASL_SSL key.converter=org.apache.kafka.connect.storage.StringConverter value.converter=org.apache.kafka.connect.json.JsonConverter group.id=connect-replicator config.storage.topic=connect-configs1 offset.storage.topic=connect-offsets1 status.storage.topic=connect-statuses1 plugin.path=/share/java ``` To learn more, see [Run Replicator as a Connector](/platform/current/multi-dc-deployments/replicator/replicator-run.html#run-crep-as-a-connector) in the [Replicator documentation](/platform/current/multi-dc-deployments/index.html). ## Configure properties There are three config files for consumer, producer, and replication. The minimal configuration changes for these are shown below. * `consumer.properties` ```none bootstrap.servers= ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; security.protocol=SASL_SSL ``` * `producer.properties` ```none bootstrap.servers= ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; security.protocol=SASL_SSL ``` * `replication.properties` Replace “Movies” for the `topic.whitelist` with the topics you want to replicate from the source cluster. ```none topic.whitelist=Movies topic.rename.format=${topic}-replica topic.auto.create=true topic.timestamp.type=CreateTime dest.topic.replication.factor=3 ``` #### NOTE Confluent does not support this script. If you encounter any problems, you can [file an issue](https://github.com/confluentinc/examples/issues) in GitHub. 
Running this script will generate delta configurations for: * Confluent Platform Components: * Schema Registry * ksqlDB Data Generator * ksqlDB * Confluent Replicator * Confluent Control Center (Legacy) * Kafka Connect * Kafka connector * Kafka command line tools * Kafka Clients: * Java (Producer/Consumer) * Java (Streams) * Python * .NET * Go * Node.js * C++ * OS: * ENV file ## Connect a JavaScript application to Confluent Cloud To configure JavaScript clients for Kafka to connect to a Kafka cluster in Confluent Cloud: 1. Install the Confluent JavaScript client for Kafka: ```bash npm install @confluentinc/kafka-javascript ``` 2. Configure your JavaScript application with the connection properties. You can obtain these from the Confluent Cloud Console by selecting your cluster and clicking **Clients**. 3. Use the configuration in your producer or consumer code: ```javascript const { Kafka } = require('@confluentinc/kafka-javascript'); const kafka = new Kafka({ kafkaJS: { brokers: ['your-bootstrap-servers'], ssl: true, sasl: { mechanism: 'plain', username: 'your-api-key', password: 'your-api-secret' } } }); // Create producer or consumer const producer = kafka.producer(); ``` 4. See the [JavaScript client examples](https://github.com/confluentinc/confluent-kafka-javascript/tree/master/examples) for complete working examples. 5. Integrate with your environment. #### **Sample configurations for authentication swapping** *SASL/PLAIN to SASL/OAUTHBEARER example:* ```yaml gateway: routes: - name: gateway security: auth: swap ssl: ignoreTrust: false truststore: type: PKCS12 location: /opt/ssl/client-truststore.p12 password: file: /opt/secrets/client-truststore.password keystore: type: PKCS12 location: /opt/ssl/gw-keystore.p12 password: file: /opt/secrets/gw-keystore.password keyPassword: value: inline-password clientAuth: required swapConfig: clientAuth: sasl: mechanism: PLAIN callbackHandlerClass: "org.apache.kafka.common.security.plain.PlainServerCallbackHandler" # (Optional if mechanism=SSL) jaasConfig: # (required if SASL/PLAIN) file: /opt/gateway/gw-users.conf connectionsMaxReauthMs: 0 # optional. 
link: https://docs.confluent.io/platform/current/installation/configuration/broker-configs.html#connections-max-reauth-ms secretstore: s1 clusterAuth: sasl: mechanism: OAUTHBEARER jaasConfig: file: /opt/gateway/cluster-login.tmpl.conf oauth: # required only if clusterAuth.sasl.mechanism=oauth tokenEndpointUri: "https://idp.mycompany.io:8080/realms/cp/protocol/openid-connect/token" ``` *mTLS to SASL/OAUTHBEARER example:* ```yaml gateway: routes: - name: gateway security: auth: swap ssl: ignoreTrust: false truststore: type: PKCS12 location: /opt/ssl/client-truststore.p12 password: file: /opt/secrets/client-truststore.password keystore: type: PKCS12 location: /opt/ssl/gw-keystore.p12 password: file: /opt/secrets/gw-keystore.password keyPassword: value: inline-password clientAuth: required swapConfig: clientAuth: ssl: principalMappingRules: "RULE:^CN=([a-zA-Z0-9._-]+),OU=.*$/$1/,RULE:^UID=([a-zA-Z0-9._-]+),.*$/$1/,DEFAULT" secretStore: "oauth-secrets" clusterAuth: sasl: mechanism: OAUTHBEARER callbackHandlerClass: "org.apache.kafka.common.security.oauthbearer.OAuthBearerValidatorCallbackHandler" jaasConfig: file: "/etc/gateway/cluster-jaas.tmpl.conf" oauth: tokenEndpointUri: "https://idp.mycompany.io:8080/realms/cp/protocol/openid-connect/token" ``` ## Get started To provision and configure Confluent Gateway, refer to the detailed guides available for both Docker and Confluent for Kubernetes (CFK) deployments. The documentation includes step-by-step installation, configuration for streaming domains and routes, and security recommendations. * [Configure and Deploy](gateway-deploy-overview.md#gateway-deploy-overview) * [Migrate Kafka Clusters](gateway-migrate.md#gateway-client-switchover) * [Set up Network Isolation and Custom Domains](gateway-custom-domains.md#gateway-custom-domains) ### Local install 1. Install [Confluent Platform](/platform/current/installation/index.html). 2. Customize the `/etc/confluent-kafka-mqtt/kafka-mqtt-dev.properties` properties file, specifying: - The Confluent Cloud Endpoint that you saved earlier for the bootstrap server. - Security information including the Confluent Cloud API key and secret you created in the previous section. - Topic information. ```text # add bootstrap server bootstrap.servers=pkc-12345.us-west2.gcp.confluent.cloud:9092 #configure connection to Confluent Cloud producer.security.protocol=SASL_SSL producer.sasl.mechanism=PLAIN producer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; # configure topic settings topic.regex.list=temperature:.*temperature confluent.topic.replication.factor=3 ``` 3. Start the MQTT proxy, specifying the properties file: ```bash bin/kafka-mqtt-start etc/confluent-kafka-mqtt/kafka-mqtt-dev.properties ``` ## Suggested reading - To learn how to migrate schemas from an on-premises (self-managed) Schema Registry to Confluent Cloud, see [Migrate Schemas](/platform/current/schema-registry/installation/migrate.html). - To configure and run native Confluent Cloud Schema Registry, see [Quick Start for Schema Management on Confluent Cloud](../get-started/schema-registry.md#cloud-sr-config). - For more information about running a cluster, see the [Schema Registry](/platform/current/schema-registry/installation/deployment.html) documentation. - To view a working example of hybrid Apache Kafka® clusters from self-hosted to Confluent Cloud, see [cp-demo](/platform/current/tutorials/cp-demo/docs/index.html). 
- For example configs for all Confluent Platform components and clients connecting to Confluent Cloud, see [template examples for components](https://github.com/confluentinc/examples/tree/latest/ccloud/template_delta_configs). - To look at all the code used in the Confluent Cloud demo, see the [Confluent Cloud demo examples](https://github.com/confluentinc/examples/tree/latest/ccloud). ## Compile and run a Table API program The following code example shows how to run a “Hello World” statement and how to query an example data stream. 1. Copy the following project object model (POM) into a file named pom.xml. ### pom.xml ```xml 4.0.0 example flink-table-api-java-hello-world 1.0 jar Apache Flink® Table API Java Hello World Example on Confluent Cloud 2.1.0 2.1-8 11 UTF-8 ${target.java.version} ${target.java.version} 2.17.1 confluent https://packages.confluent.io/maven/ apache.snapshots Apache Development Snapshot Repository https://repository.apache.org/content/repositories/snapshots/ false true org.apache.flink flink-table-api-java ${flink.version} io.confluent.flink confluent-flink-table-api-java-plugin ${confluent-plugin.version} org.apache.logging.log4j log4j-slf4j-impl ${log4j.version} runtime org.apache.logging.log4j log4j-api ${log4j.version} runtime org.apache.logging.log4j log4j-core ${log4j.version} runtime ./example org.apache.maven.plugins maven-compiler-plugin 3.10.1 ${target.java.version} ${target.java.version} org.apache.maven.plugins maven-shade-plugin 3.4.1 package shade org.apache.flink:flink-shaded-force-shading com.google.code.findbugs:jsr305 *:* META-INF/*.SF META-INF/*.DSA META-INF/*.RSA example.hello_table_api org.eclipse.m2e lifecycle-mapping 1.0.0 org.apache.maven.plugins maven-shade-plugin [3.1.1,) shade org.apache.maven.plugins maven-compiler-plugin [3.1,) testCompile compile ``` 2. Create a directory named “example”. ```bash mkdir example ``` 3. Create a file named `hello_table_api.java` in the `example` directory. ```bash touch example/hello_table_api.java ``` 4. Copy the following code into `hello_table_api.java`. ```java package example; import io.confluent.flink.plugin.ConfluentSettings; import io.confluent.flink.plugin.ConfluentTools; import org.apache.flink.table.api.EnvironmentSettings; import org.apache.flink.table.api.Table; import org.apache.flink.table.api.TableEnvironment; import org.apache.flink.types.Row; import java.util.List; /** * A table program example to get started with the Apache Flink® Table API. * *

It executes two foreground statements in Confluent Cloud. The results of both statements are * printed to the console. */ public class hello_table_api { // All logic is defined in a main() method. It can run both in an IDE or CI/CD system. public static void main(String[] args) { // Set up connection properties to Confluent Cloud. // Use the fromGlobalVariables() method if you assigned environment variables. // EnvironmentSettings settings = ConfluentSettings.fromGlobalVariables(); // Use the fromArgs(args) method if you want to run with command-line arguments. EnvironmentSettings settings = ConfluentSettings.fromArgs(args); // Initialize the session context to get started. TableEnvironment env = TableEnvironment.create(settings); System.out.println("Running with printing..."); // The Table API centers on 'Table' objects, which help in defining data pipelines // fluently. You can define pipelines fully programmatically. Table table = env.fromValues("Hello world!"); // Also, You can define pipelines with embedded Flink SQL. // Table table = env.sqlQuery("SELECT 'Hello world!'"); // Once the pipeline is defined, execute it on Confluent Cloud. // If no target table has been defined, results are streamed back and can be printed // locally. This can be useful for development and debugging. table.execute().print(); System.out.println("Running with collecting..."); // Results can be collected locally and accessed individually. // This can be useful for testing. Table moreHellos = env.fromValues("Hello Bob", "Hello Alice", "Hello Peter").as("greeting"); List rows = ConfluentTools.collectChangelog(moreHellos, 10); rows.forEach( r -> { String column = r.getFieldAs("greeting"); System.out.println("Greeting: " + column); }); } } ``` 5. Run the following command to build the jar file. ```bash mvn clean package ``` 6. Run the jar. If you assigned your cloud configuration to the environment variables specified in the [Prerequisites](#flink-java-table-api-quick-start-prerequisites) section, and you used the `fromGlobalVariables` method in the `hello_table_api` code, you don’t need to provide the command-line options. ```bash java -jar target/flink-table-api-java-hello-world-1.0.jar \ --cloud aws \ --region us-east-1 \ --flink-api-key key \ --flink-api-secret secret \ --organization-id b0b21724-4586-4a07-b787-d0bb5aacbf87 \ --environment-id env-z3y2x1 \ --compute-pool-id lfcp-8m03rm ``` Your output should resemble: ```none Running with printing... +----+--------------------------------+ | op | f0 | +----+--------------------------------+ | +I | Hello world! | +----+--------------------------------+ 1 row in set Running with collecting... Greeting: Hello Bob Greeting: Hello Alice Greeting: Hello Peter ``` ## Step 3. Create a CI/CD workflow in GitHub Actions The following steps show how to create an Action Workflow for automating the deployment of a Flink SQL statement on Confluent Cloud using Terraform. 1. In the toolbar at the top of the screen, click **Actions**. The **Get started with GitHub Actions** page opens. 2. Click **set up a workflow yourself ->**. If you already have a workflow defined, click **new workflow**, and then click **set up a workflow yourself ->**. 3. Copy the following YAML into the editor. This YAML file defines a workflow that runs when changes are pushed to the main branch of your repository. It includes a job named “terraform_flink_ccloud_tutorial” that runs on the latest version of Ubuntu. 
The job includes these steps: - Check out the code - Set up Terraform - Log in to Terraform Cloud using the API token stored in the Action Secret - Initialize Terraform - Apply the Terraform configuration to deploy changes to your Confluent Cloud account ```yaml on: push: branches: - main jobs: terraform_flink_ccloud_tutorial: name: "terraform_flink_ccloud_tutorial" runs-on: ubuntu-latest steps: - name: Checkout uses: actions/checkout@v4 - name: Setup Terraform uses: hashicorp/setup-terraform@v3 with: cli_config_credentials_token: ${{ secrets.TF_API_TOKEN }} - name: Terraform Init id: init run: terraform init - name: Terraform Validate id: validate run: terraform validate -no-color - name: Terraform Plan id: plan run: terraform plan env: TF_VAR_confluent_cloud_api_key: ${{ secrets.CONFLUENT_CLOUD_API_KEY }} TF_VAR_confluent_cloud_api_secret: ${{ secrets.CONFLUENT_CLOUD_API_SECRET }} - name: Terraform Apply id: apply run: terraform apply -auto-approve env: TF_VAR_confluent_cloud_api_key: ${{ secrets.CONFLUENT_CLOUD_API_KEY }} TF_VAR_confluent_cloud_api_secret: ${{ secrets.CONFLUENT_CLOUD_API_SECRET }} ``` 4. Click **Commit changes**, and in the dialog, enter a description in the **Extended description** textbox, for example, “CI/CD workflow to automate deployment on Confluent Cloud”. 5. Click **Commit changes**. The file `main.yml` is created in the `.github/workflows` directory in your repository. With this Action Workflow, your deployment of Flink SQL statements on Confluent Cloud is now automatic. ## Key features Tableflow offers the following capabilities: - [Materialize](get-started/overview.md#cloud-tableflow-get-started) Kafka topics or Flink tables as Iceberg or Delta Lake tables - Use [your storage](concepts/tableflow-storage.md#tableflow-storage-amazon-s3) (“Bring Your Own Storage”) or [Confluent Managed Storage](concepts/tableflow-storage.md#tableflow-storage-confluent-managed-storage) for materialized tables - Built-in [Iceberg REST Catalog (IRC)](get-started/quick-start-managed-storage.md#cloud-tableflow-quick-start-managed-storage-credentials) - [Catalog integration](how-to-guides/catalog-integration/overview.md#cloud-tableflow-how-to-guides-catalog-integration) with AWS Glue, Apache Polaris, and Snowflake Open Catalog - Use [Avro, Protobuf, and JSON Schema](concepts/tableflow-schemas.md#cloud-tableflow-schemas) as the input data format and support for schematization using Confluent Cloud Schema Registry. - [Self-managed encryption key (BYOK) support](../../security/encrypt/byok/tableflow-byok.md#tableflow-byok-integration) for enhanced security and compliance requirements. - [Automatic table maintenance](operate/monitor-tableflow.md#tableflow-monitor) Tableflow enhances data quality and structure by managing data preprocessing and preparation automatically before materializing streaming data into Iceberg or Delta Lake tables. Below are the key automated data processing and preparation capabilities supported in Tableflow. ## Flags ```none --file string Output file name. (default "asyncapi-spec.yaml") --group string Consumer Group ID for getting messages. (default "consumerApplication") --consume-examples Consume messages from topics for populating examples. --spec-version string Version number of the output file. (default "1.0.0") --kafka-api-key string Kafka cluster API key. --schema-context string Use a specific schema context. (default "default") --topics strings A comma-separated list of topics to export. Supports prefixes ending with a wildcard (*). 
--schema-registry-endpoint string The URL of the Schema Registry cluster. --value-format string Format message value as "string", "avro", "double", "integer", "jsonschema", or "protobuf". Note that schema references are not supported for Avro. (default "string") --kafka-endpoint string Endpoint to be used for this Kafka cluster. --cluster string Kafka cluster ID. --environment string Environment ID. ``` ## Examples Create a configuration file with connector configs and offsets. ```none { "name": "MyGcsLogsBucketConnector", "config": { "connector.class": "GcsSink", "data.format": "BYTES", "flush.size": "1000", "gcs.bucket.name": "APILogsBucket", "gcs.credentials.config": "****************", "kafka.api.key": "****************", "kafka.api.secret": "****************", "name": "MyGcsLogsBucketConnector", "tasks.max": "2", "time.interval": "DAILY", "topics": "APILogsTopic" }, "offsets": [ { "partition": { "kafka_partition": 0, "kafka_topic": "topic_A" }, "offset": { "kafka_offset": 1000 } } ] } ``` Create a connector in the current or specified Kafka cluster context. ```none confluent connect cluster create --config-file config.json confluent connect cluster create --config-file config.json --cluster lkc-123456 ``` ### Cloud ```none --source-cluster string Source cluster ID. --source-bootstrap-server string Bootstrap server address of the source cluster. Can alternatively be set in the configuration file using key "bootstrap.servers". --destination-cluster string Destination cluster ID for source initiated cluster links. --destination-bootstrap-server string Bootstrap server address of the destination cluster for source initiated cluster links. Can alternatively be set in the configuration file using key "bootstrap.servers". --remote-cluster string Remote cluster ID for bidirectional cluster links. --remote-bootstrap-server string Bootstrap server address of the remote cluster for bidirectional links. Can alternatively be set in the configuration file using key "bootstrap.servers". --source-api-key string An API key for the source cluster. For links at destination cluster this is used for remote cluster authentication. For links at source cluster this is used for local cluster authentication. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --source-api-secret string An API secret for the source cluster. For links at destination cluster this is used for remote cluster authentication. For links at source cluster this is used for local cluster authentication. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --destination-api-key string An API key for the destination cluster. This is used for remote cluster authentication links at the source cluster. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --destination-api-secret string An API secret for the destination cluster. This is used for remote cluster authentication for links at the source cluster. 
If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --remote-api-key string An API key for the remote cluster for bidirectional links. This is used for remote cluster authentication. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --remote-api-secret string An API secret for the remote cluster for bidirectional links. This is used for remote cluster authentication. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --local-api-key string An API key for the local cluster for bidirectional links. This is used for local cluster authentication if remote link's connection mode is Inbound. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --local-api-secret string An API secret for the local cluster for bidirectional links. This is used for local cluster authentication if remote link's connection mode is Inbound. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --config strings A comma-separated list of "key=value" pairs, or path to a configuration file containing a newline-separated list of "key=value" pairs. --dry-run Validate a link, but do not create it. --no-validate Create a link even if the source cluster cannot be reached. --kafka-endpoint string Endpoint to be used for this Kafka cluster. --cluster string Kafka cluster ID. --environment string Environment ID. --context string CLI context name. ``` ### Why should I upgrade my Confluent CLI to the latest version, v4? As detailed in the Release Notes, several commands and flags have been renamed or modified for v4 to provide better functionality and map to feature updates. In particular, the Schema Registry commands are now aligned with [Always On Stream Governance](/cloud/current/stream-governance/packages.html#getting-started-enable-or-upgrade). To learn more, see [Deprecation of SRCM v2 clusters and regions APIs and upgrade guide](/cloud/current/stream-governance/packages.html#deprecation-of-srcm-v2-clusters-and-regions-apis-and-upgrade-guide). In practice, this means that users no longer explicitly create and secure Schema Registry clusters; in fact, these clusters cannot be created manually with the new CLI commands and backing APIs. The Schema Registry cluster is now auto-created in the environment when the first Kafka cluster is created, and in the same region as the Kafka cluster. Stream Governance and Schema Registry is always enabled in Confluent Cloud environments; you have the choice of keeping with the default “Essentials” package or upgrading to “Advanced”. 
Therefore, the set of `confluent schema-registry cluster` commands have been streamlined to describe existing clusters, while package upgrades are available on `confluent environment` commands: ```bash confluent environment update --governance-package advanced ``` Keep in mind that once you upgrade a package associated with an environment, you cannot “downgrade” back to “Essentials”: ```bash Downgrading the package from "advanced" to "essentials" is not allowed once the Schema Registry cluster is provisioned. ``` Several new commands have been added to support working with Kafka topics and plugins. ## Quick start To get started, install the latest version of the Confluent CLI, create a Kafka cluster and topic, and produce and consume messages as described below. 1. [Install the latest version of the CLI](install.md#cli-install) per the instructions for your operating system. 2. Sign up for a free Confluent Cloud account by entering the following command in your terminal: ```text confluent cloud-signup ``` You should be redirected to the [free Confluent Cloud account](https://www.confluent.io/get-started/) sign up page. 3. After you have signed up for a free account, start autocomplete by entering the following command in your terminal: ```text confluent shell ``` 4. Using the `confluent` interactive shell, enter the following command to log in to your Confluent Cloud account: ```text login ``` If your credentials are not saved locally, you must enter your credentials as shown in the following output: ```text Enter your Confluent Cloud credentials: Email: Password: ``` #### NOTE - If you signed up for a free Confluent Cloud account using your GitHub or Google credentials, you must provide your GitHub or Google username and password to sign in. - You add the `--save` flag if you want to save your credentials locally. This prevents you from having to enter them again in the future. 5. Create your first Kafka cluster: ```text kafka cluster create --cloud --region ``` For example: ```text kafka cluster create dev0 --cloud aws --region us-east-1 ``` You should see output similar to following: ```text It may take up to 5 minutes for the Kafka cluster to be ready. +-----------------------+----------------------------------------------------------+ | Current || false | | ID || lkc-dfgrt7 | | Name || dev0 | | Type || BASIC | | Ingress Limit (MB/s) || 250 | | Egress Limit (MB/s) || 750 | | Storage || 5 TB | | Provider || aws | | Region || us-east-1 | | Availability || single-zone | | Status || PROVISIONING | | Endpoint || SASL_SSL://xxx-xxxx.us-east-1.aws.confluent.cloud:1234 | | REST Endpoint || https://yyy-y11yy.us-east-1.aws.confluent.cloud:345 | +-----------------------+----------------------------------------------------------+ ``` 6. Create a topic in the cluster using the cluster ID from the output of the previous step: ```text kafka topic create --cluster ``` For example: ```text kafka topic create test_topic --cluster lkc-dfgrt7 ``` You should see output confirming that the topic was created: ```text Created topic "test_topic". ``` 7. Create an API key for the cluster: ```text api-key create --resource lkc-dfgrt7 ``` You should see output similar to the following: ```text It may take a couple of minutes for the API key to be ready. Save the API key and secret. The secret is not retrievable later. 
+-------------+-------------------------------------------------------------------+
| API Key     |                                                                   |
| API Secret  |                                                                   |
+-------------+-------------------------------------------------------------------+
```

8. Produce messages to your topic:

   ```text
   kafka topic produce --api-key --api-secret
   ```

   For example:

   ```text
   kafka topic produce test_topic --api-key --api-secret
   ```

   You should see output similar to:

   ```text
   Starting Kafka Producer. Use Ctrl-C or Ctrl-D to exit.
   ```

9. Once the producer is active, type messages, delimiting them with return. For example:

   ```text
   today
   then
   now
   forever
   ```

10. When you’re finished producing, exit with `Ctrl-C` or `Ctrl-D`.

11. Read back your produced messages, from the beginning:

    ```text
    kafka topic consume --api-key --api-secret --from-beginning
    ```

    For example:

    ```text
    kafka topic consume test_topic --api-key --api-secret --from-beginning
    ```

    Based on the previous messages entered, you should see output similar to:

    ```text
    Starting Kafka Consumer. Use Ctrl-C to exit.
    forever
    now
    today
    then
    ```

## Working with Confluent Cloud for Government

Use the links in this section to set up and manage your environment.

Install the Confluent CLI:

- [Install the CLI](https://docs.confluent.io/confluent-cli/current/install.html)

Invite users and assign role-based access:

- [Single Sign-on (SSO) Overview](/cloud/current/access-management/authenticate/sso/overview.html)
- [Add an SSO user](/cloud/current/access-management/identity/user-accounts.html#add-an-sso-user)
- [Restrict user access](/cloud/current/access-management/access-control/cloud-rbac.html)

Kafka cluster management:

- [CRUD operations for Kafka clusters](/cloud/current/clusters/create-cluster.html#how-to-work-with-clusters)
- [Resize clusters](/cloud/current/clusters/resize.html)
- [Self-Managed Encryption Keys and AWS](/cloud/current/clusters/byok/byok-aws.html)

Set up network security:

Setting up a private network on Confluent Cloud for Government is a two-step process. First, you create the Confluent Cloud for Government network, then you add the private networking option. AWS includes multiple private networking options, including AWS PrivateLink, VPC Peering on AWS, and AWS Transit Gateway.

- [Confluent Cloud Network on AWS](/cloud/current/networking/ccloud-network/aws.html#create-ccloud-network-aws)
- [AWS PrivateLink](/cloud/current/networking/private-links/aws-privatelink.html)

Monitoring and logging:

- [Confluent Cloud Audit Log Overview](/cloud/current/monitoring/audit-logging/cloud-audit-log-concepts.html)
- [Audit Log Reference](/cloud/current/monitoring/audit-logging/audit-log-records.html)
- [Audit Log Event Schema](/cloud/current/monitoring/audit-logging/audit-log-schema.html)
- [Access and Consume Audit Logs](/cloud/current/monitoring/audit-logging/configure.html)

Backups and contingency plans:

- [Confluent Replicator to Confluent Cloud Configurations](/cloud/current/get-started/examples/ccloud/docs/replicator-to-cloud-configuration-types.html#ccloud-to-ccloud-with-connect-backed-to-origin)
- [Confluent for Kubernetes and Replicator GitHub Example](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/hybrid/replicator-cloud2cloud)
- [Confluent for Kubernetes](https://docs.confluent.io/operator/current/overview.html)

## Configure Tiered Storage

For a complete guide to setting up and working with Tiered Storage, see [Tiered Storage in Confluent Platform](/platform/current/clusters/tiered-storage.html#tiered-storage).
To configure and work with Tiered Storage starting from Control Center:

1. Click the cluster in the cluster navigation bar.
2. Click the **Cluster settings** menu.
3. Click the **Tiered storage** tab.

   ![image](images/c3-storage.png)

   You can hide or show the on-screen setup instructions, which walk through cloud provider setup as fully described in [Tiered Storage in Confluent Platform](/platform/current/clusters/tiered-storage.html#tiered-storage).

4. To view and edit dynamic settings, click **Edit settings**.

   ![image](images/c3-storage-dynamic-configs.png)

   View or change settings and click **Cancel** or **Save changes** as appropriate.

5. To set up storage, choose a cloud provider (click the **GCS** or **S3** tab). The S3 configuration options are shown here as an example.

   ![image](images/c3-storage-setup-s3.png)

6. Specify property values and paths to your credentials, then click **Generate configurations**.

   ![image](images/c3-storage-setup-example.png)

7. Copy the generated configurations block and paste it into the properties files for your brokers (for example, `$CONFLUENT_HOME/etc/kafka/server.properties`).

   ![image](images/c3-storage-gen-configs-output.png)

#### IMPORTANT

- The same bucket must be used across all brokers within a Tiered Storage-enabled cluster. This applies to all supported platforms.
- The Tiered Storage internal topic defaults to a replication factor of `3`. If you use `confluent local services start` to run a single-broker cluster such as that described in the [Quick Start guides](/platform/current/get-started/platform-quickstart.html#quickstart), you must add an additional line to the broker file, `$CONFLUENT_HOME/etc/kafka/server.properties`: `confluent.tier.metadata.replication.factor=1`
- As a recommended best practice, do not set a retention policy on the cloud storage (such as an AWS S3 bucket) because this may conflict with the Kafka topic retention policy.

8. After you update these configurations to enable Tiered Storage, restart the brokers. This can be done in a [rolling](/platform/current/kafka/post-deployment.html#rolling-restart) fashion.

9. View the cluster-wide **Tiered Storage** metrics shown on the **Tiered Storage** card on the **Brokers** overview page for the cluster.

   ![Tiered Storage panel enabled](images/c3-tiered-storage-metrics-overview.png)

   Click into these initial stats to view a metrics chart for Tiered Storage.

   ![Tiered Storage metrics chart](images/c3-tiered-storage-metrics.png)

   Hover and slide the cursor over a chart to get details on data at any particular point in time.

   ![Tiered Storage metrics detail on hover](images/c3-tiered-storage-metrics-details.png)

10. To get storage metrics on a specific topic, navigate to the topic (choose **Cluster > Topics**, select a topic from the list). The **Storage** card is shown on the Overview page for the topic.

    ![Tiered Storage metrics on a single topic](images/c3-tiered-storage-metrics-on-topic.png)

## Security for Confluent Platform components settings

The following optional settings control TLS encryption between Control Center and Confluent Platform components or features. You can also configure Basic authentication for Schema Registry. You should configure these settings if you have configured your Kafka cluster with these security features. For TLS, you can choose to configure each component separately, or set a single store.
- [Streams](#controlcenter-monitoring) - [Schema Registry](#controlcenter-sr) - [Connect](#controlcenter-connect) - [ksqlDB](#controlcenter-ksql) - [Single Proxy Server Store](#single-store) ### Confluent Platform 7.7 - 8.0 Considerations: : - You must use a special command to start Prometheus on MacOS. - By default Alertmanager and controllers in KRaft mode use port 9093. To run Prometheus and Alertmanager and KRaft mode controllers on the same host, you must manually edit the provided Control Center scripts. 1. Download the Confluent Platform archive (7.7 to 8.0 supported) and run these commands: ```bash wget https://packages.confluent.io/archive/8.0/confluent-8.0.0.tar.gz ``` ```bash tar -xvf confluent-8.0.0.tar.gz ``` ```bash cd confluent-8.0.0 ``` ```bash export CONFLUENT_HOME=`pwd` ``` 2. Update the broker and controller configurations to emit metrics to Prometheus by adding the following configurations to: `etc/kafka/controller.properties` and `etc/kafka/broker.properties` The fifth line (`confluent.telemetry.exporter._c3.metrics.include=`) is very long. Simply copy the code block as provided and append it to the end of the properties files. Pasting the fifth line results in a single line, even though it shows as wrapped in the documentation. ```bash metric.reporters=io.confluent.telemetry.reporter.TelemetryReporter confluent.telemetry.exporter._c3.type=http confluent.telemetry.exporter._c3.enabled=true confluent.telemetry.exporter._c3.metrics.include=io.confluent.kafka.server.request.(?!.*delta).*|io.confluent.kafka.server.server.broker.state|io.confluent.kafka.server.replica.manager.leader.count|io.confluent.kafka.server.request.queue.size|io.confluent.kafka.server.broker.topic.failed.produce.requests.rate.1.min|io.confluent.kafka.server.tier.archiver.total.lag|io.confluent.kafka.server.request.total.time.ms.p99|io.confluent.kafka.server.broker.topic.failed.fetch.requests.rate.1.min|io.confluent.kafka.server.broker.topic.total.fetch.requests.rate.1.min|io.confluent.kafka.server.partition.caught.up.replicas.count|io.confluent.kafka.server.partition.observer.replicas.count|io.confluent.kafka.server.tier.tasks.num.partitions.in.error|io.confluent.kafka.server.broker.topic.bytes.out.rate.1.min|io.confluent.kafka.server.request.total.time.ms.p95|io.confluent.kafka.server.controller.active.controller.count|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.total|io.confluent.kafka.server.request.total.time.ms.p999|io.confluent.kafka.server.controller.active.broker.count|io.confluent.kafka.server.request.handler.pool.request.handler.avg.idle.percent.rate.1.min|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.rate.1.min|io.confluent.kafka.server.controller.unclean.leader.elections.rate.1.min|io.confluent.kafka.server.replica.manager.partition.count|io.confluent.kafka.server.controller.unclean.leader.elections.total|io.confluent.kafka.server.partition.replicas.count|io.confluent.kafka.server.broker.topic.total.produce.requests.rate.1.min|io.confluent.kafka.server.controller.offline.partitions.count|io.confluent.kafka.server.socket.server.network.processor.avg.idle.percent|io.confluent.kafka.server.partition.under.replicated|io.confluent.kafka.server.log.log.start.offset|io.confluent.kafka.server.log.tier.size|io.confluent.kafka.server.log.size|io.confluent.kafka.server.tier.fetcher.bytes.fetched.total|io.confluent.kafka.server.request.total.time.ms.p50|io.confluent.kafka.server.tenant.consumer.lag.offsets|io.confluent.kafka.server.session.expire.l
istener.zookeeper.expires.rate.1.min|io.confluent.kafka.server.log.log.end.offset|io.confluent.kafka.server.broker.topic.bytes.in.rate.1.min|io.confluent.kafka.server.partition.under.min.isr|io.confluent.kafka.server.partition.in.sync.replicas.count|io.confluent.telemetry.http.exporter.batches.dropped|io.confluent.telemetry.http.exporter.items.total|io.confluent.telemetry.http.exporter.items.succeeded|io.confluent.telemetry.http.exporter.send.time.total.millis|io.confluent.kafka.server.controller.leader.election.rate.(?!.*delta).*|io.confluent.telemetry.http.exporter.batches.failed confluent.telemetry.exporter._c3.client.base.url=http://localhost:9090/api/v1/otlp confluent.telemetry.exporter._c3.client.compression=gzip confluent.telemetry.exporter._c3.api.key=dummy confluent.telemetry.exporter._c3.api.secret=dummy confluent.telemetry.exporter._c3.buffer.pending.batches.max=80 confluent.telemetry.exporter._c3.buffer.batch.items.max=4000 confluent.telemetry.exporter._c3.buffer.inflight.submissions.max=10 confluent.telemetry.metrics.collector.interval.ms=60000 confluent.telemetry.remoteconfig._confluent.enabled=false confluent.consumer.lag.emitter.enabled=true ``` 3. Download the Control Center archive and run these commands: ```bash wget https://packages.confluent.io/confluent-control-center-next-gen/archive/confluent-control-center-next-gen-2.3.0.tar.gz ``` ```bash tar -xvf confluent-control-center-next-gen-2.3.0.tar.gz ``` ```bash cd confluent-control-center-next-gen-2.3.0 ``` ```bash export C3_HOME=`pwd` ``` 4. Start Prometheus and Alertmanager To start Control Center, you must have three dedicated command windows: one for Prometheus, another for the Control Center process, and a third dedicated command window for Alertmanager. Run the following commands from `$C3_HOME` in all command windows. 1. Open `etc/confluent-control-center/prometheus-generated.yml` and change `localhost:9093` to `localhost:9098` ```bash alerting: alertmanagers: - static_configs: - targets: - localhost:9098 ``` 2. Start Prometheus. All operating systems except MacOS: ```bash bin/prometheus-start ``` MacOS: ```bash bash bin/prometheus-start ``` #### NOTE Prometheus runs but does not output any information to the screen. 3. Start Alertmanager. 1. Run this command: ```bash export ALERTMANAGER_PORT=9098 ``` 2. All operating systems except MacOS: ```bash bin/alertmanager-start ``` MacOS ```bash bash bin/alertmanager-start ``` 5. Start Control Center. 1. Open `etc/confluent-control-center/control-center-dev.properties` and update port `9093` to `9098`: ```bash confluent.controlcenter.alertmanager.url=http://localhost:9098 ``` 2. Run this command: ```bash bin/control-center-start etc/confluent-control-center/control-center-dev.properties ``` 6. Start Confluent Platform. To start Confluent Platform, you must have two dedicated command windows, one for the controller and another for the broker process. All the following commands are meant to be run from `CONFLUENT_HOME` in both command windows. The Confluent Platform start sequence requires you to generate a single random ID and use that *same* ID for both the controller and the broker process. 1. In the command window dedicated to running the controller, change directories into `CONFLUENT_HOME`. ```bash cd CONFLUENT_HOME ``` 2. Generate a random value for `KAFKA_CLUSTER_ID`. ```bash KAFKA_CLUSTER_ID="$(bin/kafka-storage random-uuid)" ``` 3. Use the following command to get the random ID and save the output. You need this value to start the controller *and* the broker. 
```bash echo $KAFKA_CLUSTER_ID ``` 4. Format the log directories for the controller: ```bash bin/kafka-storage format --cluster-id $KAFKA_CLUSTER_ID -c etc/kafka/kraft/controller.properties --standalone ``` 5. Start the controller: ```bash bin/kafka-server-start etc/kafka/kraft/controller.properties ``` 6. Open a command window for the broker and navigate to `CONFLUENT_HOME`. ```bash cd CONFLUENT_HOME ``` 7. Set the `KAFKA_CLUSTER_ID` variable to the random ID you generated earlier with `kafka-storage random-uuid`. ```bash export KAFKA_CLUSTER_ID= ``` 8. Format the log directories for this broker: ```bash bin/kafka-storage format --cluster-id $KAFKA_CLUSTER_ID -c etc/kafka/kraft/broker.properties ``` 9. Start the broker: ```bash bin/kafka-server-start etc/kafka/kraft/broker.properties ``` #### IMPORTANT If you configured Control Center for RBAC in the 5.3 preview release, the configuration options have changed in Confluent Platform version 5.4 and later. You must update your configuration. 1. Uncomment the following lines for each configuration option in the appropriate Control Center properties file for your environment (`CONFLUENT_HOME/etc/confluent-control-center/control-center.properties`). Replace the placeholder values with your actual values. ```RST ############################# Control Center RBAC Settings ############################# # Enable RBAC authorization in Control Center by providing a comma-separated list of Metadata Service (MDS) URLs #confluent.metadata.bootstrap.server.urls=http://localhost:8090 # MDS credentials of an RBAC user for Control Center to act on behalf of # NOTE: This user must be a SystemAdmin on each Apache Kafka cluster #confluent.metadata.basic.auth.user.info=username:password # Enable SASL-based authentication for each Apache Kafka cluster (SASL_PLAINTEXT or SASL_SSL required) #confluent.controlcenter.streams.security.protocol=SASL_PLAINTEXT #confluent.controlcenter.kafka..security.protocol=SASL_PLAINTEXT # Enable authentication using a bearer token for Control Center's REST endpoints #confluent.controlcenter.rest.authentication.method=BEARER # NOTE: Must match the MDS public key #public.key.path=/path/to/publickey.pem ``` **Line descriptions:** - **Line 4:** MDS URL for authorizing resources. In a multiple MDS environment, separate the URLs with a comma. The presence of the MDS URL is what indicates to Control Center that RBAC is enabled. - **Line 8:** Metadata Service (MDS) credentials of an RBAC user for Control Center to act on behalf of. - **Line 11-12:** The confluent.controlcenter.streams prefix represents the Kafka streams application (You can use the option in line 12 for another Kafka cluster) and all configurations you need to add for setting up a Kafka cluster. - **Line 15:** The authentication method required to talk to the Control Center backend through the REST layer. The OAuth-style `BEARER` method is required. The Control Center frontend acquires an access token on your behalf and keeps it refreshed. HTTP Basic authentication headers are not accepted. - **Line 18:** The path to the public key required for REST authentication. Must be the same public key that resides on MDS. The public key checks the token and makes sure that the user requesting access is a valid user in the system. #### IMPORTANT Additional clusters in a multi-cluster environment require connections to Kafka with RBAC enabled due to a [known issue](#c3-ki-cluster-connections). 
You can no longer send only metrics to Control Center in an RBAC-enabled environment; you must fully enable management. For more information, see [Monitor Kafka with Metrics Reporter in Confluent Platform](/platform/current/monitor/metrics-reporter.html#metrics-reporter).

2. Restart Confluent Platform for the properties file configuration to take effect. If you are using a Confluent Platform development environment with a [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html), stop and start as follows:

   ```bash
   confluent local stop
   confluent local start
   ```

   The `control-center-dev.properties` file is passed in automatically.

### Configure TLS proxy server access to Schema Registry

When Confluent Control Center connects to Schema Registry and Schema Registry has TLS enabled:

- Schema Registry communicates with Kafka over the Kafka protocol, which is secured with TLS.
- Control Center communicates with Kafka over the Kafka protocol, which is secured with TLS.
- Control Center communicates with Schema Registry over the HTTPS protocol, which is secured with TLS.

Essentially, Control Center functions as a proxy server to Schema Registry. To secure Control Center with HTTPS, configure Schema Registry to allow HTTPS as described in [Configuring the REST API for HTTP or HTTPS](/platform/current/schema-registry/security/index.html#schema-registry-http-https). In addition, Control Center should add a trusted certificate to its truststore to connect to Schema Registry over HTTPS as described in [Additional configurations for HTTPS](/platform/current/schema-registry/security/index.html#sr-https-additional).

Be sure to prefix the Control Center configuration attributes in `control-center.properties` with `confluent.controlcenter.` For example:

```bash
confluent.controlcenter.schema.registry.schema.registry.ssl.truststore.location=
confluent.controlcenter.schema.registry.schema.registry.ssl.truststore.password=
confluent.controlcenter.schema.registry.schema.registry.ssl.keystore.location=
confluent.controlcenter.schema.registry.schema.registry.ssl.keystore.password=
confluent.controlcenter.schema.registry.schema.registry.ssl.key.password=
```

## Topic details

Select a topic to display overview details for that topic and navigate to other features for topics:

- [Schema](/platform/current/control-center/topics/schema.html#topicschema).
- Inspect ([Message browser](messages.md#c3-topic-message-browser)).
- [Settings](edit.md#c3-edit-topic) for topic configuration.
- Query in ksqlDB - Registers a stream or table for a topic.
- Consumer lag - [View consumer lag](view.md#c3-view-topic-consume-metrics) at the topic level. You can view consumer lag for a consumer group from the [Consumers](../clients/consumers.md#controlcenter-userguide-consumers) menu.

To access the overview page for a topic:

1. Select a cluster from the navigation bar and click the **Topics** menu item.
2. In the **Topics** table, click the topic name. The topic overview page automatically opens for that topic.

In Normal mode, use the **Topic** page to:

* View a topic overview with a health roll-up.
* Drill into topic metrics by clicking the **Production**, **Consumption**, or **Availability** panels.
* Search for partitions by partition ID.
* View partition, replica placement, offset, and partition size details.
![Topics Overview page (Normal mode)](images/c3-topics-overview-page.png)

#### Semantic and per-method changes

- `subscribe`:
  - Regex flags are ignored when passing a topic subscription (like `i` or `g`). Regexes must start with `^`; otherwise, an error is thrown.
  - Subscribe must be called only after `connect`.
  - An optional parameter, `replace`, is provided. If set to `true`, the current subscription is replaced with the new one, for example, `consumer.subscribe({ topics: ['topic1'], replace: true });`. If set to `false`, the new subscription is added to the current one. The default value is `false` to retain existing behaviour.
  - When passing a list of topics to `subscribe()`, `fromBeginning` is not set per `subscribe` call. It must be configured in the top-level configuration.

    Before:

    ```javascript
    const consumer = kafka.consumer({
      groupId: 'test-group',
    });
    await consumer.connect();
    await consumer.subscribe({ topics: ["topic"], fromBeginning: true });
    ```

    After:

    ```javascript
    const consumer = kafka.consumer({
      kafkaJS: {
        groupId: 'test-group',
        fromBeginning: true,
      }
    });
    await consumer.connect();
    await consumer.subscribe({ topics: ["topic"] });
    ```

- `run`:
  - For auto-committing using a consumer, the properties `autoCommit` and `autoCommitInterval` are no longer set on `run`. They must be configured in the top-level configuration. `autoCommitThreshold` is not supported. If `autoCommit` is set to `true`, messages are *not* committed per-message, but periodically at the interval specified by `autoCommitInterval` (default 5 seconds).

    Before:

    ```javascript
    const kafka = new Kafka({ /* ... */ });
    const consumer = kafka.consumer({ /* ... */ });
    await consumer.connect();
    await consumer.subscribe({ topics: ["topic"] });
    consumer.run({
      eachMessage: someFunc,
      autoCommit: true,
      autoCommitInterval: 5000,
    });
    ```

    After:

    ```javascript
    const kafka = new Kafka({ kafkaJS: { /* ... */ } });
    const consumer = kafka.consumer({
      kafkaJS: {
        /* ... */,
        autoCommit: true,
        autoCommitInterval: 5000,
      },
    });
    await consumer.connect();
    await consumer.subscribe({ topics: ["topic"] });
    consumer.run({
      eachMessage: someFunc,
    });
    ```

  - `heartbeat()` no longer needs to be called by the user in the `eachMessage`/`eachBatch` callback. Heartbeats are automatically managed by librdkafka.
  - `partitionsConsumedConcurrently` is supported by both `eachMessage` and `eachBatch`.
  - An API-compatible version of `eachBatch` is available, but the batch size is not calculated from the configured parameters; the batch has a constant maximum size that is configured internally. This is subject to change. The property `eachBatchAutoResolve` is supported. Within the `eachBatch` callback, use of `uncommittedOffsets` is unsupported, and within the returned batch, `offsetLag` and `offsetLagLow` are unsupported.
- `commitOffsets`:
  - Does not yet support sending metadata for topic partitions being committed.
  - If called with no arguments, it commits all offsets passed to the user (or the stored offsets, if manually handling offset storage using `consumer.storeOffsets`).
- `seek`:
  - The restriction to call seek only after `run` is removed. It can be called at any time.
- `pause` and `resume`:
  - These methods MUST be called after the consumer group is joined. In practice, this means they can be called whenever `consumer.assignment()` has a non-zero size, or within the `eachMessage`/`eachBatch` callback.
- `stop` is not yet supported; the user must disconnect the consumer.
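The before/after snippets in this section assume the Confluent JavaScript client is already installed. As a minimal sketch of how you might try them locally (the script name `consumer.js` is just a placeholder for whichever example you save), you could install the package and run it with Node.js:

```bash
# Install the Confluent JavaScript client used by the examples in this section
npm install @confluentinc/kafka-javascript

# Run a saved example; consumer.js is a hypothetical filename
node consumer.js
```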
### Admin client

The library provides an admin client to interact with the Kafka cluster. The admin client provides several methods to manage topics, groups, and other Kafka entities. The following code snippet instantiates the `AdminClient`:

```js
const Kafka = require('@confluentinc/kafka-javascript');
const client = Kafka.AdminClient.create({
  'client.id': 'kafka-admin',
  'bootstrap.servers': 'broker01'
});

// From an existing producer or consumer
const depClient = Kafka.AdminClient.createFrom(producer);
```

These calls instantiate and connect the `AdminClient`, which allows you to call the admin methods. A complete list of methods available on the admin client can be found in the API reference documentation.

## OAuthbearer callback authentication

The JavaScript client library supports OAuthBearer token authentication for both the promisified and the callback-based API. The token is fetched using a callback provided by the user. The callback is called at 80% of the token expiry time, and the library uses the new token for the next login attempt.

```js
async function token_refresh(oauthbearer_config /* string - passed from config */, cb /* can be used if function is not async */) {
  // Some logic to fetch the token, before returning it.
  return { tokenValue, lifetime, principal, extensions };
}

const producer = new Kafka().producer({
  'bootstrap.servers': '',
  'security.protocol': 'sasl_ssl', // or sasl_plaintext
  'sasl.mechanisms': 'OAUTHBEARER',
  'sasl.oauthbearer.config': 'someConfigPropertiesKey=value', // Just passed straight to token_refresh as a string, carries no other significance.
  'oauthbearer_token_refresh_cb': token_refresh,
});
```

For a special case of OAuthBearer token authentication, where the token is fetched from an OIDC provider using the `client_credentials` grant type, the library provides a built-in callback, which can be set through the configuration alone, without any custom function required:

```js
const producer = new Kafka().producer({
  'bootstrap.servers': '',
  'security.protocol': 'sasl_ssl', // or sasl_plaintext
  'sasl.mechanisms': 'OAUTHBEARER',
  'sasl.oauthbearer.method': 'oidc',
  'sasl.oauthbearer.token.endpoint.url': issuerEndpointUrl,
  'sasl.oauthbearer.scope': scope,
  'sasl.oauthbearer.client.id': oauthClientId,
  'sasl.oauthbearer.client.secret': oauthClientSecret,
  'sasl.oauthbearer.extensions': `logicalCluster=${kafkaLogicalCluster},identityPoolId=${identityPoolId}`
});
```

These examples are for the promisified API, but the callback-based API can be used with the same configuration settings.

### Sink Connector Configuration

Start the services using the Confluent CLI:

```bash
confluent local start
```

Create a configuration file named `aws-cloudwatch-metrics-sink-config.json` with the following contents.
```text { "name": "aws-cloudwatch-metrics-sink", "config": { "name": "aws-cloudwatch-metrics-sink", "topics": "cloudwatch-metrics-topic", "connector.class": "io.confluent.connect.aws.cloudwatch.metrics.AwsCloudWatchMetricsSinkConnector", "tasks.max": "1", "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "aws.cloudwatch.metrics.url": "https://monitoring.us-east-2.amazonaws.com", "aws.cloudwatch.metrics.namespace": "service-namespace", "behavior.on.malformed.metric": "fail", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` The important configuration parameters used here are: - **aws.cloudwatch.metrics.url**: The endpoint URL that the sink connector uses to push the given metrics. - **aws.cloudwatch.metrics.namespace**: The Amazon CloudWatch Metrics namespace associated with the desired metrics. - **tasks.max**: The maximum number of tasks that should be created for this connector. Run this command to start the Amazon CloudWatch Metrics sink connector. ```bash confluent local load aws-cloudwatch-metrics-sink --config aws-cloudwatch-metrics-sink-config.json ``` To check that the connector started successfully view the Connect worker’s log by running: ```bash confluent local services connect log ``` Produce test data to the `cloudwatch-metrics-topic` topic in Kafka using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) confluent local produce command. ```bash kafka-avro-console-producer \ --broker-list localhost:9092 --topic cloudwatch-metrics-topic \ --property parse.key=true \ --property key.separator=, \ --property key.schema='{"type":"string"}' \ --property value.schema='{"name": "myMetric","type": "record","fields": [{"name": "name","type": "string"},{"name": "type","type": "string"},{"name": "timestamp","type": "long"},{"name": "dimensions","type": {"name": "dimensions","type": "record","fields": [{"name": "dimensions1","type": "string"},{"name": "dimensions2","type": "string"}]}},{"name": "values","type": {"name": "values","type": "record","fields": [{"name":"count", "type": "double"},{"name":"oneMinuteRate", "type": "double"},{"name":"fiveMinuteRate", "type": "double"},{"name":"fifteenMinuteRate", "type": "double"},{"name":"meanRate", "type": "double"}]}}]}' ``` #### NOTE For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). ```properties name=datadog-metrics-sink topics=datadog-metrics-topic connector.class=io.confluent.connect.datadog.metrics.DatadogMetricsSinkConnector tasks.max=1 datadog.api.key=< Your Datadog Api key > datadog.domain=< anyone of COM/EU > behavior.on.error=< Optional Configuration > reporter.bootstrap.servers=localhost:9092 key.converter=io.confluent.connect.avro.AvroConverter key.converter.schema.registry.url=http://localhost:8081 value.converter=io.confluent.connect.avro.AvroConverter value.converter.schema.registry.url=http://localhost:8081 confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 confluent.license= ``` Before starting the connector, make sure that the configurations in `datadog properties` are properly set. 
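Assuming the Datadog properties shown above are saved to a file named `datadog-metrics-sink.properties` (the filename here is illustrative, not prescribed), you could load the connector and confirm its state with the same Confluent CLI commands used in the other quick starts in this section:

```bash
# Load the Datadog Metrics sink connector from the properties file (filename is an assumption)
confluent local load datadog-metrics-sink --config datadog-metrics-sink.properties

# Confirm that the connector and its task report a RUNNING state
confluent local status datadog-metrics-sink
```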
#### NOTE Change the `confluent.topic.bootstrap.servers` property to include your broker address(es), and change the `confluent.topic.replication.factor` to `3` for staging or production use. Use curl to post a configuration to one of the Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Connect worker(s). ```bash curl -sS -X POST -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors ``` Use the following command to update the configuration of existing connector. ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors/FirebaseSourceConnector/config ``` Confirm that the connector is in a `RUNNING` state by running the following command: ```bash curl http://localhost:8083/connectors/FirebaseSourceConnector/status ``` The output should resemble: ```bash { "name":"FirebaseSourceConnector", "connector":{ "state":"RUNNING", "worker_id":"127.0.1.1:8083" }, "tasks":[ { "id":0, "state":"RUNNING", "worker_id":"127.0.1.1:8083" } ], "type":"source" } ``` To publish records into Firebase, follow the [Firebase documentation](https://firebase.google.com/docs/database/admin/save-data). The data produced to firebase should adhere to the following [data format](#firebase-source-data-format). You can also use the JSON example mentioned in the [data format section](#firebase-source-data-format), save it into a `data.json` file and finally import it into a Firebase database reference using the import feature in the Firebase console. To consume records written by the connector to the Kafka topic, run the following command: ```bash kafka-avro-console-consumer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic artists --from-beginning ``` ```bash kafka-avro-console-consumer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic songs --from-beginning ``` ### Properties-based example Create a file called github-source-quickstart.properties file with following properties: ```bash name=MyGithubConnector confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 tasks.max=1 connector.class=io.confluent.connect.github.GithubSourceConnector github.service.url=https://api.github.com github.access.token= github.repositories=apache/kafka github.resources=stargazers github.since=2019-01-01 topic.name.pattern=github-${resourceName} key.converter=io.confluent.connect.avro.AvroConverter key.converter.schema.registry.url=http://localhost:8081 value.converter=io.confluent.connect.avro.AvroConverter value.converter.schema.registry.url=http://localhost:8081 ``` Next, load the Source connector. 
```bash .confluent local load MyGithubConnector --config github-source-quickstart.properties ``` Your output should resemble the following: ```bash { "name": "MyGithubConnector", "config": { "connector.class": "io.confluent.connect.github.GithubSourceConnector", "tasks.max": "1", "confluent.topic.bootstrap.servers":"localhost:9092", "confluent.topic.replication.factor":"1", "github.service.url":"https://api.github.com", "github.repositories":"apache/kafka", "github.resources":"stargazers", "github.since":"2019-01-01", "github.access.token":"", "topic.name.pattern":"github-${resourceName}", "key.converter":"io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url":"http://localhost:8081", "value.converter":"io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url":"http://localhost:8081" }, "tasks": [], "type": null } ``` Enter the following command to confirm that the connector is in a `RUNNING` state: ```bash confluent local status MyGithubConnector ``` The output should resemble: ```bash { "name":"MyGithubConnector", "connector": { "state":"RUNNING", "worker_id":"127.0.1.1:8083" }, "tasks": [ { "id":0, "state":"RUNNING", "worker_id":"127.0.1.1:8083" } ], "type":"source" } ``` #### NOTE Change the `confluent.topic.bootstrap.servers` property to include your broker address(es), and change the `confluent.topic.replication.factor` to 3 for staging or production use. Use curl to post a configuration to one of the Kafka Connect Workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect worker(s). ```bash curl -s -X POST -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors ``` Use the following command to update the configuration of existing connector. ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors/HDFS2SourceConnector/config ``` To consume records written by the connector to the configured Kafka topic, run the following command: ```bash kafka-avro-console-consumer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic copy_of_test_hdfs --from-beginning ``` #### NOTE Change the `confluent.topic.bootstrap.servers` property to include your broker address(es), and change the `confluent.topic.replication.factor` to 3 for staging or production use. Use curl to post a configuration to one of the Kafka Connect Workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect worker(s). ```bash curl -s -X POST -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors ``` Use the following command to update the configuration of existing connector. ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors/HDFS3SourceConnector/config ``` To consume records written by the connector to the configured Kafka topic, run the following command: ```bash kafka-avro-console-consumer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic copy_of_test_hdfs --from-beginning ``` ## Quick start This quick start uses the HTTP Sink connector to consume records and send HTTP requests to a demo HTTP service running locally that is running without any authentication. 1. Before starting the connector, clone and run the [kafka-connect-http-demo](https://github.com/confluentinc/kafka-connect-http-demo) app on your machine. 
```bash git clone https://github.com/confluentinc/kafka-connect-http-demo.git cd kafka-connect-http-demo mvn spring-boot:run -Dspring.profiles.active=simple-auth ``` 2. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html) ```bash confluent local start ``` 3. Produce test data to the `http-messages` topic in Kafka using the Confluent CLI [confluent local services kafka produce](https://docs.confluent.io/confluent-cli/current/command-reference/local/services/kafka/confluent_local_services_kafka_produce.html) command. ```bash seq 10 | confluent local produce http-messages ``` 4. Create a `http-sink.json` file with the following contents: ```json { "name": "HttpSink", "config": { "topics": "http-messages", "tasks.max": "1", "connector.class": "io.confluent.connect.http.HttpSinkConnector", "http.api.url": "http://localhost:8080/api/messages", "value.converter": "org.apache.kafka.connect.storage.StringConverter", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "reporter.bootstrap.servers": "localhost:9092", "reporter.result.topic.name": "success-responses", "reporter.result.topic.replication.factor": "1", "reporter.error.topic.name":"error-responses", "reporter.error.topic.replication.factor":"1" } } ``` 5. Load the HTTP Sink connector. ```bash confluent local load HttpSink --config http-sink.json ``` 6. Verify the connector is in a `RUNNING` state. ```bash confluent local status HttpSink ``` 7. Verify the data was sent to the HTTP endpoint. ```bash curl localhost:8080/api/messages ``` Note that before running other examples, you should kill the demo app (`CTRL + C`) to avoid port conflicts. ### Property-based example Configure the `jira-source-quickstart.properties` file with following properties: ```bash name=MyJiraConnector confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 tasks.max=1 connector.class=io.confluent.connect.jira.JiraSourceConnector jira.url= jira.since=2019-10-17 23:50 jira.username= jira.api.token= jira.tables=roles topic.name.pattern=jira-topic-${resourceName} key.converter=io.confluent.connect.avro.AvroConverter key.converter.schema.registry.url=http://localhost:8081 value.converter=io.confluent.connect.avro.AvroConverter value.converter.schema.registry.url=http://localhost:8081 ``` Next, load the Source connector. 
```bash
./bin/confluent local load MyJiraConnector --config ./etc/kafka-connect-jira/jira-source-quickstart.properties
```

Your output should resemble the following:

```bash
{
  "name": "MyJiraConnector",
  "config": {
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.topic.replication.factor": "1",
    "tasks.max": "1",
    "connector.class": "io.confluent.connect.jira.JiraSourceConnector",
    "jira.url": "",
    "jira.since": "2019-10-17 23:50",
    "jira.username": "< Your-Jira-Username >",
    "jira.api.token": "< Your-Jira-Access-Token >",
    "jira.tables": "roles",
    "topic.name.pattern":"jira-topic-${resourceName}",
    "key.converter":"io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url":"http://localhost:8081",
    "value.converter":"io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url":"http://localhost:8081",
    "name": "MyJiraConnector"
  },
  "tasks": [],
  "type": "source"
}
```

Enter the following command to confirm that the connector is in a `RUNNING` state:

```bash
confluent local status MyJiraConnector
```

The output should resemble the example below:

```bash
{
  "name":"MyJiraConnector",
  "connector":{
    "state":"RUNNING",
    "worker_id":"127.0.1.1:8083"
  },
  "tasks":[
    {
      "id":0,
      "state":"RUNNING",
      "worker_id":"127.0.1.1:8083"
    }
  ],
  "type":"source"
}
```

## Distributed

This configuration is typically used with [distributed mode](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `connector.json`, configure all of the required values, and use the command below to post the configuration to one of the distributed Connect workers.

```bash
{
  "name": "connector1",
  "config": {
    "connector.class": "io.confluent.connect.jms.JmsSourceConnector",
    "kafka.topic":"MyKafkaTopicName",
    "jms.destination.name":"MyQueueName",
    "java.naming.factory.initial":"",
    "java.naming.provider.url":"",
    "confluent.license":"",
    "confluent.topic.bootstrap.servers":"localhost:9092"
  }
}
```

Change the `confluent.topic.*` properties as required to suit your environment. If you are running on a single-node Kafka cluster, you must include `confluent.topic.replication.factor=1`. Leave the `confluent.license` property blank for a 30-day trial. See the [configuration options](source_connector_config.md#jms-source-connector-license-config) for more details.

For example, the following specifies looking up the IBM MQ connection information in LDAP (check the documentation for your JMS broker for more details):

```bash
{
  "name": "connector1",
  "config": {
    "connector.class": "io.confluent.connect.jms.JmsSourceConnector",
    "kafka.topic":"MyKafkaTopicName",
    "jms.destination.name":"MyQueueName",
    "jms.destination.type":"queue",
    "java.naming.factory.initial":"com.sun.jndi.ldap.LdapCtxFactory",
    "java.naming.provider.url":"ldap://",
    "java.naming.security.principal":"MyUserName",
    "java.naming.security.credentials":"MyPassword",
    "confluent.license":"",
    "confluent.topic.bootstrap.servers":"localhost:9092"
  }
}
```

Change the `confluent.topic.*` properties as required to suit your environment. If you are running on a single-node Kafka cluster, you must include `"confluent.topic.replication.factor":"1"`. Leave the `confluent.license` property blank for a 30-day trial. See the [configuration options](source_connector_config.md#jms-source-connector-license-config) for more details.

Use curl to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect worker(s).
```bash curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors ``` #### NOTE For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). Run the connector with this configuration. ```bash confluent local load pagerduty-sink-connector --config pagerduty-sink.properties ``` The output should resemble: ```json { "name":"pagerduty-sink-connector", "config":{ "topics":"incidents", "tasks.max":"1", "connector.class":"io.confluent.connect.pagerduty.PagerDutySinkConnector", "pagerduty.api.key":"****", "behavior.on.error":"fail", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter":"io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url":"http://localhost:8081", "confluent.topic.bootstrap.servers":"localhost:9092", "confluent.topic.replication.factor":"1", "reporter.bootstrap.servers": "localhost:9092", "reporter.result.topic.replication.factor":"1", "reporter.error.topic.replication.factor":"1" "name":"pagerduty-sink-connector" }, "tasks":[ { "connector":"pagerduty-sink-connector", "task":0 } ], "type":"sink" } ``` #### NOTE For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). 1. Write the following JSON to `config.json` and configure all of the required values. ```json { "name" : "prometheus-connector", "config" : { "topics":"test-topic", "connector.class" : "io.confluent.connect.prometheus.PrometheusMetricsSinkConnector", "tasks.max" : "1", "confluent.topic.bootstrap.servers":"localhost:9092", "prometheus.listener.url": "http://localhost:8889/metrics", "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "reporter.result.topic.replication.factor": "1", "reporter.error.topic.replication.factor": "1", "behavior.on.error": "log" } } ``` #### NOTE Change the `confluent.topic.bootstrap.servers` property to include your broker address(es) and change the `confluent.topic.replication.factor` to `3` for production use. 2. Enter the following curl command to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect worker(s). ```bash curl -sS -X POST -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors ``` 3. Enter the following curl command to update the configuration of the connector: ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors/prometheus-connector/config ``` 4. Enter the following curl command to confirm that the connector is in a `RUNNING` state: ```bash curl http://localhost:8083/connectors/prometheus-connector/status | jq ``` The output should resemble: ```bash { "name": "prometheus-connector", "connector": { "state": "RUNNING", "worker_id": "127.0.1.1:8083" }, "tasks": [ { "id": 0, "state": "RUNNING", "worker_id": "127.0.1.1:8083", } ], "type": "sink" } ``` Search for the endpoint `/connectors/prometheus-connector/status`, the state of the connector and tasks should have status as `RUNNING`. 5. 
Use the following command to produce Avro data to the Kafka topic `test-topic`:

```bash
./bin/kafka-avro-console-producer \
--broker-list localhost:9092 --topic test-topic \
--property value.schema='{"name": "metric","type": "record","fields": [{"name": "name","type": "string"},{"name": "type","type": "string"},{"name": "timestamp","type": "long"},{"name": "values","type": {"name": "values","type": "record","fields": [{"name":"doubleValue", "type": "double"}]}}]}'
```

While the console is waiting for input, paste each of the following three records into the console.

```bash
{"name":"kafka_gaugeMetric1", "type":"gauge","timestamp": 1576236481,"values": {"doubleValue": 5.639623848362502}}
{"name":"kafka_gaugeMetric2", "type":"gauge","timestamp": 1576236481,"values": {"doubleValue": 5.639623848362502}}
{"name":"kafka_gaugeMetric3", "type":"gauge","timestamp": 1576236481,"values": {"doubleValue": 5.639623848362502}}
```

6. Check the Prometheus portal on `localhost:9090` and verify that metrics were created.

### REST-based example

This configuration is typically used with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `connector.json`, configure all of the required values, and use the command below to post the configuration to one of the distributed Connect worker(s). See the Kafka Connect [REST API](/platform/current/connect/references/restapi.html) for more information.

**Connect Distributed REST example with Platform Event:**

```json
{
  "name" : "SFDCPlatformEvents1",
  "config" : {
    "connector.class" : "io.confluent.salesforce.SalesforcePlatformEventSourceConnector",
    "tasks.max" : "1",
    "kafka.topic" : "< Required Configuration >",
    "salesforce.consumer.key" : "< Required Configuration >",
    "salesforce.consumer.secret" : "< Required Configuration >",
    "salesforce.password" : "< Required Configuration >",
    "salesforce.password.token" : "< Required Configuration >",
    "salesforce.platform.event.name" : "< Required Configuration >",
    "salesforce.username" : "< Required Configuration >",
    "salesforce.initial.start" : "all",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.topic.replication.factor": "1",
    "confluent.license": " Omit to enable trial mode "
  }
}
```

### REST-based example

This configuration is typically used with [distributed workers](/platform/current/connect/concepts.html#distributed-workers).

1.
Write the following JSON to `connector.json` and configure all the required values: **Connect Distributed REST example with Platform Event** ```json { "name" : "SFDCPlatformEventsSink1", "config" : { "connector.class": "io.confluent.salesforce.SalesforcePlatformEventSinkConnector", "tasks.max" : "1", "topics" : "< Required Configuration >", "salesforce.consumer.key" : "< Required Configuration >", "salesforce.consumer.secret" : "< Required Configuration >", "salesforce.password" : "< Required Configuration >", "salesforce.password.token" : "< Required Configuration >", "salesforce.platform.event.name" : "< Required Configuration >", "salesforce.username" : "< Required Configuration >", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "confluent.license": " Omit to enable trial mode " } } ``` #### NOTE - Change the `confluent.topic.bootstrap.servers` property to include your broker address(es), and change the `confluent.topic.replication.factor` to 3 for staging or production use. - For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). 2. Use curl to post a configuration to one of the Kafka Connect Workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect worker(s). For more information, see Kafka Connect [REST API](/platform/current/connect/references/restapi.html). **Create a new connector:** ```bash curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors ``` **Update an existing connector:** ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors/SFDCPlatformEventsSink1/config ``` ### REST-based example This configuration is used typically along with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). 1. Create a file named `connector.json` using the following JSON configuration example: **Connect Distributed REST example with Push Topic**: ```json { "name" : "SalesforcePushTopicSourceConnector1", "config" : { "connector.class" : "io.confluent.salesforce.SalesforcePushTopicSourceConnector", "tasks.max" : "1", "kafka.topic" : "< Required Configuration >", "salesforce.consumer.key" : "< Required Configuration >", "salesforce.consumer.secret" : "< Required Configuration >", "salesforce.object" : "< Required Configuration >", "salesforce.password" : "< Required Configuration >", "salesforce.password.token" : "< Required Configuration >", "salesforce.push.topic.name" : "< Required Configuration >", "salesforce.username" : "< Required Configuration >", "salesforce.initial.start" : "all", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "confluent.license": " Omit to enable trial mode " } } ``` To include your broker address(es), change the `confluent.topic.bootstrap.servers` property. You can change the `confluent.topic.replication.factor` to 3 for staging or production use. 2. Use `curl` to post a configuration to one of the Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Connect worker(s). For more information, see Connect [REST API](/platform/current/connect/references/restapi.html) . 
**Create a new connector:** ```bash curl -s -X POST -H 'Content-Type: application/json' --data @connectorPushTopic.json http://localhost:8083/connectors ``` **Update an existing connector:** ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors/SalesforcePushTopicSourceConnector1/config ``` #### NOTE You can add the following Single Message Transform (SMT) to the connector configuration to process records generated by the Salesforce Bulk API Source connector. ```text "transforms" : "InsertField", "transforms.InsertField.type" : "org.apache.kafka.connect.transforms.InsertField$Value", "transforms.InsertField.static.field" : "_EventType", "transforms.InsertField.static.value" : "created" ``` 1. Create a configuration file named `salesforce-bulk-api-leads-sink-config.json` with the following contents. Ensure you enter a real username, password, security token, consumer key, and consumer secret. For details about configuration properties, see [Configuration Properties](configuration_options.md#salesforce-bulk-api-sink-connector-config). ```text { "name" : "SalesforceBulkApiSinkConnector", "config" : { "connector.class" : "io.confluent.connect.salesforce.SalesforceBulkApiSinkConnector", "tasks.max" : "1", "topics" : "sfdc-pushtopic-lead", "salesforce.object" : "Lead", "salesforce.password" : "< Required Configuration >", "salesforce.password.token" : "< Required Configuration >", "salesforce.username" : "< Required Configuration: secondary organization username >", "reporter.result.topic.replication.factor" : "1", "reporter.error.topic.replication.factor" : "1", "reporter.bootstrap.servers" : "localhost:9092", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "confluent.license": " Omit to enable trial mode " } } ``` For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). 2. Enter the Confluent CLI command to start the Salesforce Sink connector. ```bash confluent local load SalesforceBulkApiSinkConnector -- -d salesforce-bulk-api-leads-sink-config.json ``` Your output should resemble: ```none { "name": "SalesforceBulkApiSinkConnector", "config": { "connector.class" : "io.confluent.connect.salesforce.SalesforceBulkApiSinkConnector", "tasks.max" : "1", "topics" : "sfdc-pushtopic-leads", "salesforce.object" : "Lead", "salesforce.username" : "" "salesforce.password" : "", "salesforce.password.token" : "", "reporter.result.topic.replication.factor" : "1", "reporter.error.topic.replication.factor" : "1", "reporter.bootstrap.servers" : "localhost:9092", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "confluent.license": " Omit to enable trial mode " }, "tasks": [ ... ], "type": null } ``` ### REST-based example This configuration typically is used with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `connector.json`, configure all of the required values, and use the command below to post the configuration to one the distributed connect worker(s). For more information, see Kafka Connect [REST API](/platform/current/connect/references/restapi.html) . 
```text
{
  "name" : "SalesforceSObjectSinkConnector1",
  "config" : {
    "connector.class" : "io.confluent.salesforce.SalesforceSObjectSinkConnector",
    "tasks.max" : "1",
    "topics" : "< Required Configuration >",
    "salesforce.consumer.key" : "< Required Configuration >",
    "salesforce.consumer.secret" : "< Required Configuration >",
    "salesforce.object" : "< Required Configuration >",
    "salesforce.password" : "< Required Configuration >",
    "salesforce.password.token" : "< Required Configuration >",
    "salesforce.username" : "< Required Configuration >",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.topic.replication.factor": "1",
    "salesforce.sink.object.operation": "delete",
    "override.event.type": "true",
    "confluent.license": " Omit to enable trial mode "
  }
}
```

To include your broker address(es), change the `confluent.topic.bootstrap.servers` property. For staging or production use, change the `confluent.topic.replication.factor` to 3.

For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter).

Use curl to post a configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect worker(s).

```none
curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors
```

```none
curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors/SalesforceSObjectSinkConnector1/config
```

### REST-based example

Use this setting with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `snmp-trap-source-config.json`, configure all of the required values, and use the following command to post the configuration to one of the distributed Connect workers. For more information, see the [Kafka Connect REST Interface](/platform/current/connect/references/restapi.html).

```json
{
  "name": "SnmpTrapSourceConnector",
  "config": {
    "name":"SnmpTrapSourceConnector",
    "connector.class":"io.confluent.connect.snmp.SnmpTrapSourceConnector",
    "tasks.max":"1",
    "kafka.topic":"snmp-kafka-topic",
    "snmp.v3.enabled":"true",
    "snmp.batch.size":"50",
    "snmp.listen.address":"",
    "snmp.listen.port":"",
    "auth.password":"",
    "privacy.password":"",
    "security.name":"",
    "confluent.topic.bootstrap.servers":"localhost:9092",
    "confluent.topic.replication.factor":"1"
  }
}
```

Use `curl` to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect worker(s).

```bash
curl -sS -X POST -H 'Content-Type: application/json' --data @snmp-trap-source-config.json http://localhost:8083/connectors
```

Use the following command to update the configuration of an existing connector:

```bash
curl -s -X PUT -H 'Content-Type: application/json' --data @snmp-trap-source-config.json http://localhost:8083/connectors/SnmpTrapSourceConnector/config
```

Check that the connector started successfully. Review the Connect worker’s log by entering the following:

```bash
confluent local services connect log
```

The SNMP device should be running and generating PDUs. The connector listens for PDUs of type trap and pushes them to the Kafka topic.

## JSON Schemaless Source Connector Example

This example follows the same steps as the Quick Start. Review the Quick Start for help running the Confluent Platform and installing the Spool Dir connectors.

1.
Generate a JSON dataset using the command below: ```bash curl "https://api.mockaroo.com/api/17c84440?count=500&key=25fd9c80" > "json-spooldir-source.json" ``` 2. Create a `spooldir.properties` file with the following contents: ```properties name=SchemaLessJsonSpoolDir tasks.max=1 connector.class=com.github.jcustenborder.kafka.connect.spooldir.SpoolDirSchemaLessJsonSourceConnector input.path=/path/to/data input.file.pattern=json-spooldir-source.json error.path=/path/to/error finished.path=/path/to/finished halt.on.error=false topic=spooldir-schemaless-json-topic value.converter=org.apache.kafka.connect.storage.StringConverter ``` 3. Load the SpoolDir Schemaless JSON Source connector. ```bash confluent local load spooldir --config spooldir.properties ``` #### IMPORTANT Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) in production environments. #### Client-side OAuth client assertion for Kafka and KRaft For each of the Confluent component (currently in Confluent Platform 8.0, Schema Registry only) to authenticate with Kafka or KRaft using OAuth client assertion, configure the client-side OAuth client assertion in the component CR as below. For KRaft, the `authentication` object is under `dependencies.kRaftController.controllerListener.authentication`. To set up client assertion, first, you must complete the [client-side OAuth configuration](#co-authenticate-kafka-client-oauth). For client assertion, configure the additional properties on top of the existing OAuth configurations: ```yaml kind: spec: dependencies: : authentication: type: oauth --- [1] oauthSettings: clientAssertion: --- [2] ``` * [1] Required. * [2] See [the client assertion properties](#co-authenticate-client-assertion-settings) for a list of properties you can use. The following is a sample snippet of Schema Registry to authenticate with Kafka using local client assertion: ```yaml kind: SchemaRegistry spec: dependencies: kafka: bootstrapEndpoint: kafka.operator.svc.cluster.local:9071 authentication: type: oauth oauthSettings: tokenEndpointUri: http://keycloak:8080/realms/sso_test/protocol/openid-connect/token clientAssertion: clientId: private-key-client ## Configure for Kubernetes Horizontal Pod Autoscaler In Kubernetes, the [Horizontal Pod Autoscaler (HPA)](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) feature automatically scales the number of pod replicas. Starting in Confluent for Kubernetes (CFK) 2.1.0, you can configure Confluent Platform components to use HPA based on CPU and memory utilization of Confluent Platform pods. HPA is not supported for ZooKeeper and Control Center. To use HPA with a Confluent Platform component, create an HPA resource for the component custom resource (CR) out of band to integrate with CFK. The following example is to create an HPA resource for Connect based on CPU utilization and memory usage: ```yaml apiVersion: autoscaling/v2beta2 kind: HorizontalPodAutoscaler metadata: name: connect-cluster-hpa namespace: confluent spec: scaleTargetRef: --- [1] apiVersion: platform.confluent.io/v1beta1 --- [2] kind: Connect --- [3] name: connect --- [4] minReplicas: 2 --- [5] maxReplicas: 4 --- [6] metrics: - type: Resource resource: name: cpu target: type: Utilization averageUtilization: 50 --- [7] - type: Resource resource: name: memory targetAverageValue: 1000Mi --- [8] ``` * [1] Required. Specify the Confluent component-specific information in this section. * [2] Required. CFK API version. * [3] Required. 
The CR kind of the object to scale. * [4] Required. The CR name of the object to scale. * [5] The minimum number of replicas when scaling down. If your Kafka default replication factor is N, the `minReplicas` on your HPA for your Kafka cluster must be >= N. If you want Schema Registry, Connect, ksqlDB to be HA, set `minReplicas` >= 2 * [6] The maximum number of replicas when scaling up. * [7] The target average CPU utilization of 50%. * [8] The target average memory usage value of 1000 Mi. Take the following into further consideration when setting up HPA for Confluent Platform: * If you have `oneReplicaPerNode` set to `true` for Kafka (which is the default), your upper bound for Kafka brokers is the number of available Kubernetes worker nodes you have. * If you have affinity or taint/toleration rules set for Kafka, that further constrains the available nodes. * If your underlying Kubernetes cluster doesn’t itself support autoscaling of the Kubernetes worker nodes, make sure there is enough Kubernetes worker nodes to allow HPA is successful. You can check the current status of HPA by running: ```bash kubectl get hpa ``` # Configure Replicator for Confluent Platform Using Confluent for Kubernetes Confluent Replicator allows you to replicate topics from one Apache Kafka® cluster to another. In addition to copying the messages, Replicator will create topics as needed, preserving the topic configuration in the source cluster. This includes preserving the number of partitions, the replication factor, and any configuration overrides specified for individual topics. Confluent Replicator is built as a connector. So, when you deploy Replicator in Confluent for Kubernetes, you use the Connect CRD to define a custom resource (CR) for Replicator and specify to use the `cp-enterprise-replicator` Docker image that contains the Replicator JARs. For example: ```yaml apiVersion: platform.confluent.io/v1beta1 kind: Connect metadata: name: replicator namespace: destination spec: replicas: 2 image: application: confluentinc/cp-enterprise-replicator:8.1.0 init: confluentinc/confluent-init-container:3.1.0 ``` This is a change from Confluent Operator 1.x, where Replicator had a Helm sub-Chart and a section in the `values.yaml` for configuration. See the [comprehensive example for configuring and deploying Confluent Replicator](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/hybrid/replicator) for the detailed steps and an example CR. ## Configure and deploy Unified Stream Manager 1. Configure **Unified Stream Manager Agent** using the USMAgent custom resource (CR), and then apply the CR using the `kubectl apply` command. ```yaml kind: USMAgent spec: replicas: image: application: --- [1] init: --- [2] authentication: type: --- [3] basic: secretRef: --- [4] tls: secretRef: --- [5] confluentCloudClient: endpoint: --- [6] environmentId: --- [7] authentication: type: --- [8] basic: secretRef: --- [9] externalAccess: --- [10] type: --- [11] loadBalancer: --- [12] nodePort: --- [13] ``` * [1] Set to the Unified Stream Manager application image. * [2] Set to the Unified Stream Manager CFK init container image. * [3] Set to `basic` or `mtls`. * [4] Required for basic authentication. Specify the secret containing the basic authentication credentials. * [5] For TLS between Unified Stream Manager Agent and Confluent Platform components, specify the secret containing the TLS certificate, the key, and the root certificate authority (CA) files. * [6] Specify the Confluent Cloud endpoint. 
The Confluent Cloud endpoint is available in the output file generated when you perform the first step in the registration process. See [Generate the configuration file](https://docs.confluent.io/cloud/current/usm/register/deploy-agent.html#generate-and-download-the-configuration-file). This step has to be completed before you deploy Unified Stream Manager Agent.
* [7] Specify the Confluent Cloud Environment ID. The Environment ID is available in the output file generated when you perform the first step in the registration process. See [Generate the configuration file](https://docs.confluent.io/cloud/current/usm/register/deploy-agent.html#generate-and-download-the-configuration-file). This step has to be completed before you deploy Unified Stream Manager Agent.
* [8] Set to `basic` for basic authentication.
* [9] Required for basic authentication. Specify the secret containing the Cloud API key and secret. The values are available in the output file generated when you perform the first step in the registration process. See [Generate the configuration file](https://docs.confluent.io/cloud/current/usm/register/deploy-agent.html#generate-and-download-the-configuration-file). This step has to be completed before you deploy Unified Stream Manager Agent.
* [10] Optional. External access is optional for Unified Stream Manager Agent.
* [11] Set to `loadBalancer` or `nodePort` to specify the external access type.
* [12] Required when the externalAccess type ([11]) is set to `loadBalancer`. For configuring load balancers, see [Configure Load Balancers for Confluent Platform in Confluent for Kubernetes](co-loadbalancers.md#co-loadbalancers).
* [13] Required when the externalAccess type ([11]) is set to `nodePort`. For configuring node ports, see [Configure Node Ports to Access Confluent Platform Components Using Confluent for Kubernetes](co-nodeports.md#co-nodeports).

2. Configure the **client-side properties** in Kafka, KRaft, and Connect for communication with Unified Stream Manager Agent, and then apply the changes to the CRs with the `kubectl apply` command.

   ```yaml
   spec:
     dependencies:
       usmAgentClient:
         url:                --- [1]
         authentication:
           type:             --- [2]
           basic:
             secretRef:      --- [3]
             dpic:           --- [4]
         tls:
           enabled:          --- [5]
           secretRef:        --- [6]
           dpic:             --- [7]
   ```

   * [1] Specify the Unified Stream Manager Agent URL.
   * [2] Set to `basic` or `mtls` to specify the authentication type. See [Basic authentication credentials](co-authenticate-cp.md#co-basic-server-creds) for the required format.
   * [3] Specify the secret containing the basic authentication credentials.
   * [4] Specify the basic authentication credential secret path in the container. For details, see [Provide secrets in HashiCorp Vault](co-credentials.md#co-directory-path-in-container).
   * [5] Set to `true` or `false` to enable or disable TLS.
   * [6] Specify the secret containing the TLS certificate.
   * [7] Specify the TLS certificate secret path in the container. For details, see [Provide secrets in HashiCorp Vault](co-credentials.md#co-directory-path-in-container).

3. [Register your Confluent Platform Connect cluster in Confluent Cloud](http://docs.confluent.io/cloud/current/usm/register-connect.html). You can use the following options to retrieve the Connect cluster ID (also known as the group ID) that is required to register the Connect cluster in Confluent Cloud:

   * Use the `kubectl describe connect` command, and fetch the Group ID under the `Status` section.
* If you have the Confluent CLI installed, you can use the command `confluent connect cluster list` as described in the above registration topic. ### Configure the source-initiated cluster link on the source cluster For a source-initiated cluster link, configure the cluster information in the Source mode ClusterLink CR: ```yaml spec: sourceInitiatedLink: linkMode: Source --- [1] destinationKafkaCluster: bootstrapEndpoint: --- [2] clusterID: --- [3] kafkaRestClassRef: --- [4] name: --- [5] namespace: --- [6] sourceKafkaCluster: kafkaRestClassRef: --- [7] name: --- [8] namespace: --- [9] configs: --- [10] local.security.protocol: --- [11] local.listener.name: --- [12] ``` * [1] Required. * [2] Required. The bootstrap endpoint where the destination Kafka is running. * [3] The cluster ID of the destination Kafka cluster. If both `clusterID` and Kafka REST class name ([5]) are specified, this `clusterID` value takes precedence over the Kafka REST class name ([5]). You can get the cluster ID using the `curl` or `kafka-cluster` command with the proper flags. For example: ```bash curl https://:8090/kafka/v3/clusters/ -kv ``` ```bash kafka-cluster cluster-id --bootstrap-server kafka.operator.svc.cluster.local:9092 \ --config /tmp/kafka.properties ``` * [4] Optional. The reference to the KafkaRestClass application custom resource (CR) which defines the Kafka REST Class connection information. * [5] Required under [4]. The name of the [KafkaRestClass CR](co-manage-rest-api.md#co-manage-rest-api) on the destination cluster. * [6] Optional. The namespace of the KafkaRestClass CR. If omitted, the same namespace as this CR is assumed. * [7] Required. The reference to the KafkaRestClass application custom resource (CR) which defines the Kafka REST Class connection information. * [8] Required. The name of the [KafkaRestClass CR](co-manage-rest-api.md#co-manage-rest-api) on the source Kafka cluster. * [9] Optional. The namespace of the KafkaRestClass CR. If omitted, the same namespace as this CR is assumed. * [10] Use to specify additional configurations for the cluster link. * [11] SSL is required when using mTLS or SASL authentication in an RBAC-enabled cluster. : In all other cases, it is optional. Set to `SSL` for mTLS and `SASL_SSL` for SASL authentication. * [12] An SSL listener name is required when using mTLS authentication in an RBAC-enabled cluster. In all other cases it is optional. #### NOTE When RBAC is enabled in this Confluent Platform environment, the super user you configured for Kafka (`kafka.spec.authorization.superUsers`) does not have access to resources in the Schema Registry cluster. If you want the super user to be able to create schema exporters, grant the super user the permission on the Schema Registry cluster. In the source Schema Registry clusters, create a schema exporter CR and apply the configuration with the `kubectl apply -f ` command: ```yaml apiVersion: platform.confluent.io/v1beta1 kind: SchemaExporter metadata: name: --- [1] namespace: --- [2] spec: sourceCluster: --- [3] destinationCluster: --- [4] subjects: --- [5] subjectRenameFormat: --- [6] contextType: --- [7] contextName: --- [8] configs: --- [9] ``` * [1] Required. The name of the schema exporter. The name must be unique in a source Schema Registry cluster. * [2] The namespace for the schema exporter. * [3] The source Schema Registry cluster. You can either specify the cluster name or the endpoint. If not given, CFK will auto discover the source Schema Registry in the namespace of this schema exporter. 
The discovery process errors out if more than one Schema Registry cluster is discovered in the namespace. See [Specify the source and destination Schema Registry clusters](#co-schema-exporter-discover) for configuration details.
* [4] The destination Schema Registry cluster where the schemas will be exported. If not defined, the source cluster is used as the destination, and the schema exporter exports schemas across contexts within the source cluster. See [Specify the source and destination Schema Registry clusters](#co-schema-exporter-discover) for configuration details.
* [5] The subjects to export to the destination. The default value is `["*"]`, which denotes all subjects in the default context.
* [6] The rename format that defines how to rename the subject at the destination. For example, if the value is `my-${subject}`, subjects at the destination become `my-XXX`, where `XXX` is the original subject.
* [7] Specify how to create the context for the exported subjects at the destination. The default value is `AUTO`, in which case the exporter uses an auto-generated context in the destination cluster. The auto-generated context name is reported in the status. If set to `NONE`, the exporter copies the source schemas as-is.
* [8] The name of the schema context on the destination to export the subjects to. If this is defined, `spec.contextType` is ignored.
* [9] Additional configs not supported by the SchemaExporter CRD properties.

An example SchemaExporter CR:

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: SchemaExporter
metadata:
  name: schema-exporter
  namespace: confluent
spec:
  sourceCluster:
    schemaRegistryClusterRef:
      name: sr
      namespace: operator
  destinationCluster:
    schemaRegistryRest:
      endpoint: https://schemaregistry.operator-dest.svc.cluster.local:8081
      authentication:
        type: basic
        secretRef: sr-basic
  subjects:
    - subject1
    - subject2
  contextName: link-source
```

#### Discover Schema Registry using Schema Registry endpoint

To specify how to connect to the Schema Registry endpoint, specify the connection information in the Schema CR.

**Schema Registry endpoint**

```yaml
spec:
  schemaRegistryRest:
    endpoint:          --- [1]
    authentication:
      type:            --- [2]
```

* [1] The endpoint where Schema Registry is running.
* [2] The authentication method to use for the Schema Registry cluster. Supported types are `basic`, `mtls`, `bearer`, and `oauth`. You can use `bearer` when RBAC is enabled for Schema Registry.

**Basic authentication to Schema Registry**

```yaml
spec:
  schemaRegistryRest:
    authentication:
      type: basic                    --- [1]
      basic:
        secretRef:                   --- [2]
        directoryPathInContainer:    --- [3]
```

* [1] Required for the basic authentication type.
* [2] or [3] is required.
* [2] The name of the secret that contains the credentials. See [Basic authentication](co-authenticate-cp.md#co-authenticate-cp-basic) for the required format.
* [3] The directory path in the container where the required credentials are injected by Vault. See [Basic authentication](co-authenticate-cp.md#co-authenticate-cp-basic) for the required format.

  See [Provide secrets for Confluent Platform application CR](co-credentials.md#co-vault-category-2) for providing the credentials and required annotations when using Vault.

**mTLS authentication to Schema Registry**

```yaml
spec:
  schemaRegistryRest:
    authentication:
      type: mtls                     --- [1]
      tls:
        secretRef:                   --- [2]
        directoryPathInContainer:    --- [3]
```

* [1] Required for the mTLS authentication type.
* [2] The name of the secret that contains the TLS certificates.
See [Provide TLS keys and certificates in PEM format](co-network-encryption.md#co-certs-pem) for the expected keys in the TLS secret. Only the PEM format is supported for Schema CRs. * [3] The directory path in the container where the expected keys and certificates are mounted. See [Provide TLS keys and certificates in PEM format](co-network-encryption.md#co-certs-pem) for the expected keys in the TLS secret. Only the PEM format is supported for Schema CRs. See [Provide secrets for Confluent Platform application CR](co-credentials.md#co-vault-category-2) for providing the keys and certificates using the Directory Path in Container feature. **Bearer authentication to Schema Registry (for RBAC)** When RBAC is enabled for Schema Registry, you can configure bearer authentication as below: ```yaml spec: schemaRegistryRest: authentication: type: bearer --- [1] bearer: secretRef: --- [2] directoryPathInContainer: --- [3] ``` * [1] Required for the bearer authentication type. * [2] or [3] is required. * [2] Required. The name of the secret that contains the bearer credentials. See [Bearer authentication](co-authenticate-kafka.md#co-authenticate-mds-bearer) for the required format. * [3] The directory path in the container where the required the bearer credentials are mounted. See [Bearer authentication](co-authenticate-kafka.md#co-authenticate-mds-bearer) for the required format. See [Provide secrets for Confluent Platform application CR](co-credentials.md#co-vault-category-2) for providing the credential using the Directory Path in Container feature. **OAuth authorization and authentication to Schema Registry** ```yaml schemaRegistryRest: authentication: type: oauth --- [1] oauth: secretRef: --- [2] directoryPathInContainer: --- [3] configuration: --- [4] ``` * [1] Required for OAuth. * [2] or [3] is required. * [2] The name of the secret that contains the bearer credentials. See [Bearer authentication](co-authenticate-kafka.md#co-authenticate-mds-bearer) for the required format. * [3] Set to the directory path in the container where required authentication credentials are injected by Vault. See [Bearer authentication](co-authenticate-kafka.md#co-authenticate-mds-bearer) for the required format. See [Provide secrets for Confluent Platform application CR](co-credentials.md#co-vault-category-2) for providing the credential and required annotations when using Vault. * [4] The client-side OAuth configuration. For details, see [Client-side OAuth/OIDC authentication for Confluent components](co-authenticate-cp.md#co-authenticate-cp-client-oauth). # Configure Network Encryption for Confluent Platform Using Confluent for Kubernetes This document describes how to configure network encryption with Confluent for Kubernetes (CFK). For security concepts in Confluent Platform, see [Security](/platform/current/security/index.html). To secure network communications of Confluent components, CFK supports Transport Layer Security (TLS), an industry-standard encryption protocol. TLS relies on keys and certificates to establish trusted connections. This section describes how to manage keys and certificates when you configure TLS encryption for Confluent Platform. CFK supports the following mechanisms to enable TLS encryption: [Auto-generated certificates](#co-configure-auto-certificates) : CFK auto-generates the server certificates, using the certificate authority (CA) that you provide. If all access and communication to Confluent services is within the Kubernetes network, auto-generated certificates are recommended. 
[User-provided certificates](#co-configure-user-provided-certificates) : You provide the private key, public key, and CA. If you need to enable access to Confluent services from an external-to-Kubernetes domain, user-provided certificates are recommended.

[Separate certificates for internal and external communications](#co-configure-separate-certificates) : You provide separate TLS certificates for the internal and external communications so that you do not mix external and internal domains in the certificate SAN. This feature is supported for ksqlDB, Schema Registry, MDS, and Kafka REST services, starting in the CFK 2.6.0 and Confluent Platform 7.4.0 releases.

[Dynamic Kafka certificate updates](#co-dynamic-certificates-update) : When you rotate certificates by providing new server certificates, CFK automatically updates the configurations to use those new certificates. By default, this update triggers a rolling restart of the affected Confluent Platform pod. To minimize disruption during rolling restarts of Kafka brokers, you can enable dynamic certificate loading for the Kafka and Kafka REST services. CFK then updates TLS private keys and certificates without rolling the Kafka cluster. This feature is only supported at the individual listener level.

### Define SAN

The certificate must have a Subject Alternative Name (SAN) list, and the SAN list must be properly defined and cover all hostnames that the Confluent component will be accessed on:

* If TLS for internal communication network encryption is enabled, include the internal network, `<component>.<namespace>.svc.cluster.local`, in the SAN list.
* If TLS for external network communication is enabled, include the external domain name in the SAN list.

The following are the internal and external SANs of each Confluent component that need to be included in the component certificate SAN. The examples use the default component prefixes.

Kafka :
* Internal bootstrap access SAN: `<component>.<namespace>.svc.cluster.local`
  * Example: `kafka.confluent.svc.cluster.local`
* Internal access SAN: `<component>-<podId>.<component>.<namespace>.svc.cluster.local`, where `<podId>` is the ordinal number of brokers, 0 to (number of brokers - 1).
  * Example: `kafka-0.kafka.confluent.svc.cluster.local`
  * The range can be handled through a wildcard domain, for example, `*.kafka.confluent.svc.cluster.local`.
* External bootstrap domain SAN: `<bootstrap-prefix>.<my-external-domain>`
  * Example: `kafka-bootstrap.acme.com`
* External broker SAN: `<broker-prefix><podId>.<my-external-domain>`
  * Example: `b0.acme.com`
  * The range can be handled through a wildcard domain, for example, `*.acme.com`

MDS :
* Internal access SAN: `<component>-<podId>.<component>.<namespace>.svc.cluster.local`, where `<podId>` is the ordinal number of brokers, 0 to (number of brokers - 1).
  * Example: `kafka-0.kafka.confluent.svc.cluster.local`
* External domain SAN: `<prefix>.<my-external-domain>`
  * Example: `mds.my-external-domain`

KRaft :
* Internal bootstrap access SAN: `<component>.<namespace>.svc.cluster.local`
* Internal access SAN: `<component>-<podId>.<component>.<namespace>.svc.cluster.local`, where `<podId>` is the ordinal number of the KRaft controller, 0 to (number of servers - 1).
  * Example: `kraftcontroller-0.kraftcontroller.confluent.svc.cluster.local`
* External domain SAN: `<prefix>.<my-external-domain>`

ZooKeeper :
* Internal bootstrap access SAN: `<component>.<namespace>.svc.cluster.local`
* Internal access SAN: `<component>-<podId>.<component>.<namespace>.svc.cluster.local`, where `<podId>` is the ordinal number of ZooKeeper servers, 0 to (number of servers - 1).
  * Example: `zookeeper-0.zookeeper.confluent.svc.cluster.local`
* No external access domain

#### IMPORTANT

Starting with Confluent Platform version 8.0, ZooKeeper is no longer part of Confluent Platform.
Schema Registry :
* Internal bootstrap access SAN: `<component>.<namespace>.svc.cluster.local`
* Internal access SAN: `<component>-<podId>.<component>.<namespace>.svc.cluster.local`, where `<podId>` is the ordinal number of Schema Registry servers, 0 to (number of servers - 1).
  * Example: `schemaregistry-0.schemaregistry.confluent.svc.cluster.local`
* External domain SAN: `<prefix>.<my-external-domain>`

REST Proxy :
* Internal access SAN: `<component>-<podId>.<component>.<namespace>.svc.cluster.local`, where `<podId>` is the ordinal number of REST Proxy servers, 0 to (number of servers - 1).
  * Example: `kafkarestproxy-0.kafkarestproxy.confluent.svc.cluster.local`
* External domain SAN: `<prefix>.<my-external-domain>`

Connect :
* Internal bootstrap access SAN: `<component>.<namespace>.svc.cluster.local`
* Internal SAN: `<component>-<podId>.<component>.<namespace>.svc.cluster.local`, where `<podId>` is the ordinal number of Connect servers, 0 to (number of servers - 1).
  * Example: `connect-0.connect.confluent.svc.cluster.local`
* External domain SAN: `<prefix>.<my-external-domain>`

ksqlDB :
* Internal bootstrap access SAN: `<component>.<namespace>.svc.cluster.local`
* Internal access SAN: `<component>-<podId>.<component>.<namespace>.svc.cluster.local`, where `<podId>` is the ordinal number of ksqlDB servers, 0 to (number of servers - 1).
  * Example: `ksqldb-0.ksqldb.confluent.svc.cluster.local`
* External domain SAN: `<prefix>.<my-external-domain>`

Control Center (Legacy) :
* Internal bootstrap access SAN: `<component>.<namespace>.svc.cluster.local`
* Internal access SAN: `<component>-0.<component>.<namespace>.svc.cluster.local`
  * Example: `controlcenter-0.controlcenter.confluent.svc.cluster.local`
* External domain SAN: `<prefix>.<my-external-domain>`

For an example of how to create certificates with appropriate SAN configurations, see the [Create your own certificates tutorial](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/security/production-secure-deploy#appendix-create-your-own-certificates).

### Migrate RBAC from OAuth or LDAP-based to dual authentication with mTLS

This section describes how to migrate an OAuth or LDAP-based RBAC deployment to use mTLS as one of the dual authentication methods.

1. Add a custom listener with dual OAuth or LDAP and mTLS authentication in Kafka. This listener will be used for Confluent Platform components to Kafka communications while the internal listener gets updated during migration.

   If migrating from LDAP to dual LDAP and mTLS, set the listener authentication type to `bearer`. If migrating from OAuth to dual OAuth and mTLS, set the listener authentication type to `oauth`.

   ```yaml
   kind: Kafka
   spec:
     listeners:
       custom:
       - name: customoauth
         port: 9093
         authentication:
           type: oauth
           oauthSettings: # If type: oauth above
             tokenEndpointUri:
             expectedIssuer:
             jwksEndpointUri:
             subClaimName: client_id
           mtls:
             sslClientAuthentication: "required"
             principalMappingRules:
             - "RULE:.*CN=([a-zA-Z0-9.-]*).*$/$1/"
             - "DEFAULT"
         tls:
           enabled: true
   ```

2. Update all the Confluent Platform component CRs (Schema Registry, Connect, REST Proxy, Control Center) and the admin KafkaRestClass CR to enable `sslClientAuthentication` from the client side, and to update their Kafka dependency endpoint to communicate on the custom listener created in Step 1. All of these changes must be made in the same step.

   Note that the MDS authentication type and the Kafka authentication type should be the same since MDS exists on the Kafka cluster.
   * For OAuth-based RBAC, `oauth` for Kafka and MDS
   * For LDAP-based RBAC, `oauthbearer` for Kafka and `bearer` for MDS

   * Update the Confluent Platform components:

     ```yaml
     kind:
     spec:
       dependencies:
         kafka:
           bootstrapEndpoint: kafka.confluent.svc.cluster.local:9093
           authentication:
             type: <oauth or oauthbearer>
             sslClientAuthentication: true
           tls:
             enabled: true
         mds:
           endpoint: https://kafka.confluent.svc.cluster.local:8090
           tokenKeyPair:
             secretRef: mds-token
           authentication:
             type: <oauth or bearer>
             sslClientAuthentication: true
           tls:
             enabled: true
     ```

   * Update the `kafkaRest` dependency in the Kafka CR to enable the client-side mTLS in the embedded REST Proxy.

     ```yaml
     kind: Kafka
     spec:
       dependencies:
         kafkaRest:
           authentication:
             type: <oauth or bearer>
             sslClientAuthentication: true
           tls:
             enabled: true
             secretRef: tls-kafka
     ```

3. Update the KafkaRestClass CR, which is required to create role bindings.

   ```yaml
   kind: KafkaRestClass
   spec:
     kafkaRest:
       endpoint: https://kafka.confluent.svc.cluster.local:8090
       authentication:
         type: <oauth or bearer>
         sslClientAuthentication: true
       tls:
         secretRef: tls-kafka
   ```

4. Add the mTLS provider in the MDS service in parallel to the already existing OAuth or LDAP provider. This enables dual authentication with OAuth or LDAP and mTLS.

   ```yaml
   kind: Kafka
   spec:
     services:
       mds:
         provider:
           mtls:
             sslClientAuthentication: <"required" or "requested">
             principalMappingRules:
             configurations:
   ```

5. Add the mTLS authentication section in the Schema Registry, Connect, and REST Proxy CRs to support dual authentication.

   ```yaml
   kind:
   spec:
     authentication:
       mtls:
         sslClientAuthentication: "required"
         principalMappingRules:
       oauth: # If migrating to OAuth+mTLS
   ```

# Configure Host-Based Static Access to Confluent Platform Components Using Confluent for Kubernetes

When you configure Kafka for host-based static access, the Kafka advertised listeners are set up with the broker prefix and the domain name. This method does not create any Kubernetes resources, and you need to explicitly configure external access to Kafka, for example, using the NGINX Ingress controller.

This method requires:
* Kafka is configured with TLS.
* An Ingress controller that supports SSL passthrough is used.

**To configure external access to Kafka using static host-based routing:**

1. Configure and deploy Kafka with the `staticForHostBasedRouting` access type.

   ```yaml
   listeners:
     external:
       externalAccess:
         type: staticForHostBasedRouting
         staticForHostBasedRouting:
           port: --- [1]
           domain: --- [2]
           brokerPrefix: --- [3]
   ```

   * [1] Required. The `port` to be used in the advertised listener for a broker. Set it to `443` to support SNI capabilities. If you change this value on a running cluster, you must roll the cluster.
   * [2] Required. `domain` will be configured as part of the Kafka advertised listener. If you change this value on a running cluster, you must roll the cluster.
   * [3] Optional. Use `brokerPrefix` to change the default Kafka broker prefix. The default Kafka broker prefix is `b`. These are used for DNS entries. The broker DNS names become `<brokerPrefix>0.<domain>`, `<brokerPrefix>1.<domain>`, and so on. If not set, the default broker DNS names are `b0.<domain>`, `b1.<domain>`, and so on.

     For example, the following are Kafka advertised listeners for three Kafka brokers with `port: 443` and `domain: example.com`:
     * `b0.example.com:443`
     * `b1.example.com:443`
     * `b2.example.com:443`

     If you change this value on a running cluster, you must roll the cluster.

2. Deploy an Ingress controller, such as [ingress-nginx](https://kubernetes.github.io/ingress-nginx/deploy).
   For a list of available controllers, see [Ingress controllers](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers). Your Ingress controller must support SSL passthrough that intercepts all traffic on the configured HTTPS port (default is 443) and hands it over to the Kafka TCP proxy.

   The following example shows a Helm command that installs the NGINX Ingress controller with SSL passthrough enabled:

   ```bash
   helm upgrade --install ingress-nginx ingress-nginx \
     --repo https://kubernetes.github.io/ingress-nginx \
     --set controller.publishService.enabled=true \
     --set controller.extraArgs.enable-ssl-passthrough="true"
   ```

3. Configure the DNS addresses for Kafka brokers to point to the Ingress controller.

   You need the following to derive Kafka DNS entries:
   * The `domain` name you provided in the configuration file in Step #1
   * The external IP of the Ingress controller load balancer

     You can retrieve the external IP using the following command:

     ```bash
     kubectl get services -n <namespace>
     ```
   * The Kafka `brokerPrefix` you provided in the configuration file in Step #1

   The following example shows the DNS table entries using:
   * Domain: `example.com`
   * Three broker replicas with the default prefix/replica numbers: `b`

   ```none
   DNS name           ExternalIP
   b0.example.com     34.71.198.214
   b1.example.com     34.71.198.214
   b2.example.com     34.71.198.214
   ```

4. Create an [Ingress resource](https://kubernetes.io/docs/concepts/services-networking/ingress/#the-ingress-resource) that includes a collection of rules the Ingress controller uses to route the inbound traffic to Kafka.

   Ingress uses annotations to configure some options depending on the Ingress controller, an example of which is the [rewrite-target annotation](https://github.com/kubernetes/ingress-nginx/blob/master/docs/examples/rewrite/README.md). Review the documentation for your Ingress controller to learn which annotations are supported. For details on deploying the NGINX controller and configuring an Ingress resource, refer to [this tutorial](https://cloud.google.com/community/tutorials/nginx-ingress-gke).

   The following example creates an Ingress resource for the NGINX Ingress controller. The resource exposes three Kafka brokers:

   ```yaml
   apiVersion: networking.k8s.io/v1
   kind: Ingress
   metadata:
     name: ingress-with-sni
     namespace: confluent
     annotations:
       nginx.ingress.kubernetes.io/ssl-passthrough: "true" ---[1]
       nginx.ingress.kubernetes.io/ssl-redirect: "false" ---[2]
       nginx.ingress.kubernetes.io/backend-protocol: HTTPS ---[3]
       ingress.kubernetes.io/ssl-passthrough: "true" ---[4]
       kubernetes.io/ingress.class: nginx ---[5]
   spec:
     tls:
     - hosts:
       - demo0.example.com
       - demo1.example.com
       - demo2.example.com
       - demo.example.com
     rules:
     - host: demo0.example.com
       http:
         paths:
         - path: /
           pathType: Prefix
           backend:
             service:
               name: kafka-0-internal
               port:
                 number: 9092
     - host: demo1.example.com
       http:
         paths:
         - path: /
           pathType: Prefix
           backend:
             service:
               name: kafka-1-internal
               port:
                 number: 9092
     - host: demo2.example.com
       http:
         paths:
         - path: /
           pathType: Prefix
           backend:
             service:
               name: kafka-2-internal
               port:
                 number: 9092
   ```

   * Annotation [1] instructs the controller to send TLS connections directly to the backend instead of letting NGINX decrypt the communication.
   * Annotation [2] disables the default value.
   * Annotation [3] indicates how NGINX should communicate with the backend service.
   * Annotation [4] `ssl-passthrough` is required.
   * Annotation [5] uses the NGINX controller.
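Once the DNS entries resolve to the Ingress controller's external IP, you can spot-check that SSL passthrough and SNI-based routing are working before pointing Kafka clients at the brokers. The following is a quick sanity check only, using the example host `b0.example.com` from the table above; substitute your own broker DNS names.

```bash
# Confirm the broker DNS name resolves to the Ingress controller's external IP.
nslookup b0.example.com

# Open a TLS connection through the Ingress controller using SNI.
# With SSL passthrough working, the certificate returned is the Kafka broker's
# certificate, so its SAN list should include the external broker names.
openssl s_client -connect b0.example.com:443 -servername b0.example.com </dev/null 2>/dev/null \
  | openssl x509 -noout -text | grep -A 1 "Subject Alternative Name"
```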
For a tutorial scenario on configuring external access using host-based static access, see the [quickstart tutorial for host-based static access](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/networking/external-access-static-host-based).

### Issue: JAAS class path discrepancy between CFK 3.0 and Confluent Platform 7.x

CFK 3.0 defaults to Confluent Platform 8.x behavior, including Jetty 12 support. Specifically, CFK 3.0.0 and higher uses the Confluent Platform 8.0 JAAS class path (`org.eclipse.jetty.security.jaas.spi.PropertyFileLoginModule`) instead of the Confluent Platform 7.x class path (`org.eclipse.jetty.jaas.spi.PropertyFileLoginModule`).

**Solution:** To use the JAAS class path compatible with Confluent Platform 7.x, add the annotation `platform.confluent.io/use-old-jetty9: "true"` to the Confluent Platform component CRs that expose REST API endpoints and have authentication enabled on those endpoints, such as Control Center, Control Center (Legacy), Schema Registry, Connect, ksqlDB, and REST Proxy.

```yaml
apiVersion: platform.confluent.io/v1beta1
kind:
metadata:
  name: controlcenter
  annotations:
    platform.confluent.io/use-old-jetty9: "true"
```

When you upgrade to Confluent Platform 8.0 or higher, remove the above annotation.

# Kafka Producer for Confluent Platform

An Apache Kafka® Producer is a client application that publishes (writes) events to a Kafka cluster. This section gives an overview of the Kafka producer and an introduction to the configuration settings for tuning. The Kafka producer is conceptually much simpler than the consumer since it does not need group coordination. A producer **partitioner** maps each message to a topic partition, and the producer sends a produce request to the leader of that partition. The partitioners shipped with Kafka guarantee that all messages with the same non-empty key will be sent to the same partition.

# Manage Clusters in Confluent Platform

Confluent Platform provides features to help you manage Apache Kafka® cluster rebalancing and cost. The following topics are included in this section:

- [Metadata Management of Kafka in Confluent Platform](../kafka-metadata/overview.md#zk-or-kraft) - Discusses the options for metadata storage and leader elections for a cluster.
- [Manage Self-Balancing Kafka Clusters in Confluent Platform](sbc/index.md#sbc) - With this feature enabled, a cluster automatically rebalances partitions across brokers when new brokers are added or existing brokers are removed.
- [Quick Start for Auto Data Balancing in Confluent Platform](rebalancer/quickstart.md#rebalancer) - A tool that balances data so that the number of leaders and disk usage are even across brokers and racks on a per topic and cluster level while minimizing data movement. [Manage Self-Balancing Kafka Clusters in Confluent Platform](sbc/index.md#sbc) is the preferred alternative to [Quick Start for Auto Data Balancing in Confluent Platform](rebalancer/quickstart.md#rebalancer).
- [Tiered Storage in Confluent Platform](tiered-storage.md#tiered-storage) - A feature that helps make storing huge volumes of data in Kafka manageable by reducing operational burden and cost.

If you’re just getting started with Confluent Platform and Kafka, see also [Learn More About Confluent Products and Kafka](../get-started/kafka-basics.md#ak-basics) and [Quick Start for Confluent Platform](../get-started/platform-quickstart.md#quickstart).
## Configuring and starting controllers and brokers in KRaft mode

As of Confluent Platform 8.0, ZooKeeper is no longer available. Confluent recommends migrating to [KRaft mode](https://docs.confluent.io/platform/current/installation/installing_cp/zip-tar.html#kraft-mode). To learn more about running Kafka in KRaft mode, see [KRaft Overview for Confluent Platform](../../kafka-metadata/kraft.md#kraft-overview), [KRaft Configuration for Confluent Platform](../../kafka-metadata/config-kraft.md#configure-kraft), and the [Platform Quick Start](../../get-started/platform-quickstart.md#cp-quickstart-step-1). To learn about migrating from older versions, see [Migrate from ZooKeeper to KRaft on Confluent Platform](../../installation/migrate-zk-kraft.md#migrate-zk-kraft).

This tutorial provides examples for KRaft mode only. Earlier versions of this documentation (such as [version 7.9](https://docs.confluent.io/platform/7.9/clusters/sbc/sbc-tutorial.html)) provide examples for both KRaft and ZooKeeper.

The examples show an *isolated mode* configuration for a multi-broker cluster managed by a single controller. As shown in the steps below, you will use `$CONFLUENT_HOME/etc/kafka/broker.properties` and `$CONFLUENT_HOME/etc/kafka/controller.properties` as the basis to create a controller (`$CONFLUENT_HOME/etc/kafka/controller-sbc.properties`) and multiple brokers to test Self-Balancing.

### 4. Configure the controller and brokers to send metrics to Control Center with Prometheus

In the next steps, you will configure your Kafka brokers and controller to export their metrics, using the `confluent.telemetry.exporter._c3.client.base.url` setting to push OTLP (OpenTelemetry Protocol) metrics. Control Center will act as an OTLP receiver, listening on `localhost:9090` for the incoming metrics.

1. If you have the controller and brokers running (per the previous steps), **stop these components in the reverse order** from which you started them.
   1. Stop each broker by using Ctrl-C in each window.
   2. Finally, stop the controller with Ctrl-C in its window.

   Leave the windows open so that you can quickly re-start the controller and brokers after you’ve added the additional required configurations.

2. Add the following lines to the end of the properties files for the controller and each one of the brokers to emit metrics to Prometheus, the OTLP endpoint. (The fourth line, which sets `confluent.telemetry.exporter._c3.metrics.include`, is very long. Simply copy the code block as provided and paste it in at the end of the properties files. This line will paste in as a single line, even though it shows as wrapped in the documentation.)
```bash metric.reporters=io.confluent.telemetry.reporter.TelemetryReporter confluent.telemetry.exporter._c3.type=http confluent.telemetry.exporter._c3.enabled=true confluent.telemetry.exporter._c3.metrics.include=io.confluent.kafka.server.request.(?!.*delta).*|io.confluent.kafka.server.server.broker.state|io.confluent.kafka.server.replica.manager.leader.count|io.confluent.kafka.server.request.queue.size|io.confluent.kafka.server.broker.topic.failed.produce.requests.rate.1.min|io.confluent.kafka.server.tier.archiver.total.lag|io.confluent.kafka.server.request.total.time.ms.p99|io.confluent.kafka.server.broker.topic.failed.fetch.requests.rate.1.min|io.confluent.kafka.server.broker.topic.total.fetch.requests.rate.1.min|io.confluent.kafka.server.partition.caught.up.replicas.count|io.confluent.kafka.server.partition.observer.replicas.count|io.confluent.kafka.server.tier.tasks.num.partitions.in.error|io.confluent.kafka.server.broker.topic.bytes.out.rate.1.min|io.confluent.kafka.server.request.total.time.ms.p95|io.confluent.kafka.server.controller.active.controller.count|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.total|io.confluent.kafka.server.request.total.time.ms.p999|io.confluent.kafka.server.controller.active.broker.count|io.confluent.kafka.server.request.handler.pool.request.handler.avg.idle.percent.rate.1.min|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.rate.1.min|io.confluent.kafka.server.controller.unclean.leader.elections.rate.1.min|io.confluent.kafka.server.replica.manager.partition.count|io.confluent.kafka.server.controller.unclean.leader.elections.total|io.confluent.kafka.server.partition.replicas.count|io.confluent.kafka.server.broker.topic.total.produce.requests.rate.1.min|io.confluent.kafka.server.controller.offline.partitions.count|io.confluent.kafka.server.socket.server.network.processor.avg.idle.percent|io.confluent.kafka.server.partition.under.replicated|io.confluent.kafka.server.log.log.start.offset|io.confluent.kafka.server.log.tier.size|io.confluent.kafka.server.log.size|io.confluent.kafka.server.tier.fetcher.bytes.fetched.total|io.confluent.kafka.server.request.total.time.ms.p50|io.confluent.kafka.server.tenant.consumer.lag.offsets|io.confluent.kafka.server.session.expire.listener.zookeeper.expires.rate.1.min|io.confluent.kafka.server.log.log.end.offset|io.confluent.kafka.server.broker.topic.bytes.in.rate.1.min|io.confluent.kafka.server.partition.under.min.isr|io.confluent.kafka.server.partition.in.sync.replicas.count|io.confluent.telemetry.http.exporter.batches.dropped|io.confluent.telemetry.http.exporter.items.total|io.confluent.telemetry.http.exporter.items.succeeded|io.confluent.telemetry.http.exporter.send.time.total.millis|io.confluent.kafka.server.controller.leader.election.rate.(?!.*delta).*|io.confluent.telemetry.http.exporter.batches.failed confluent.telemetry.exporter._c3.client.base.url=http://localhost:9090/api/v1/otlp confluent.telemetry.exporter._c3.client.compression=gzip confluent.telemetry.exporter._c3.api.key=dummy confluent.telemetry.exporter._c3.api.secret=dummy confluent.telemetry.exporter._c3.buffer.pending.batches.max=80 confluent.telemetry.exporter._c3.buffer.batch.items.max=4000 confluent.telemetry.exporter._c3.buffer.inflight.submissions.max=10 confluent.telemetry.metrics.collector.interval.ms=60000 confluent.telemetry.remoteconfig._confluent.enabled=false confluent.consumer.lag.emitter.enabled=true ``` 3. Save the updated files. 
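After saving the files, you can restart the components in the windows you left open, starting the controller first and then each broker. The following is a minimal sketch only: `controller-sbc.properties` is the controller file created earlier in this tutorial, while the broker file names shown are placeholders for whatever broker property files you created, so substitute your own names.

```bash
# Start the controller first, in its own terminal window.
$CONFLUENT_HOME/bin/kafka-server-start $CONFLUENT_HOME/etc/kafka/controller-sbc.properties

# Then start each broker in its own terminal window.
# The broker file names below are placeholders; use the files you created for this tutorial.
$CONFLUENT_HOME/bin/kafka-server-start $CONFLUENT_HOME/etc/kafka/broker-sbc-0.properties
$CONFLUENT_HOME/bin/kafka-server-start $CONFLUENT_HOME/etc/kafka/broker-sbc-1.properties
```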
# Configure and Manage Confluent Platform

This section provides topics on resources for managing Confluent Platform, including tools, configuration reference, and Apache Kafka® deployment and post-deployment guidance.

- [Kafka Configuration Reference for Confluent Platform](../installation/configuration/index.md#cp-config-reference) - Contains a comprehensive reference guide of broker, topic, consumer, producer, and connect configuration properties.
- [CLI Tools Shipped With Confluent Platform](../tools/cli-reference.md#cp-all-cli) - Contains a list of CLI tools for use with Kafka and Confluent Platform.
- [Change Kafka Configurations Without Restart for Confluent Platform](../kafka/dynamic-config.md#kafka-dynamic-configurations) - Shows how you can change configuration properties using the `kafka-configs` tool without stopping a broker.
- [Manage Clusters in Confluent Platform](../clusters/overview.md#manage-clusters) - Contains topics that describe how metadata is managed for a cluster, and how cost-saving features like Tiered Storage and the Confluent Rebalancer work.
- [Configure Metadata Service (MDS) in Confluent Platform](../kafka/configure-mds/index.md#rbac-mds-config) - Describes the Confluent Metadata Service (MDS).
- [Docker Operations for Confluent Platform](../installation/docker/operations/index.md#operations-overview) - Provides a guide to configuring Confluent Platform running on Docker.
- [Running Kafka in Production with Confluent Platform](../kafka/deployment.md#cp-production-recommendations) - Covers how much memory, how many disks, CPU and more you should use for a Confluent Platform production deployment. In addition, provides configuration guidelines for a production cluster.
- [Best Practices for Kafka Production Deployments in Confluent Platform](../kafka/post-deployment.md#kafka-post-deployment) - Describes tasks that you might complete after moving to production: changing the log level, adding or modifying topics, changing the replication factor, and more.

### Task example - source task

Next, you’ll look at the implementation of the corresponding `SourceTask`. The class is small, but too long to cover completely in this guide. Most of the implementation is described using helper methods whose details aren’t provided, but you can refer to the source code for the full example.

Just as with the connector, you must create a class inheriting from the appropriate base `Task` class. It also has some standard lifecycle methods:

```java
public class FileStreamSourceTask extends SourceTask {
    private String filename;
    private InputStream stream;
    private String topic;
    private Long streamOffset;

    public void start(Map<String, String> props) {
        filename = props.get(FileStreamSourceConnector.FILE_CONFIG);
        stream = openOrThrowError(filename);
        topic = props.get(FileStreamSourceConnector.TOPIC_CONFIG);
    }

    @Override
    public synchronized void stop() {
        try {
            stream.close();
        } catch (IOException e) {
            // Nothing else to do; the task is shutting down.
        }
    }
```

These are slightly simplified versions, but show that these methods should be relatively simple and the only work they perform is allocating or freeing resources. There are two points to note about this implementation. First, the `start()` method does not yet handle resuming from a previous offset, which will be addressed in a later section. Second, the `stop()` method is synchronized. This will be necessary because `SourceTasks` are given a dedicated thread which they can block indefinitely, so they need to be stopped with a call from a different thread in the Worker.
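As a preview of that offset handling, when a task starts it can ask the Kafka Connect framework for the last committed offset of its source partition through `context.offsetStorageReader()`. The following is a minimal sketch only; `openAndSeek()` is a hypothetical helper (standing in for `openOrThrowError()`), and the keys mirror the ones used by `poll()` below.

```java
// Illustrative sketch: recover the last committed position for this file, if any,
// and seek past it so previously published lines are not re-read.
Map<String, Object> lastOffset =
        context.offsetStorageReader().offset(Collections.singletonMap("filename", filename));
if (lastOffset != null && lastOffset.get("position") != null) {
    streamOffset = (Long) lastOffset.get("position");
    stream = openAndSeek(filename, streamOffset);   // hypothetical helper
}
```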
Next, implement the main functionality of the task: the `poll()` method that gets records from the input system and returns a `List<SourceRecord>`:

```java
@Override
public List<SourceRecord> poll() throws InterruptedException {
    try {
        ArrayList<SourceRecord> records = new ArrayList<>();
        while (streamValid(stream) && records.isEmpty()) {
            LineAndOffset line = readToNextLine(stream);
            if (line != null) {
                Map<String, String> sourcePartition = Collections.singletonMap("filename", filename);
                Map<String, Long> sourceOffset = Collections.singletonMap("position", streamOffset);
                records.add(new SourceRecord(sourcePartition, sourceOffset, topic, Schema.STRING_SCHEMA, line));
            } else {
                Thread.sleep(1);
            }
        }
        return records;
    } catch (IOException e) {
        // Underlying stream was killed, probably as a result of calling stop. Allow to return
        // null, and driving thread will handle any shutdown if necessary.
    }
    return null;
}
```

Again, some details are omitted, but you can see the important steps: the `poll()` method is going to be called repeatedly, and for each call it will loop trying to read records from the file. For each line it reads, it also tracks the file offset. It uses this information to create an output [SourceRecord](/platform/current/connect/javadocs/javadoc/org/apache/kafka/connect/source/SourceRecord.html) with four pieces of information: the source partition (there is only one, the single file being read), source offset (position in the file), output topic name, and output value (the line, including a schema indicating this value will always be a string). Other variants of the `SourceRecord` constructor can also include a specific output partition and a key.

Note that this implementation uses the normal Java `InputStream` interface and may sleep if data is not available. This is acceptable because Kafka Connect provides each task with a dedicated thread. While task implementations have to conform to the basic `poll()` interface, they have a lot of flexibility in how they are implemented. In this case, an NIO-based implementation would be more efficient, but this simple approach works, is quick to implement, and is compatible with older versions of Java.

Although not used in the example, `SourceTask` also provides two APIs to commit offsets in the source system: `commit()` and `commitRecord()`. These APIs are provided for source systems which have an acknowledgement mechanism for messages. Overriding these methods allows the source connector to acknowledge messages in the source system, either in bulk or individually, once they have been written to Kafka. The `commit()` API stores the offsets in the source system, up to the offsets that have been returned by `poll()`. The implementation of this API should block until the commit is complete. The `commitRecord()` API saves the offset in the source system for each `SourceRecord` after it is written to Kafka. Because Kafka Connect records offsets automatically, `SourceTask` is not required to implement these APIs. In cases where a connector does need to acknowledge messages in the source system, only one of the APIs is typically required.
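For instance, a source task that reads from a system with explicit acknowledgements might override `commit()` along the following lines. This is only a sketch: `sourceClient`, its `acknowledgeUpTo()` method, and the `lastReturnedOffset` field are hypothetical stand-ins for whatever client library and bookkeeping the connector actually uses.

```java
// Illustrative sketch: acknowledge records in the source system up to the last
// offset that poll() has handed to Kafka Connect.
@Override
public void commit() throws InterruptedException {
    Long offsetToAck;
    synchronized (this) {
        offsetToAck = lastReturnedOffset;           // hypothetical field maintained by poll()
    }
    if (offsetToAck != null) {
        sourceClient.acknowledgeUpTo(offsetToAck);  // hypothetical source-system call
    }
}
```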
## Connect worker role bindings

Use the following steps to configure role bindings for the Connect worker principal, `User:$CONNECT_USER`.

1. Grant principal `User:$CONNECT_USER` the `ResourceOwner` role for `Topic:connect-configs`.

   ```none
   confluent iam rbac role-binding create \
     --principal User:$CONNECT_USER \
     --role ResourceOwner \
     --resource Topic:connect-configs \
     --kafka-cluster $KAFKA_CLUSTER_ID
   ```

2. Grant principal `User:$CONNECT_USER` the `ResourceOwner` role for `Topic:connect-offsets`.

   ```none
   confluent iam rbac role-binding create \
     --principal User:$CONNECT_USER \
     --role ResourceOwner \
     --resource Topic:connect-offsets \
     --kafka-cluster $KAFKA_CLUSTER_ID
   ```

3. Grant principal `User:$CONNECT_USER` the `ResourceOwner` role for `Topic:connect-statuses`.

   ```none
   confluent iam rbac role-binding create \
     --principal User:$CONNECT_USER \
     --role ResourceOwner \
     --resource Topic:connect-statuses \
     --kafka-cluster $KAFKA_CLUSTER_ID
   ```

4. Grant principal `User:$CONNECT_USER` the `ResourceOwner` role for `Group:connect-cluster`.

   ```none
   confluent iam rbac role-binding create \
     --principal User:$CONNECT_USER \
     --role ResourceOwner \
     --resource Group:connect-cluster \
     --kafka-cluster $KAFKA_CLUSTER_ID
   ```

5. Grant principal `User:$CONNECT_USER` the `SecurityAdmin` role. This allows `User:$CONNECT_USER` permission to make requests to the Metadata Service (MDS) to find out if a user making calls to the Connect REST API is authorized to perform required operations. Note that `$CONNECT_USER` does this by making an authorized request to MDS to check `$CLIENT` permissions.

   ```none
   confluent iam rbac role-binding create \
     --principal User:$CONNECT_USER \
     --role SecurityAdmin \
     --kafka-cluster $KAFKA_CLUSTER_ID \
     --connect-cluster-id $CONNECT_CLUSTER_ID
   ```

6. List the role bindings for the principal `User:$CONNECT_USER`. Verify that all the role bindings are properly configured.

   ```none
   confluent iam rbac role-binding list \
     --principal User:$CONNECT_USER \
     --kafka-cluster $KAFKA_CLUSTER_ID \
     --connect-cluster-id $CONNECT_CLUSTER_ID
   ```

   The following two steps are required if using a Connect [Secret Registry](connect-rbac-secret-registry.md#connect-rbac-secret-registry).

7. Grant principal `User:$CONNECT_USER` the `ResourceOwner` role to `Topic:_confluent-secrets`.

   ```none
   confluent iam rbac role-binding create \
     --principal User:$CONNECT_USER \
     --role ResourceOwner \
     --resource Topic:_confluent-secrets \
     --kafka-cluster $KAFKA_CLUSTER_ID
   ```

8.
Grant principal `User:$CONNECT_USER` the `ResourceOwner` role to `Group:secret-registry`. ```none confluent iam rbac role-binding create \ --principal User:$CONNECT_USER \ --role ResourceOwner \ --resource Group:secret-registry \ --kafka-cluster $KAFKA_CLUSTER_ID ``` # Connect Secret Registry Kafka Connect provides a secret serving layer called the Secret Registry. The Secret Registry enables Connect to store encrypted Connect credentials in a topic exposed through a REST API. This eliminates any unencrypted credentials being located in the actual connector configuration. Two additional Connect REST API extensions support the Connect Secret Registry. The first extension enables RBAC. The second extension instantiates the Secret Registry node in Connect. Note that the property takes a comma-separated list of class names. ```properties rest.extension.classes=io.confluent.connect.security.ConnectSecurityExtension,io.confluent.connect.secretregistry.ConnectSecretRegistryExtension ``` The Connect Secret Registry provides the following: * **Persistence:** Secrets are stored in a compacted topic. * **Key grouping:** Secrets are associated with both a *key* and a *path*. This allows multiple keys to be grouped together. Authorization is typically performed at the path level. * **Versioning:** Multiple versions of a secret can be stored. * **Encryption:** Keys are stored in encrypted format. * **Master key rotation:** The master key for encryption can be changed. This allows all secrets to be re-encrypted if necessary. * **Auditing:** All requests to save or retrieve secrets are logged. The first character of the Connect Secret Registry key must be an alphabetic letter (a–z or A–Z). The following sections define the roles used to configure and interact with the Secret Registry and show a worker configuration example. # Kafka Connect and RBAC [Role-Based Access Control (RBAC)](../security/authorization/rbac/overview.md#rbac-overview) can be enabled for your Confluent Platform environment. If RBAC is enabled, there are role bindings that you may need to configure (or have set up for you) before you work with Connect and Connect resources. There are also RBAC configuration parameters that you need to add to your Connect worker configuration and connectors. Connect roles are managed by the RBAC system administrator for your environment. Make sure to review your user principal, RBAC role, and permissions with your RBAC system administrator before creating a Connect cluster or connectors. The following sections provide information about how to configure RBAC access as it applies to Kafka Connect. For information about how to configure RBAC for the overall Confluent Platform environment and other components, see [Role-Based Access Control (RBAC)](../security/authorization/rbac/overview.md#rbac-overview). * [Get Started With RBAC and Kafka Connect](rbac/connect-rbac-getting-started.md) * [Configure RBAC for a Connect Cluster](rbac/connect-rbac-connect-cluster.md) * [Configure RBAC for a Connect Worker](rbac/connect-rbac-worker.md) * [RBAC for self-managed connectors](rbac/connect-rbac-connectors.md) * [Connect Secret Registry](rbac/connect-rbac-secret-registry.md) * [Example Connect role-binding sequence](rbac/connect-rbac-example.md) ## Common Worker Configuration `bootstrap.servers` : A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. 
The client will make use of all servers irrespective of which servers are specified here for bootstrapping - this list only impacts the initial hosts used to discover the full set of servers. This list should be in the form `host1:port1,host2:port2,...`. Since these servers are just used for the initial connection to discover the full cluster membership (which may change dynamically), this list need not contain the full set of servers (you may want more than one, though, in case a server is down). * Type: list * Default: [localhost:9092] * Importance: high `key.converter` : Converter class for key Connect data. This controls the format of the data that will be written to Kafka for source connectors or read from Kafka for sink connectors. Popular formats include Avro and JSON. * Type: class * Default: * Importance: high `value.converter` : Converter class for value Connect data. This controls the format of the data that will be written to Kafka for source connectors or read from Kafka for sink connectors. Popular formats include Avro and JSON. * Type: class * Default: * Importance: high `internal.key.converter` : Converter class for internal key Connect data that implements the `Converter` interface. Used for converting data like offsets and configs. * Type: class * Default: * Importance: low `internal.value.converter` : Converter class for offset value Connect data that implements the `Converter` interface. Used for converting data like offsets and configs. * Type: class * Default: * Importance: low `offset.flush.interval.ms` : Interval at which to try committing offsets for tasks. * Type: long * Default: 60000 * Importance: low `offset.flush.timeout.ms` : Maximum number of milliseconds to wait for records to flush and partition offset data to be committed to offset storage before cancelling the process and restoring the offset data to be committed in a future attempt. * Type: long * Default: 5000 * Importance: low `plugin.path` : The comma-separated list of paths to directories that contain [Kafka Connect plugins](/kafka-connectors/self-managed/userguide.html#installing-kconnect-plugins). * Type: string * Default: * Importance: low `rest.advertised.host.name` : If this is set, this is the hostname that will be given out to other Workers to connect to. * Type: string * Importance: low `rest.advertised.listener` : Configures the listener used for communication between Workers. Valid values are either `http` or `https`. If the listeners property is not defined or if it contains an HTTP listener, the default value for this field is `http`. When the listeners property is defined and contains only HTTPS listeners, the default value is `https`. * Type: string * Importance: low `rest.advertised.port` : If this is set, this is the port that will be given out to other Workers to connect to. * Type: int * Importance: low `listeners` : A list of REST listeners in the format `protocol://host:port,protocol2://host2:port2` that determines the protocol used by Kafka Connect, where the protocol is either HTTP or HTTPS. For example: ```bash listeners=http://localhost:8080,https://localhost:8443 ``` By default, if no listeners are specified, the REST server runs on port 8083 using the HTTP protocol. When using HTTPS, the configuration must include the TLS/SSL configuration. For more details, see [Configuring the Connect REST API for HTTP or HTTPS](../security.md#connect-rest-api-http). 
* Type: list * Importance: low `response.http.headers.config` : Used to select which HTTP headers are returned in the HTTP response for Confluent Platform components. Specify multiple values in a comma-separated string using the format `[action][header name]:[header value]` where `[action]` is one of the following: `set`, `add`, `setDate`, or `addDate`. You must use quotation marks around the header value when the header value contains commas. For example: ```none response.http.headers.config="add Cache-Control: no-cache, no-store, must-revalidate", add X-XSS-Protection: 1; mode=block, add Strict-Transport-Security: max-age=31536000; includeSubDomains, add X-Content-Type-Options: nosniff ``` * Type: string * Default: “” * Importance: low `task.shutdown.graceful.timeout.ms` : Amount of time to wait for tasks to shutdown gracefully. This is the total amount of time, not per task. All task have shutdown triggered, then they are waited on sequentially. * Type: long * Default: 5000 * Importance: low ### GET /connectors Get a list of active connectors * **Response JSON Object:** * **connectors** (*array*) – List of connector names **Example request**: ```http GET /connectors HTTP/1.1 Host: connect.example.com Accept: application/json ``` **Example response**: ```http HTTP/1.1 200 OK Content-Type: application/json ["my-jdbc-source", "my-hdfs-sink"] ``` **Query parameters**: | Name | Data type | Required / Optional | Description | |------------------|-------------|-----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | `?expand=status` | Map | Optional | Retrieves additional state information for each of the connectors returned in the API call. The endpoint also returns the status of each of the connectors and its tasks as shown in the [?expand=status example](#expand-status) below. | | `?expand=info` | Map | Optional | Returns metadata of each of the connectors such as the configuration, task information, and type of connector as in [?expand=info example](#expand-info) below. 
**?expand=status example**

```json
{ "FileStreamSinkConnectorConnector_0": { "status": { "name": "FileStreamSinkConnectorConnector_0", "connector": { "state": "RUNNING", "worker_id": "10.0.0.162:8083" }, "tasks": [ { "id": 0, "state": "RUNNING", "worker_id": "10.0.0.162:8083" } ], "type": "sink" } }, "DatagenConnectorConnector_0": { "status": { "name": "DatagenConnectorConnector_0", "connector": { "state": "RUNNING", "worker_id": "10.0.0.162:8083" }, "tasks": [ { "id": 0, "state": "RUNNING", "worker_id": "10.0.0.162:8083" } ], "type": "source" } } }
```

**?expand=info example**

```json
{ "FileStreamSinkConnectorConnector_0": { "info": { "name": "FileStreamSinkConnectorConnector_0", "config": { "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector", "file": "/Users/smogili/file.txt", "tasks.max": "1", "topics": "datagen", "name": "FileStreamSinkConnectorConnector_0" }, "tasks": [ { "connector": "FileStreamSinkConnectorConnector_0", "task": 0 } ], "type": "sink" } }, "DatagenConnectorConnector_0": { "info": { "name": "DatagenConnectorConnector_0", "config": { "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector", "quickstart": "clickstream", "tasks.max": "1", "name": "DatagenConnectorConnector_0", "kafka.topic": "datagen" }, "tasks": [ { "connector": "DatagenConnectorConnector_0", "task": 0 } ], "type": "source" } } }
```

Users can also combine the status and info expands by appending both to the endpoint (for example, `http://localhost:8083/connectors?expand=status&expand=info`). This returns the metadata for the connectors and the current status of each connector and its tasks.

### InternalSecretConfigProvider

Confluent Platform provides another implementation of `ConfigProvider` named `InternalSecretConfigProvider`, which is used with the Connect [Secret Registry](/platform/current/connect/rbac/connect-rbac-secret-registry.html). The `InternalSecretConfigProvider` requires [Role-based access control (RBAC)](../security/authorization/rbac/overview.md#rbac-overview) with Secret Registry. The Secret Registry is a secret serving layer that enables Connect to store encrypted Connect credentials in a topic exposed through a REST API. This eliminates any unencrypted credentials being located in the actual connector configuration. You enable `InternalSecretConfigProvider` in the worker configuration file.

### Standalone mode

Standalone mode is typically used for development and testing, or for lightweight, single-agent environments, for example, sending web server logs to Kafka. The following example shows a command that launches a worker in standalone mode:

```bash
bin/connect-standalone worker.properties connector1.properties [connector2.properties connector3.properties ...]
```

The first parameter (`worker.properties`) is the [worker configuration properties file](#connect-configuring-workers). Note that `worker.properties` is an example file name. You can use any valid file name for your worker configuration file. This file gives you control over settings such as the Kafka cluster to use and serialization format. For an example configuration file that uses [Avro](http://avro.apache.org/docs/current/) and [Schema Registry](/platform/current/schema-registry/connect.html) in a standalone mode, open the file located at `etc/schema-registry/connect-avro-standalone.properties`. You can copy and modify this file for use as your standalone worker properties file.
The second parameter (`connector1.properties`) is the connector configuration properties file. All connectors have configuration properties that are loaded with the worker. As shown in the example, you can launch multiple connectors using this command. If you run multiple standalone workers on the same host machine, the following two configuration properties must be unique for each worker: * `offset.storage.file.filename`: The storage file name for connector offsets. This file is stored on the local filesystem in standalone mode. Using the same file name for two workers will cause offset data to be deleted or overwritten with different values. * `listeners`: A list of URIs the REST API will listen on in the format `protocol://host:port,protocol2://host2:port`–the protocol is either HTTP or HTTPS. You can specify hostname as `0.0.0.0` to bind to all interfaces or leave hostname empty to bind to the default interface. #### NOTE You update the `etc/schema-registry/connect-avro-standalone.properties` file if you need to apply a change to Connect when starting Confluent Platform services using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/command-reference-index.html). ### Distributed mode Distributed mode does not have any additional command-line parameters other than loading the worker configuration file. New workers will either start a new group or join an existing one with a matching `group.id`. Workers then coordinate with the consumer groups to distribute the work to be done. The following shows an example command that launches a worker in distributed mode: ```bash bin/connect-distributed worker.properties ``` For an example distributed mode configuration file that uses Avro and [Schema Registry](/platform/current/schema-registry/connect.html), open `etc/schema-registry/connect-avro-distributed.properties`. You can make a copy of this file, modify it, use it as the new `worker.properties` file. Note that `worker.properties` is an example file name. You can use any valid file name for your properties file. In standalone mode, connector configuration property files are added as command-line parameters. However, in distributed mode, connectors are deployed and managed using a REST API request. To create connectors, you start the worker and then make a REST request to create the connector. REST request examples are provided in many [supported connector](https://docs.confluent.io/kafka-connectors/self-managed/supported.html) documents. For instance, see the [Azure Blob Storage Source connector REST-based example](https://docs.confluent.io/kafka-connectors/azure-blob-storage-source/current/index.html#rest-based-example) for one example. Note that if you run many distributed workers on one host machine for development and testing, the `listeners` configuration property must be unique for each worker. This is the port the REST interface listens on for HTTP requests. ### YAML ```yaml apiVersion: cmf.confluent.io/v1 kind: FlinkApplication metadata: name: app-1 spec: image: confluentinc/cp-flink:1.19.3-cp1 flinkVersion: v1_19 flinkConfiguration: taskmanager.numberOfTaskSlots: "1" serviceAccount: flink jobManager: resource: memory: 1024m cpu: 1 taskManager: resource: memory: 1024m cpu: 1 job: jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar state: running parallelism: 1 upgradeMode: stateless ``` The resource spec includes the following fields: * `image`: The name of the Docker image that is used to start the Flink cluster. 
CMF expects this image to be a [Confluent Platform Flink image](https://hub.docker.com/r/confluentinc/cp-flink) or to be derived from a Confluent Platform Flink image. * `flinkVersion`: The Flink version corresponding to the Flink version of the Docker image. * `flinkConfiguration`: A map of Flink configuration parameters. Before the configuration is passed to the Flink cluster, it is merged with the Environment’s default configuration for applications. The Flink configuration is used to configure cluster and job behavior, such as checkpointing, security, logging, and more. For more on Flink job configuration, see [Configure Flink Jobs in Confluent Manager for Apache Flink](../configure/overview.md#cmf-configure). * `serviceAccount`: The name of the Kubernetes service account that is used to start and run the application’s Flink cluster. * `jobManager` & `taskManager`: TheKubernetes specification of the Flink Job Manager and Task Manager pods. * `job.jarURI`: The path to the Flink job JAR file. To learn how to package Flink jobs and make the job JAR available to the cluster, see [Package Flink Jobs](packaging.md#cmf-package). * `job.state`: The desired state of the application. Can be `running` or `suspended`. * `job.parallelism`: The desired execution parallelism of the application. Can be adapted to rescale the application. ## Step 3: Generate mock data In Confluent Platform, you get [events](../_glossary.md#term-event) from an external source by using [Kafka Connect](../connect/index.md#connect-concepts). Connectors enable you to stream large volumes of data to and from your [Kafka cluster](../_glossary.md#term-Kafka-cluster). Confluent publishes many connectors for integrating with external systems, like MongoDB and Elasticsearch. For more information, see the [Kafka Connect Overview](../connect/index.md#kafka-connect) page. In this step, you run the [Datagen Source Connector](https://www.confluent.io/hub/confluentinc/kafka-connect-datagen/) to generate mock data. The mock data is stored in the `pageviews` and `users` topics that you created previously. To learn more about installing connectors, see [Install Self-Managed Connectors](../connect/install.md#connect-install-connectors). 1. In the navigation menu, click **Connect**. 2. Click the `connect-default` cluster in the **Connect clusters** list. 3. Click **Add connector** to start creating a connector for pageviews data. 4. Select the `DatagenConnector` tile. 5. In the **Name** field, enter `datagen-pageviews` as the name of the connector. 6. Enter the following configuration values in the following sections: **Common** section: - **Key converter class:** `org.apache.kafka.connect.storage.StringConverter` **General** section: - **kafka.topic:** Choose `pageviews` from the dropdown menu - **max.interval:** `100` - **quickstart:** `pageviews` 7. Click **Next** to review the connector configuration. When you’re satisfied with the settings, click **Launch**. ![Reviewing connector configuration in Confluent Control Center](images/connect-review-pageviews.png) Run a second instance of the [Datagen Source connector](https://www.confluent.io/hub/confluentinc/kafka-connect-datagen/) connector to produce mock data to the `users` topic. 1. In the navigation menu, click **Connect**. 2. In the **Connect clusters** list, click `connect-default`. 3. Click **Add connector**. 4. Select the `DatagenConnector` tile. 5. In the **Name** field, enter `datagen-users` as the name of the connector. 6. 
Enter the following configuration values: **Common** section: - **Key converter class:** `org.apache.kafka.connect.storage.StringConverter` **General** section: - **kafka.topic:** Choose `users` from the dropdown menu - **max.interval:** `1000` - **quickstart:** `users` 7. Click **Next** to review the connector configuration. When you’re satisfied with the settings, click **Launch**. 8. In the navigation menu, click **Topics** and in the list, click **users**. 9. Click **Messages** to confirm that the `datagen-users` connector is producing data to the `users` topic. ![Incoming messages displayed in the Topics page in Confluent Control Center](images/c3-topics-messages-users.gif) ## Confluent Platform features At the core of Confluent Platform is Kafka, the most popular open source distributed streaming platform. Kafka enables you to: - Publish and subscribe to streams of records - Store streams of records in a fault tolerant way - Process streams of records Each Confluent Platform release includes the latest release of Kafka and additional tools and services that make it easier to build and manage an event streaming platform. Confluent Platform provides community and commercially licensed features such as [Schema Registry](/platform/current/schema-registry/index.html), [Cluster Linking](../multi-dc-deployments/cluster-linking/index.md#cluster-linking), a [REST Proxy](../kafka-rest/index.md#kafkarest-intro), [100+ pre-built Kafka connectors](../connect/kafka_connectors.md#connectors-self-managed-cp), and [ksqlDB](../ksqldb/overview.md#ksql-home). For more information about Confluent components and the license that applies to them, see [Confluent Licenses](../installation/license.md#cp-license-overview). ![image](images/confluentPlatform.png) ## (Optional) Explore Control Center Confluent Control Center is a web-based tool for managing and monitoring Kafka in Confluent Platform. If you opted to install it as described in [(Optional) Install and configure Confluent Control Center](#get-started-multi-broker-install-config-c3), you can use if for monitoring, to create topics, and other actions. To view your cluster running locally in Control Center, open a browser and navigate to [http://localhost:9021/](http://localhost:9021). - To learn about managing clusters with Confluent Control Center, see [Manage Kafka Clusters Using Control Center for Confluent Platform](https://docs.confluent.io/control-center/current/clusters.html#controlcenter-userguide-clusters) - To view brokers in Confluent Control Center, see [Manage Kafka Brokers Using Control Center for Confluent Platform](https://docs.confluent.io/control-center/current/brokers.html#controlcenter-userguide-brokers) - To manage topics in Confluent Control Center, see [Manage Topics Using Control Center for Confluent Platform](https://docs.confluent.io/control-center/current/topics/overview.html#c3-all-topics) 1. Click either the Brokers card or **Brokers** on the menu to view broker metrics. From the brokers list at the bottom of the page, you can view detailed metrics and drill down on each broker. ![image](images/basics-c3-brokers-list.png) 2. Click **Topics** on the navigation menu. Note that only your test topic and the system (internal) topics are available at this point. The `default_ksql_processing_log` will show up as a topic if you configured and started ksqlDB. There is a lot more to Confluent Control Center, but it is not the focus of this tutorial. 
To complete similar steps using Confluent Control Center, see the [Quick Start for Confluent Platform](platform-quickstart.md#quickstart). # Kafka Configuration Reference for Confluent Platform Apache Kafka® configuration refers to the various settings and parameters that can be adjusted to optimize the performance, reliability, and security of a Kafka cluster and its clients. Kafka uses key-value pairs in a property file format for configuration. These values can be supplied either from a file or programmatically. The following configuration reference topics include settings for Kafka brokers, producers, and consumers, topics, and Kafka Connect. - [Configure Brokers and Controllers](broker-configs.md#cp-config-brokers) - [Configure Topics](topic-configs.md#cp-config-topics) - [Configure Producers](producer-configs.md#cp-config-producer) - [Configure Consumers](consumer-configs.md#cp-config-consumer) - [Configure Kafka Streams](streams-configs.md#cp-config-streams) - [Configure the AdminClient](admin-configs.md#cp-config-admin) - [Configure Kafka Connect](connect/index.md#cp-config-connect) - [Configure Kafka Source Connectors](connect/source-connect-configs.md#cp-config-source-connect) - [Configure Kafka Sink Connectors](connect/sink-connect-configs.md#cp-config-sink-connect) ### Optional Confluent Replicator Executable configurations Additional configurations that are optional and maybe passed to Replicator Executable via environment variable instead of files are: `REPLICATION_CONFIG` : A file that contains the configuration settings for the replication from the origin cluster. Default location is `/etc/replicator/replication.properties` in the Docker image. `CONSUMER_MONITORING_CONFIG` : A file that contains the configuration settings of the producer writing monitoring information related to Replicator’s consumer. Default location is `/etc/replicator/consumer-monitoring.properties` in the Docker image. `PRODUCER_MONITORING_CONFIG` : A file that contains the configuration settings of the producer writing monitoring information related to Replicator’s producer. Default location is `/etc/replicator/producer-monitoring.properties` in the Docker image. `BLACKLIST` : A comma-separated list of topics that should not be replicated, even if they are included in the whitelist or matched by the regular expression. `WHITELIST` : A comma-separated list of the names of topics that should be replicated. Any topic that is in this list and not in the blacklist will be replicated. `CLUSTER_THREADS` : The total number of threads across all workers in the Replicator cluster. `CONFLUENT_LICENSE` : The Confluent license key. Without the license key, Replicator can be used for a 30-day trial period. `TOPIC_AUTO_CREATE` : Whether to automatically create topics in the destination cluster if required. If you disable automatic topic creation, Kafka Streams and ksqlDB applications continue to work. Kafka Streams and ksqlDB applications use the Admin Client, so topics are still created. `TOPIC_CONFIG_SYNC` : Whether to periodically sync topic configuration to the destination cluster. `TOPIC_CONFIG_SYNC_INTERVAL_MS` : Specifies how frequently to check for configuration changes when `topic.config.sync` is enabled. `TOPIC_CREATE_BACKOFF_MS` : Time to wait before retrying auto topic creation or expansion. `TOPIC_POLL_INTERVAL_MS` : Specifies how frequently to poll the source cluster for new topics matching the whitelist or regular expression. 
`TOPIC_PRESERVE_PARTITIONS` : Whether to automatically increase the number of partitions in the destination cluster to match the source cluster and ensure that messages replicated from the source cluster use the same partition in the destination cluster. `TOPIC_REGEX` : A regular expression that matches the names of the topics to be replicated. Any topic that matches this expression (or is listed in the whitelist) and not in the blacklist will be replicated. `TOPIC_RENAME_FORMAT` : A format string for the topic name in the destination cluster, which may contain `${topic}` as a placeholder for the originating topic name. `TOPIC_TIMESTAMP_TYPE` : The timestamp type for the topics in the destination cluster. ## Overview The systemd service unit files are included in the [RPM](rhel-centos.md#systemd-rhel-centos-install) and [Debian packages](deb-ubuntu.md#systemd-ubuntu-debian-install) for the following Confluent Platform components: - Apache Kafka® (`kafka`) - Kafka Connect (`kafka-connect`) - Confluent REST Proxy (`kafka-rest`) - ksqlDB (`ksql`) - Schema Registry (`schema-registry`) Each component runs under its own user and a common `confluent` group, which are set up during package installation. This configuration ensures proper security separation between components that are running on the same system. The usernames are prefixed with `cp-` followed by the component name. For example, `cp-kafka` and `cp-schema-registry`. For components with persistent storage, such as Kafka, the default component configuration file points to a component-specific data directory under `/var/lib/`. For example, Kafka points to `/var/lib/kafka`. ## Hardware The following table lists machine recommendations for installing individual Confluent Platform components. Confluent Platform supports both ARM64 and X86 hardware architectures. ARM64 is supported in Confluent Platform 7.6.0 and later. For consistent and optimal performance in your Confluent Platform cluster, ensure all cluster nodes have identical hardware specifications. This includes CPU type and core count, RAM capacity and speed, and storage type with matching performance characteristics like throughput and IOPS. Varying hardware among nodes can cause performance bottlenecks, uneven workload distribution, and overall cluster instability. Maintaining identical hardware across all nodes is essential for high availability and reliable performance in your Confluent Platform deployment. Note that the recommended CPU resource is the same for all platforms. For example, if 12 CPUs are recommended for a non-Kubernetes environment, the recommendation for a Kubernetes environment would also be 12 CPU units. The following table lists hardware recommendations. Confluent Platform is used for a wide range of use cases and on many different machines. These recommendations provide a good starting point based on the experiences of Confluent with production clusters, but actual requirements depend on your specific workload. 
| Component | Nodes | Storage | Memory | CPU | |---------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|-------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------| | Control Center-Normal mode, see [System Requirements](https://docs.confluent.io/control-center/current/installation/system-requirements.html) | 1 | 200 GB, preferably SSDs | Minimum 8 GB RAM | 4 cores or more | | Control Center-Reduced infrastructure mode, see [System Requirements](https://docs.confluent.io/control-center/current/installation/system-requirements.html) | 1 | 128 GB, preferably SSDs | 8 GB RAM | 4 cores or more | | Control Center (Legacy)-Normal mode | 1 | 300 GB, preferably SSDs | 32 GB RAM (JVM default 6 GB) | 12 cores or more | | Control Center (Legacy)-Reduced infrastructure mode | 1 | 128 GB, preferably SSDs | 8 GB RAM (JVM default 4 GB) | 4 cores or more | | Broker | 3 | - 12 X 1 TB disk. RAID 10 is optional - Separate OS disks from Apache Kafka® storage | 64 GB RAM | 24 cores | | KRaft controller | 3-5 | 64 GB SSD | 4 GB RAM | 4 cores | | Confluent Manager for Apache Flink | 1 Kubernetes pod | 10 GB (Kubernetes persistent volume) | 4 GB RAM (For managing 150 Flink applications) | 3 cores (For managing 150 Flink applications) | | Connect | 2 | Storage is only required at installation time. | 0.5 - 4 GB heap size depending on connectors | Typically not CPU-bound. More cores is better than faster cores. | | Replicator- Same as Connect for nodes, storage, memory, and CPU. (See note that follows about AWS.) | 2 | Storage is only required at installation time. | 0.5 - 4 GB heap size | More cores is better | | ksqlDB - See [Capacity planning](../ksqldb/operate-and-deploy/capacity-planning.md#ksqldb-operate-capacity-planning-ksqldb-resources) | 2 | Use SSD. Sizing depends on the number of concurrent queries and the aggregation performed. Minimum 100 GB for a basic server. | 20 GB RAM | 4 cores | | REST Proxy | 2 | Storage is only required at installation time. | 1 GB overhead plus 64 MB per producer and 16 MB per consumer | 16 cores to handle HTTP requests in parallel and background threads for consumers and producers. | | Schema Registry | 2 | Storage is only required at installation time. | 1 GB heap size | Typically not CPU-bound. More cores is better than faster cores. | * If you want to use RAID disks, the recommendation is: * RAID 1 and RAID 10: Preferred * RAID 0: 2nd preferred * RAID 5: Not recommended ## Step 2: Upgrade Confluent Platform components In this step, you will upgrade the Confluent Platform components. For a [rolling upgrade](upgrade.md#rolling-upgrade), you can do this on one server at a time while the cluster continues to run. The details depend on your environment, but the steps to upgrade components are the same. You should always upgrade Confluent Control Center as the final Confluent Platform component. Upgrade steps: 1. Stop the Confluent Platform components. 2. Back up configuration files, for example in `./etc/kafka`. 3. Remove existing packages and their dependencies. 4. Install new packages. 5. Restart the Confluent Platform components. 
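As an illustration only, the steps above might look like the following on a single RPM-based broker host. The service and package name `confluent-kafka` is an assumption; confirm the systemd units and packages that are actually installed on your hosts before running anything similar.

```bash
# Hedged sketch of a rolling upgrade on one RPM-based host (names are assumptions).
sudo systemctl stop confluent-kafka                 # 1. Stop the component
sudo cp -r /etc/kafka /root/kafka-config-backup     # 2. Back up configuration files
sudo yum remove -y confluent-kafka                  # 3. Remove the existing package
sudo yum install -y confluent-kafka                 # 4. Install the new package from the updated repository
sudo systemctl start confluent-kafka                # 5. Restart the component
```

Repeat the same sequence on the next host only after the upgraded component has rejoined the cluster and is healthy.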
For details on how to upgrade different package types, see the following sections: - [Upgrade DEB packages using APT](upgrade.md#upgrade-deb-packages) - [Upgrade RPM packages by using YUM](upgrade.md#upgrade-rpm-packages) - [Upgrade using TAR or ZIP archives](upgrade.md#upgrade-tar-zip-archives) For details on how to upgrade individual Confluent Platform components, see the following sections: - [Upgrade Schema Registry](upgrade.md#upgrade-sr) - [Upgrade Confluent REST Proxy](upgrade.md#upgrade-rest-proxy) - [Upgrade Kafka Streams applications](upgrade.md#upgrade-kafka-streams) - [Upgrade Kafka Connect](upgrade.md#upgrade-connect) The [Confluent Replicator](../multi-dc-deployments/replicator/index.md#replicator-detail) version must match the Connect version it is deployed on. For example, Replicator 8.1 should only be deployed to Connect 8.1, so if you upgrade Connect, you must upgrade Replicator. - [Upgrade ksqlDB](upgrade.md#upgrade-ksqldb) - [Upgrade Control Center](https://docs.confluent.io/control-center/current/installation/upgrade.html) ## Hardware If you have followed the normal development path, you have tried Apache Kafka® on your laptop or on a small cluster of machines. But when it comes time to deploying Kafka to production, there are a few recommendations that you should consider. The following table lists hardware recommendations. Nothing is a hard-and-fast rule; Kafka is used for a wide range of use cases and on a lot of different machines. These recommendations provide a good starting point based on the experiences of Confluent with production clusters. | Component | Nodes | Storage | Memory | CPU | |---------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|-------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------| | Control Center-Normal mode, see [System Requirements](https://docs.confluent.io/control-center/current/installation/system-requirements.html) | 1 | 200 GB, preferably SSDs | Minimum 8 GB RAM | 4 cores or more | | Control Center-Reduced infrastructure mode, see [System Requirements](https://docs.confluent.io/control-center/current/installation/system-requirements.html) | 1 | 128 GB, preferably SSDs | 8 GB RAM | 4 cores or more | | Control Center (Legacy)-Normal mode | 1 | 300 GB, preferably SSDs | 32 GB RAM (JVM default 6 GB) | 12 cores or more | | Control Center (Legacy)-Reduced infrastructure mode | 1 | 128 GB, preferably SSDs | 8 GB RAM (JVM default 4 GB) | 4 cores or more | | Broker | 3 | - 12 X 1 TB disk. RAID 10 is optional - Separate OS disks from Apache Kafka® storage | 64 GB RAM | 24 cores | | KRaft controller | 3-5 | 64 GB SSD | 4 GB RAM | 4 cores | | Confluent Manager for Apache Flink | 1 Kubernetes pod | 10 GB (Kubernetes persistent volume) | 4 GB RAM (For managing 150 Flink applications) | 3 cores (For managing 150 Flink applications) | | Connect | 2 | Storage is only required at installation time. | 0.5 - 4 GB heap size depending on connectors | Typically not CPU-bound. More cores is better than faster cores. | | Replicator- Same as Connect for nodes, storage, memory, and CPU. (See note that follows about AWS.) | 2 | Storage is only required at installation time. 
| 0.5 - 4 GB heap size | More cores is better | | ksqlDB - See [Capacity planning](../ksqldb/operate-and-deploy/capacity-planning.md#ksqldb-operate-capacity-planning-ksqldb-resources) | 2 | Use SSD. Sizing depends on the number of concurrent queries and the aggregation performed. Minimum 100 GB for a basic server. | 20 GB RAM | 4 cores | | REST Proxy | 2 | Storage is only required at installation time. | 1 GB overhead plus 64 MB per producer and 16 MB per consumer | 16 cores to handle HTTP requests in parallel and background threads for consumers and producers. | | Schema Registry | 2 | Storage is only required at installation time. | 1 GB heap size | Typically not CPU-bound. More cores is better than faster cores. | * If you want to use RAID disks, the recommendation is: * RAID 1 and RAID 10: Preferred * RAID 0: 2nd preferred * RAID 5: Not recommended # Monitoring Kafka with JMX in Confluent Platform Confluent Platform is a data-streaming platform that completes Kafka with advanced capabilities designed to help accelerate application development and connectivity for enterprise use cases. This topic describes the Java Management Extensions (JMX) and Managed Beans (MBeans) that are enabled by default for Kafka and Confluent Platform to enable monitoring of your Kafka applications. The next several sections describe how to configure JMX, how to verify that you have configured it correctly, and list MBeans by Confluent Platform component. Note that features that are not enabled in your deployment will not generate MBeans. You can [search for a metric by name](#search-for-metric). You can also browse metrics by category: - [Broker metrics](#kafka-monitoring-metrics-broker) - [KRaft broker metrics](#kraft-broker-metrics) - [KRaft Quorum metrics](#kraft-quorum-metrics) - [Controller metrics](#controller-metrics) - [Log metrics](#log-metrics) - [Network metrics](#network-metrics) - [Producer metrics](#kafka-monitoring-metrics-producer) - [Consumer metrics](#kafka-monitoring-metrics-consumer) - [Consumer group metrics](#kafka-monitoring-metrics-consumer-group) - [Audit metrics](#audit-metrics) - [Authorizer metrics](#authorizer-metrics) - [RBAC and LDAP metrics](#rbac-and-ldap-health-metrics) To monitor these metrics with Docker, see [Monitoring with Docker Deployments](../installation/docker/operations/monitoring.md#use-jmx-monitor-docker-deployments). Find metrics for specific Confluent Platform and Kafka features in the following topics: - [Cluster linking metrics](../multi-dc-deployments/cluster-linking/metrics.md#cluster-linking-metrics) - [Connect metrics](../connect/monitoring.md#connect-monitoring-config-connectors) - [Kafka Streams metrics](../streams/monitoring.md#streams-monitoring) Confluent offers some alternatives to using JMX monitoring. - **Confluent Control Center**: You can deploy [Control Center](/control-center/current/overview.html) for out-of-the-box Kafka cluster monitoring so you don’t have to build your own monitoring system. - **Health+**: Consider monitoring and managing your environment with [Monitor Confluent Platform with Health+](../health-plus/index.md#health-plus). Ensure the health of your clusters and minimize business disruption with intelligent alerts, monitoring, and proactive support based on best practices created by the inventors of Kafka. #### IMPORTANT Secrets `config.providers` do not propagate to prefixes such as `client.*`. Thus, when using prefixes with secrets, you must specify `config.providers` and `config.providers.securepass.class`. 
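For example, a REST Proxy properties file that uses the `client.` prefix with secrets might repeat the provider settings under the prefix, as in the following sketch. The provider class name, file path, and property names shown here are illustrative assumptions; verify them against the secrets documentation for your release.

```properties
# Hedged sketch: repeat the secrets provider settings under the prefix so that
# prefixed client configurations can also resolve ${securepass:...} references.
# Class name and paths are assumptions; check the secrets docs for your release.
config.providers=securepass
config.providers.securepass.class=io.confluent.kafka.security.config.provider.SecurePassConfigProvider
client.config.providers=securepass
client.config.providers.securepass.class=io.confluent.kafka.security.config.provider.SecurePassConfigProvider
client.ssl.key.password=${securepass:/path/to/security.properties:kafka-rest.properties/client.ssl.key.password}
```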
Refer to [Using prefixes in secrets configurations](../security/compliance/secrets/overview.md#secrets-prefixes) for details. | Security Configuration | Prefix | Where to Configure | |-------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------| | Audit logging | `confluent.security.event.` | `etc/kafka/server.properties` | | Broker | none | `etc/kafka/server.properties` | | Broker LDAP configurations | `ldap.` | `etc/kafka/server.properties` | | Broker Metadata Service (MDS) back-end configurations | `confluent.metadata.` | `etc/kafka/server.properties` | | Metadata Service (MDS) configurations | `confluent.metadata.server.` | `etc/kafka/server.properties` | | Console Clients | none | `client properties` (for example, `producer.config` or `consumer.config`) | | Connect workers | none, `producer.`, `consumer.`, or `admin.` | `etc/kafka/connect-distributed.properties` | | Control Center | `confluent.controlcenter.streams.` `confluent.controlcenter.connect.` `confluent.controlcenter.ksql.` | `etc/confluent-control-center/control-center.properties` | | Java Clients | Java clients use static parameters defined in the Javadoc: - [SSL](/platform/current/clients/javadocs/javadoc/org/apache/kafka/common/config/SslConfigs.html) - [SASL](/platform/current/clients/javadocs/javadoc/org/apache/kafka/common/config/SaslConfigs.html) | SslConfigs or SaslConfigs in Properties class | | Metrics Reporter | `confluent.metrics.reporter.` | `etc/kafka/server.properties` | | Rebalancer | `confluent.rebalancer.metrics.` | Pass configuration (e.g. `rebalance-metrics-client.properties`) using `--config-file` | | Replicator | - `dest.kafka.` - `src.kafka.` | connector JSON file (not the worker properties file) | | REST Proxy | `client.` | `etc/kafka/kafka-rest.properties` | | Schema Registry | `kafkastore.` | `etc/schema-registry/schema-registry.properties` | ### **ReplicaStatus** ```bash /clusters/{cluster_id}/topics/-/partitions/-/replica-status /clusters/{cluster_id}/topics/{topic_name}/partitions/-/replica-status /clusters/{cluster_id}/topics/{topic_name}/partitions/{partition_id}/replica-status ``` REST that runs with a Confluent Server deployment provides the full set of REST APIs. REST that runs in a Standalone deployment consists of the open-source Kafka REST APIs only. For more information about the open-source Kafka REST APIs available, see [Kafka REST Proxy](https://github.com/confluentinc/kafka-rest#kafka-rest-proxy) and the [openapi yaml](https://github.com/confluentinc/kafka-rest/blob/master/api/v3/openapi.yaml). When using the API in Confluent Server, all paths should be prefixed with `/kafka` as opposed to Standalone REST Proxy. For example, the path to list clusters is: * Confluent Server: `/kafka/v3/clusters` * Standalone REST Proxy: `/v3/clusters` Confluent Server provides an embedded instance of these APIs on the Kafka brokers for the v3 Admin API. The embedded APIs run on the Confluent HTTP service, `confluent.http.server.listeners`. Therefore, if you have the HTTP server running, the REST Proxy v3 API is automatically available to you through the brokers. 
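As a concrete illustration of the path difference, listing clusters might look like the following. The ports are assumptions (`8090` for the Confluent HTTP service and `8082` for a standalone REST Proxy); use whatever listeners your deployment configures.

```bash
# Embedded v3 API on Confluent Server (Confluent HTTP service): note the /kafka prefix.
curl http://localhost:8090/kafka/v3/clusters

# Standalone REST Proxy: no /kafka prefix.
curl http://localhost:8082/v3/clusters
```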
Note that the [Metadata Server (MDS)](../security/authorization/rbac/mds-api.md#mds-api) is also running on the Confluent HTTP service, as another endpoint available to you with additional configurations. ### GET /clusters/{cluster_id} **Get Cluster** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the Kafka cluster with the specified `cluster_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. **Example request:** ```http GET /clusters/{cluster_id} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The Kafka cluster. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaCluster", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1", "resource_name": "crn:///kafka=cluster-1" }, "cluster_id": "cluster-1", "controller": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1" }, "acls": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/acls" }, "brokers": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers" }, "broker_configs": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/broker-configs" }, "consumer_groups": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups" }, "topics": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics" }, "partition_reassignments": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/-/partitions/-/reassignment" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. 
**Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### PUT /clusters/{cluster_id}/broker-configs/{name} **Update Dynamic Broker Config** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Update the dynamic cluster-wide broker configuration parameter specified by `name`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **name** (*string*) – The configuration parameter name. **Example request:** ```http PUT /clusters/{cluster_id}/broker-configs/{name} HTTP/1.1 Host: example.com Content-Type: application/json { "value": "gzip" } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – No Content * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### DELETE /clusters/{cluster_id}/broker-configs/{name} **Reset Dynamic Broker Config** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Reset the configuration parameter specified by `name` to its default value by deleting a dynamic cluster-wide configuration. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **name** (*string*) – The configuration parameter name. * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – No Content * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### PUT /clusters/{cluster_id}/brokers/{broker_id}/configs/{name} **Update Broker Config** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Update the configuration parameter specified by `name`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **broker_id** (*integer*) – The Kafka broker ID. * **name** (*string*) – The configuration parameter name. **Example request:** ```http PUT /clusters/{cluster_id}/brokers/{broker_id}/configs/{name} HTTP/1.1 Host: example.com Content-Type: application/json { "value": "gzip" } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – No Content * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### DELETE /clusters/{cluster_id}/brokers/{broker_id}/configs/{name} **Reset Broker Config** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Reset the configuration parameter specified by `name` to its default value. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **broker_id** (*integer*) – The Kafka broker ID. * **name** (*string*) – The configuration parameter name. * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – No Content * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### POST /clusters/{cluster_id}/topics/{topic_name}/configs:alter **Batch Alter Topic Configs** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Update or delete a set of topic configuration parameters. Also supports a dry-run mode that only validates whether the operation would succeed if the `validate_only` request property is explicitly specified and set to true. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **topic_name** (*string*) – The topic name. **batch_alter_topic_configs:** ```http POST /clusters/{cluster_id}/topics/{topic_name}/configs:alter HTTP/1.1 Host: example.com Content-Type: application/json { "data": [ { "name": "cleanup.policy", "operation": "DELETE" }, { "name": "compression.type", "value": "gzip" } ] } ``` **validate_only_batch_alter_topic_configs:** ```http POST /clusters/{cluster_id}/topics/{topic_name}/configs:alter HTTP/1.1 Host: example.com Content-Type: application/json { "data": [ { "name": "cleanup.policy", "operation": "DELETE" }, { "name": "compression.type", "value": "gzip" } ], "validate_only": true } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – No Content * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – Indicates attempted access to an unreachable or non-existing resource like e.g. an unknown topic or partition. GET requests to endpoints not allowed in the accesslists will also result in this response. 
**endpoint_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "HTTP 404 Not Found" } ``` **cluster_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "Cluster my-cluster cannot be found." } ``` **unknown_topic_or_partition:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 40403, "message": "This server does not host this topic-partition." } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/topics/-/configs **List All Topic Configs** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the list of configuration parameters for all topics hosted by the specified cluster. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. **Example request:** ```http GET /clusters/{cluster_id}/topics/-/configs HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of cluster configs. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaTopicConfigList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/configs", "next": null }, "data": [ { "kind": "KafkaTopicConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/configs/cleanup.policy", "resource_name": "crn:///kafka=cluster-1/topic=topic-1/config=cleanup.policy" }, "cluster_id": "cluster-1", "topic_name": "topic-1", "name": "cleanup.policy", "value": "compact", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_TOPIC_CONFIG", "synonyms": [ { "name": "cleanup.policy", "value": "compact", "source": "DYNAMIC_TOPIC_CONFIG" }, { "name": "cleanup.policy", "value": "delete", "source": "DEFAULT_CONFIG" } ] }, { "kind": "KafkaTopicConfig", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/configs/compression.type", "resource_name": "crn:///kafka=cluster-1/topic=topic-1/config=compression.type" }, "cluster_id": "cluster-1", "topic_name": "topic-1", "name": "compression.type", "value": "gzip", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_TOPIC_CONFIG", "synonyms": [ { "name": "compression.type", "value": "gzip", "source": "DYNAMIC_TOPIC_CONFIG" }, { "name": "compression.type", "value": "producer", "source": "DEFAULT_CONFIG" } ] } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. 
Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/brokers/{broker_id} **Get Broker** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the broker specified by `broker_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **broker_id** (*integer*) – The Kafka broker ID. **Example request:** ```http GET /clusters/{cluster_id}/brokers/{broker_id} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The broker. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaBroker", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1", "resource_name": "crn:///kafka=cluster-1/broker=1" }, "cluster_id": "cluster-1", "broker_id": 1, "host": "localhost", "port": 9291, "configs": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1/configs" }, "partition_replicas": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1/partition-replicas" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### DELETE /clusters/{cluster_id}/brokers/{broker_id} **Delete Broker** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Delete the broker that is specified by `broker_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **broker_id** (*integer*) – The Kafka broker ID. * **Query Parameters:** * **should_shutdown** (*boolean*) – To shutdown the broker or not, Default: true * **Status Codes:** * [202 Accepted](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.3) – The single broker removal response **Example response:** ```http HTTP/1.1 202 Accepted Content-Type: application/json { "kind": "KafkaBrokerRemoval", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1", "resource_name": "crn:///kafka=cluster-1/broker=1/" }, "cluster_id": "cluster-1", "broker_id": 1, "broker_task": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1" }, "broker": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Bad broker or balancer request **IllegalBrokerRemoval:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot remove broker 1 as there are partitions with replication factor equal to 1 on the broker. One such partition: test_topic_partition_0." } ``` **BalancerOffline:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "The Confluent Balancer component is disabled or not started yet." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – Broker not found. **Example response:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "Broker not found. Broker: 1 not found in the cluster: cluster-1" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/consumer-groups **List Consumer Groups** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the list of consumer groups that belong to the specified Kafka cluster. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. **Example request:** ```http GET /clusters/{cluster_id}/consumer-groups HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of consumer groups. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaConsumerGroupList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups", "next": null }, "data": [ { "kind": "KafkaConsumerGroup", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-1" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "is_simple": false, "partition_assignor": "org.apache.kafka.clients.consumer.RoundRobinAssignor", "state": "STABLE", "coordinator": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1" }, "consumers": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/consumers" }, "lag_summary": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lag-summary" } }, { "kind": "KafkaConsumerGroup", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-2", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-2" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-2", "is_simple": false, "partition_assignor": "org.apache.kafka.clients.consumer.StickyAssignor", "state": "PREPARING_REBALANCE", "coordinator": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/2" }, "consumers": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-2/consumers" }, "lag_summary": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-2/lag-summary" } }, { "kind": "KafkaConsumerGroup", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-3", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-3" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-3", "is_simple": false, "partition_assignor": "org.apache.kafka.clients.consumer.RangeAssignor", "state": "DEAD", "coordinator": { "related": 
"https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/3" }, "consumers": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-3/consumers" }, "lag_summary": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-3/lag-summary" } } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests URI: /v3/clusters/my-cluster STATUS: 429 MESSAGE: Too Many Requests SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/lags **List Consumer Lags** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy)[![Available in dedicated clusters only](https://img.shields.io/badge/-Available%20in%20dedicated%20clusters%20only-%23bc8540)](https://docs.confluent.io/cloud/current/clusters/cluster-types.html#dedicated-cluster) Return a list of consumer lags of the consumers belonging to the specified consumer group. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **consumer_group_id** (*string*) – The consumer group ID. **Example request:** ```http GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/lags HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of consumer lags. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaConsumerLagList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lags", "next": null }, "data": [ { "kind": "KafkaConsumerLag", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lags/topic-1/partitions/1", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-1/lag=topic-1/partition=1" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "topic_name": "topic-1", "partition_id": 1, "consumer_id": "consumer-1", "instance_id": "consumer-instance-1", "client_id": "client-1", "current_offset": 1, "log_end_offset": 101, "lag": 100 }, { "kind": "KafkaConsumerLag", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lags/topic-1/partitions/2", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-1/lag=topic-1/partition=2" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "topic_name": "topic-1", "partition_id": 2, "consumer_id": "consumer-2", "instance_id": "consumer-instance-2", "client_id": "client-2", "current_offset": 1, "log_end_offset": 11, "lag": 10 }, { "kind": "KafkaConsumerLag", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lags/topic-1/partitions/3", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-1/lag=topic-1/partition=3" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "topic_name": "topic-1", "partition_id": 3, "consumer_id": "consumer-3", "instance_id": "consumer-instance-3", "client_id": "client-3", "current_offset": 1, "log_end_offset": 1, "lag": 0 } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. 
It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/consumers/{consumer_id} **Get Consumer** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the consumer specified by the `consumer_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **consumer_group_id** (*string*) – The consumer group ID. * **consumer_id** (*string*) – The consumer ID. **Example request:** ```http GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/consumers/{consumer_id} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The consumer. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaConsumer", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/consumers/consumer-1", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-1/consumer=consumer-1" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "consumer_id": "consumer-1", "instance_id": "consumer-instance-1", "client_id": "client-1", "assignments": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/consumers/consumer-1/assignments" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. 
**kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/consumers/{consumer_id}/assignments **List Consumer Assignments** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return a list of partition assignments for the specified consumer. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **consumer_group_id** (*string*) – The consumer group ID. * **consumer_id** (*string*) – The consumer ID. **Example request:** ```http GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/consumers/{consumer_id}/assignments HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of consumer group assignments. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaConsumerAssignmentList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/consumers/consumer-1/assignments", "next": null }, "data": [ { "kind": "KafkaConsumerAssignment", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/consumers/consumer-1/assignments/topic-1/partitions/1", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-1/consumer=consumer-1/assignment=topic=1/partition=1" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "consumer_id": "consumer-1", "topic_name": "topic-1", "partition_id": 1, "partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/partitions/1" }, "lag": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lags/topic-1/partitions/1" } }, { "kind": "KafkaConsumerAssignment", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/consumers/consumer-1/assignments/topic-2/partitions/2", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-1/consumer=consumer-1/assignment=topic=2/partition=2" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "consumer_id": "consumer-1", "topic_name": "topic-2", "partition_id": 2, "partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-2/partitions/2" }, "lag": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lags/topic-2/partitions/2" } }, { "kind": "KafkaConsumerAssignment", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/consumers/consumer-1/assignments/topic-3/partitions/3", "resource_name": 
"crn:///kafka=cluster-1/consumer-group=consumer-group-1/consumer=consumer-1/assignment=topic=3/partition=3" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "consumer_id": "consumer-1", "topic_name": "topic-3", "partition_id": 3, "partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-3/partitions/3" }, "lag": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lags/topic-3/partitions/3" } } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/consumers/{consumer_id}/assignments/{topic_name}/partitions/{partition_id} **Get Consumer Assignment** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return information about the assignment for the specified consumer to the specified partition. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **consumer_group_id** (*string*) – The consumer group ID. * **consumer_id** (*string*) – The consumer ID. * **topic_name** (*string*) – The topic name. * **partition_id** (*integer*) – The partition ID. **Example request:** ```http GET /clusters/{cluster_id}/consumer-groups/{consumer_group_id}/consumers/{consumer_id}/assignments/{topic_name}/partitions/{partition_id} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The consumer group assignment. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaConsumerAssignment", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/consumers/consumer-1/assignments/topic-1/partitions/1", "resource_name": "crn:///kafka=cluster-1/consumer-group=consumer-group-1/consumer=consumer-1/assignment=topic=1/partition=1" }, "cluster_id": "cluster-1", "consumer_group_id": "consumer-group-1", "consumer_id": "consumer-1", "topic_name": "topic-1", "partition_id": 1, "partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/partitions/1" }, "lag": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/consumer-groups/consumer-group-1/lags/topic-1/partitions/1" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. 
**kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### PATCH /clusters/{cluster_id}/topics/{topic_name} **Update Partition Count** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Increase the number of partitions for a topic. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **topic_name** (*string*) – The topic name. **Example request:** ```http PATCH /clusters/{cluster_id}/topics/{topic_name} HTTP/1.1 Host: example.com Content-Type: application/json { "partitions_count": 10 } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The topic. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaTopic", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1", "resource_name": "crn:///kafka=cluster-1/topic=topic-1" }, "cluster_id": "cluster-1", "topic_name": "topic-1", "is_internal": false, "replication_factor": 3, "partitions_count": 1, "partitions": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/partitions" }, "configs": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/configs" }, "partition_reassignments": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/partitions/-/reassignments" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **topic_update_partitions_invalid:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40002, "message": "Topic already has 1 partitions." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### DELETE /clusters/{cluster_id}/topics/{topic_name} **Delete Topic** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Delete the topic with the given `topic_name`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **topic_name** (*string*) – The topic name. * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – No Content * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – Indicates attempted access to an unreachable or non-existing resource like e.g. an unknown topic or partition. GET requests to endpoints not allowed in the accesslists will also result in this response. **endpoint_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "HTTP 404 Not Found" } ``` **cluster_not_found:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 404, "message": "Cluster my-cluster cannot be found." } ``` **unknown_topic_or_partition:** ```http HTTP/1.1 404 Not Found Content-Type: application/json { "error_code": 40403, "message": "This server does not host this topic-partition." } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. 
**Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/links/{link_name}/configs/{config_name} **Describe the config under the cluster link** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **link_name** (*string*) – The link name * **config_name** (*string*) – The link config name **Example request:** ```http GET /clusters/{cluster_id}/links/{link_name}/configs/{config_name} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – Config name and value **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaLinkConfigData", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/1Rh_4htxSuen7RYGvGmgNw/links/my-new-link-1", "resource_name": null }, "cluster_id": "1Rh_4htxSuen7RYGvGmgNw", "name": "consumer.offset.sync.ms", "value": "3825940", "is_default": false, "is_read_only": false, "is_sensitive": false, "source": "DYNAMIC_CLUSTER_LINK_CONFIG", "synonyms": [ "cosm" ], "link_name": "link-db-1" } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/links/{link_name}/mirrors/{mirror_topic_name} **Describe the mirror topic** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **link_name** (*string*) – The link name * **mirror_topic_name** (*string*) – Cluster Linking mirror topic name * **Query Parameters:** * **include_state_transition_errors** (*boolean*) – Whether to include mirror state transition errors in the response. Default: false **Example request:** ```http GET /clusters/{cluster_id}/links/{link_name}/mirrors/{mirror_topic_name} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – Metadata of the mirror topic **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaMirrorData", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/link/link-1/mirrors/topic-1", "resource_name": "crn:///kafka=cluster-1" }, "link_name": "link-sb-1", "mirror_topic_name": "topic-1", "source_topic_name": "topic-1", "num_partitions": 3, "mirror_lags": [ { "partition": 0, "lag": 0, "last_source_fetch_offset": 0 }, { "partition": 1, "lag": 10000, "last_source_fetch_offset": 1000 }, { "partition": 2, "lag": 40000, "last_source_fetch_offset": 12030 } ], "mirror_status": "ACTIVE", "mirror_topic_error": "NO_ERROR", "state_time_ms": 1612550939300 } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /kafka/v3/clusters/{cluster_id}/share-groups **List Share Groups** [![Early Access](https://img.shields.io/badge/Lifecycle%20Stage-Early%20Access-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the list of share groups that belong to the specified Kafka cluster. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. **Example request:** ```http GET /kafka/v3/clusters/{cluster_id}/share-groups HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of share groups. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaShareGroupList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups", "next": null }, "data": [ { "kind": "KafkaShareGroup", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-1" }, "cluster_id": "cluster-1", "share_group_id": "share-group-1", "state": "STABLE", "coordinator": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1" }, "consumers": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/consumers" }, "consumer_count": 2, "partition_count": 3, "assigned_topic_partitions": [ { "kind": "KafkaShareGroupTopicPartition", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/assigned-topic-partitions/topic-1/0", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-1/topic-partition=topic-1:0" }, "topic_name": "topic-1", "partition_id": 0, "partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/partitions/0" } } ] }, { "kind": "KafkaShareGroup", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-2", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-2" }, "cluster_id": "cluster-1", "share_group_id": "share-group-2", "state": "EMPTY", "coordinator": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/2" }, "consumers": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-2/consumers" }, "consumer_count": 2, "partition_count": 3, "assigned_topic_partitions": [ { "kind": "KafkaShareGroupTopicPartition", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-2/assigned-topic-partitions/topic-1/0", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-2/topic-partition=topic-1:0" }, "topic_name": "topic-1", "partition_id": 0, "partition": { "related": 
"https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/partitions/0" } } ] } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /kafka/v3/clusters/{cluster_id}/share-groups/{group_id} **Get Share Group** [![Early Access](https://img.shields.io/badge/Lifecycle%20Stage-Early%20Access-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the share group specified by the `group_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **group_id** (*string*) – The group ID. **Example request:** ```http GET /kafka/v3/clusters/{cluster_id}/share-groups/{group_id} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The share group. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaShareGroup", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-1" }, "cluster_id": "cluster-1", "share_group_id": "share-group-1", "state": "STABLE", "coordinator": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1" }, "consumers": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/consumers" }, "consumer_count": 2, "partition_count": 3, "assigned_topic_partitions": [ { "kind": "KafkaShareGroupTopicPartition", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/assigned-topic-partitions/topic-1/0", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-1/topic-partition=topic-1:0" }, "topic_name": "topic-1", "partition_id": 0, "partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/partitions/0" } }, { "kind": "KafkaShareGroupTopicPartition", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/assigned-topic-partitions/topic-1/1", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-1/topic-partition=topic-1:1" }, "topic_name": "topic-1", "partition_id": 1, "partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/partitions/1" } } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." 
} ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /kafka/v3/clusters/{cluster_id}/share-groups/{group_id}/consumers/{consumer_id} **Get Share Group Consumer** [![Early Access](https://img.shields.io/badge/Lifecycle%20Stage-Early%20Access-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the consumer specified by the `consumer_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **group_id** (*string*) – The group ID. * **consumer_id** (*string*) – The consumer ID. **Example request:** ```http GET /kafka/v3/clusters/{cluster_id}/share-groups/{group_id}/consumers/{consumer_id} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The consumer. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaShareGroupConsumer", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/consumers/consumer-1", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-1/consumer=consumer-1" }, "cluster_id": "cluster-1", "group_id": "share-group-1", "consumer_id": "consumer-1", "client_id": "client-1", "assignments": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/consumers/consumer-1/assignments" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. 
**Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /kafka/v3/clusters/{cluster_id}/share-groups/{group_id}/consumers/{consumer_id}/assignments **List Share Group Consumer Assignments** [![Early Access](https://img.shields.io/badge/Lifecycle%20Stage-Early%20Access-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the consumer assignments specified by the `consumer_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **group_id** (*string*) – The group ID. * **consumer_id** (*string*) – The consumer ID. **Example request:** ```http GET /kafka/v3/clusters/{cluster_id}/share-groups/{group_id}/consumers/{consumer_id}/assignments HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of share group assignments. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaConsumerAssignmentList", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/consumers/consumer-1/assignments", "next": null }, "data": [ { "kind": "KafkaShareGroupConsumerAssignment", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/consumers/consumer-1/assignments/topic-1/partitions/1", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-1/consumer=consumer-1/assignment=topic=1/partition=1" }, "cluster_id": "cluster-1", "group_id": "share-group-1", "consumer_id": "consumer-1", "topic_name": "topic-1", "partition_id": 1, "partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-1/partitions/1" } }, { "kind": "KafkaConsumerAssignment", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/consumers/consumer-1/assignments/topic-2/partitions/2", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-1/consumer=consumer-1/assignment=topic=2/partition=2" }, "cluster_id": "cluster-1", "group_id": "share-group-1", "consumer_id": "consumer-1", "topic_name": "topic-2", "partition_id": 2, "partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-2/partitions/2" } }, { "kind": "KafkaConsumerAssignment", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/share-groups/share-group-1/consumers/consumer-1/assignments/topic-3/partitions/3", "resource_name": "crn:///kafka=cluster-1/share-group=share-group-1/consumer=consumer-1/assignment=topic=3/partition=3" }, "cluster_id": "cluster-1", "group_id": "share-group-1", "consumer_id": "consumer-1", "topic_name": "topic-3", "partition_id": 3, "partition": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/topics/topic-3/partitions/3" } } ] } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates 
a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [403 Forbidden](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) – Indicates a client authorization error. Kafka authorization failures will contain error code 40301 in the response body. **kafka_authorization_failed:** ```http HTTP/1.1 403 Forbidden Content-Type: application/json { "error_code": 40301, "message": "Request is not authorized" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/brokers/{broker_id}/tasks/{task_type} **Get single Broker Task.** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return a single Broker Task specified with `task_type` for broker specified with `broker_id` in the cluster specified with `cluster_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **broker_id** (*integer*) – The Kafka broker ID. * **task_type** (*string*) – The Kafka broker task type. **Example request:** ```http GET /clusters/{cluster_id}/brokers/{broker_id}/tasks/{task_type} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The broker task **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaBrokerTask", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1/tasks/add-broker", "resource_name": "crn:///kafka=cluster-1/broker=1/task=1" }, "cluster_id": "cluster-1", "broker_id": 1, "task_type": "add-broker", "task_status": "FAILED", "sub_task_statuses": { "partition_reassignment_status": "ERROR" }, "created_at": "2019-10-12T07:20:50Z", "updated_at": "2019-10-12T07:20:55Z", "error_code": 10013, "error_message": "The Confluent Balancer operation was overridden by a higher priority operation", "broker": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ``` ### GET /clusters/{cluster_id}/remove-broker-tasks/{broker_id} **Get Remove Broker Task** [![Generally Available](https://img.shields.io/badge/Lifecycle%20Stage-Generally%20Available-%2345c6e8)](#section/Versioning/API-Lifecycle-Policy) Return the remove broker task for the specified `broker_id`. * **Parameters:** * **cluster_id** (*string*) – The Kafka cluster ID. * **broker_id** (*integer*) – The Kafka broker ID. **Example request:** ```http GET /clusters/{cluster_id}/remove-broker-tasks/{broker_id} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The remove broker task. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "kind": "KafkaRemoveBrokerTask", "metadata": { "self": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/remove-broker-tasks/1", "resource_name": "crn:///kafka=cluster-1/remove-broker-task=1" }, "cluster_id": "cluster-1", "broker_id": 1, "shutdown_scheduled": false, "broker_replica_exclusion_status": "COMPLETED", "partition_reassignment_status": "FAILED", "broker_shutdown_status": "CANCELED", "error_code": 10006, "error_message": "Error while computing the initial remove broker plan for brokers [1] prior to shutdown.", "broker": { "related": "https://pkc-00000.region.provider.confluent.cloud/kafka/v3/clusters/cluster-1/brokers/1" } } ``` * [400 Bad Request](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.1) – Indicates a bad request error. It could be caused by an unexpected request body format or other forms of request validation failure. **bad_request_cannot_deserialize:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 400, "message": "Cannot deserialize value of type `java.lang.Integer` from String \"A\": not a valid `java.lang.Integer` value" } ``` **unsupported_version_exception:** ```http HTTP/1.1 400 Bad Request Content-Type: application/json { "error_code": 40035, "message": "The version of this API is not supported in the underlying Kafka cluster." } ``` * [401 Unauthorized](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.2) – Indicates a client authentication error. Kafka authentication failures will contain error code 40101 in the response body. **kafka_authentication_failed:** ```http HTTP/1.1 401 Unauthorized Content-Type: application/json { "error_code": 40101, "message": "Authentication failed" } ``` * [429 Too Many Requests](https://www.rfc-editor.org/rfc/rfc6585#section-4) – Indicates that a rate limit threshold has been reached, and the client should retry again later. **Example response:** ```http HTTP/1.1 429 Too Many Requests Content-Type: text/html { "description": "A sample response from Jetty's DoSFilter.", "value": " Error 429 Too Many Requests

HTTP ERROR 429 Too Many Requests

URI: /v3/clusters/my-cluster
STATUS: 429
MESSAGE: Too Many Requests
SERVLET: default
" } ``` * *5XX* – A server-side problem that might not be addressable from the client side. Retriable Kafka errors will contain error code 50003 in the response body. **generic_internal_server_error:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 500, "message": "Internal Server Error" } ``` **produce_v3_missing_schema:** ```http HTTP/1.1 5XX - Content-Type: application/json { "error_code": 50002, "message": "Error when fetching latest schema version. subject = my-topic" } ```

### SASL Authentication

Kafka SASL configurations are described [here](../../../security/authentication/overview.md#kafka-sasl-auth). Note that all of the SASL configurations (for communication between the Admin REST APIs and the broker) are prefixed with `client.`, or alternatively `admin.`.

To enable SASL authentication with the Kafka broker, set `kafka.rest.client.security.protocol` to either `SASL_PLAINTEXT` or `SASL_SSL`. Then set `kafka.rest.client.sasl.jaas.config` with the credentials to be used by the Admin REST APIs to authenticate with Kafka. For example:

```none
kafka.rest.client.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="kafkarest" password="kafkarest";
```

Alternatively, you can create a JAAS configuration file, for example `CONFLUENT_HOME/etc/kafka/server-jaas.properties`:

```bash
KafkaClient {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="kafkarest"
  password="kafkarest";
};
```

The name of the section in the JAAS file must be `KafkaClient`. Then pass it as a JVM argument:

```bash
export KAFKA_OPTS="-Djava.security.auth.login.config=${CONFLUENT_HOME}/etc/kafka/server-jaas.properties"
```

For details about configuring Kerberos, see [JDK’s Kerberos Requirements](https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/tutorials/KerberosReq.html).

## Important Configuration Options

The full set of configuration options is documented [here](config.md#kafkarest-config). However, some configurations should be changed for production. Some **must** be changed because they depend on your cluster layout:

`bootstrap.servers` : A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping. This list only impacts the initial hosts used to discover the full set of servers. This list should be in the form `host1:port1,host2:port2,...`. Because these servers are only used for the initial connection to discover the full cluster membership (which may change dynamically), this list does not require the full set of servers. You might want to specify multiple servers in case one goes down. * Type: list * Default: * Valid Values: * Importance: high

`schema.registry.url` : The base URL for Schema Registry that should be used by the serializer. * Type: string * Default: “[http://localhost:8081](http://localhost:8081)” * Importance: high

#### NOTE

The configuration property `auto.register.schemas` is not supported for Kafka REST Proxy.

`id` : Unique ID for this REST server instance. This is used in generating unique IDs for consumers that do not specify their ID. The ID is empty by default, which makes a single server setup easier to get up and running, but is not safe for multi-server deployments where automatic consumer IDs are used. * Type: string * Default: “” * Importance: high
Other settings are important to the health and performance of the proxy; you should consider changing these based on your specific use case. `consumer.request.max.bytes` : Maximum number of bytes in message keys and values returned by a single request. Smaller values reduce the maximum memory used by a single consumer and may be helpful to clients that cannot perform a streaming decode of responses, limiting the maximum memory used to decode and process a single JSON payload. Conversely, larger values may be more efficient because many messages can be batched into a single request, reducing the number of HTTP requests (and network round trips) required to consume the same set of messages. Note that this can also be overridden by clients on a per-request basis using the `max_bytes` query parameter. However, this setting controls the absolute maximum; `max_bytes` settings exceeding this value will be ignored. * Type: long * Default: 67108864 * Importance: medium `fetch.min.bytes` : The minimum number of bytes in message keys and values returned by a single request before the timeout of `consumer.request.timeout.ms` passes. * Type: int * Default: -1 * Importance: medium `consumer.request.timeout.ms` : The maximum total time to wait for messages for a request if the maximum request size has not yet been reached. The consumer uses a timeout to enable batching. A larger value will allow the consumer to wait longer, possibly including more messages in the response. However, this value is also a lower bound on the latency of consuming a message from Kafka. If consumers need low latency message delivery, then specify a lower value. * Type: int * Default: 1000 * Importance: medium `consumer.threads` : The maximum number of threads to run consumer requests on. Consumer requests are run synchronously, one per thread. You must set this value higher than the maximum number of consumers in a single consumer group, otherwise rebalances will deadlock. * Type: int * Default: 50 * Importance: medium `host.name` : The host name used to generate absolute URLs for consumers. If empty, the default canonical hostname is used. You may need to set this value if the FQDN of your host cannot be automatically determined. * Type: string * Default: “” * Importance: medium ## Schemas Although the records serialized to Kafka are opaque bytes, they must have some rules about their structure to make it possible to process them. One aspect of this structure is the schema of the data, which defines its shape and fields. Is it an integer? Is it a map with keys `foo`, `bar`, and `baz`? Something else? Without any mechanism for enforcement, schemas are implicit. A consumer, somehow, needs to know the form of the produced data. Frequently this happens by getting a group of people to agree verbally on the schema. This approach, however, is error prone. It’s often better if the schema can be managed centrally, audited, and enforced programmatically. [Confluent Schema Registry](../../schema-registry/index.md#schemaregistry-intro), a project outside of Kafka, helps with schema management. Schema Registry enables producers to register a topic with a schema so that when any further data is produced, it is rejected if it doesn’t conform to the schema. Consumers can consult Schema Registry to find the schema for topics they don’t know about. Rather than having you glue together producers, consumers, and schema configuration, ksqlDB integrates transparently with Schema Registry.
By enabling a configuration option so that the two systems can talk to each other, ksqlDB stores all stream and table schemas in Schema Registry. These schemas can then be downloaded and used by any application working with ksqlDB data. Moreover, ksqlDB can infer the schemas of existing topics automatically, so that you don’t need to declare their structure when you define the stream or table over it. ## Content Types The ksqlDB HTTP API uses content types for requests and responses to indicate the serialization format of the data and the API version. Your request should specify this serialization format and version in the `Accept` header, for example: ```none Accept: application/vnd.ksql.v1+json ``` The less specific `application/json` content type is also permitted. However, this is only for compatibility and ease of use, and you should use the versioned value where possible. `application/json` maps to the latest versioned content type, meaning the response may change after upgrading the server to a later version. The server also supports content negotiation, so you may include multiple, weighted preferences: ```none Accept: application/vnd.ksql.v1+json; q=0.9, application/json; q=0.5 ``` For example, content negotiation is useful when a new version of the API is preferred, but you are not sure if it is available yet. Here’s an example request that returns the results from the `LIST STREAMS` command: ```bash curl -X "POST" "http://localhost:8088/ksql" \ -H "Accept: application/vnd.ksql.v1+json" \ -d $'{ "ksql": "LIST STREAMS;", "streamsProperties": {} }' ``` Here’s an example request that retrieves streaming data from `TEST_STREAM`: ```bash curl -X "POST" "http://localhost:8088/query" \ -H "Accept: application/vnd.ksql.v1+json" \ -d $'{ "ksql": "SELECT * FROM TEST_STREAM EMIT CHANGES;", "streamsProperties": {} }' ``` A `PROTOBUF` content type where the rows are serialized in the `PROTOBUF` format is also supported for querying the `/query` and `/query-stream` endpoints. You can specify this serialization format in the `Accept` header: ```none Accept: application/vnd.ksql.v1+protobuf ``` The following example shows a curl command that issues a Pull query on a table called `CURRENTLOCATION` with the `PROTOBUF` content type: ```bash curl -X "POST" "http://localhost:8088/query" \ -H "Accept: application/vnd.ksql.v1+protobuf" \ -d $'{ "ksql": "SELECT * FROM CURRENTLOCATION;", "streamsProperties": {} }' ``` Response: ```json [{"header":{"queryId":"query_1655152127973","schema":"`PROFILEID` STRING KEY, `LA` DOUBLE, `LO` DOUBLE","protoSchema":"syntax = \"proto3\";\n\nmessage ConnectDefault1 {\n string PROFILEID = 1;\n double LA = 2;\n double LO = 3;\n}\n"}}, {"row":{"protobufBytes":"CggxOGY0ZWE4NhF90LNZ9bFCQBmASL99HYRewA=="}}, {"row":{"protobufBytes":"Cgg0YTdjN2I0MRFAE2HD07NCQBnM7snDQoVewA=="}}, {"row":{"protobufBytes":"Cgg0YWI1Y2JhZBGKsOHplbJCQBmMSuoENIVewA=="}}, {"row":{"protobufBytes":"Cgg0ZGRhZDAwMBHNO07RkeRCQBk9m1Wfq5lewA=="}}, {"row":{"protobufBytes":"Cgg4YjZlYWU1ORFtxf6ye7JCQBmMSuoENIVewA=="}}, {"row":{"protobufBytes":"CghjMjMwOWVlYxGUh4Va0+RCQBn0/dR46ZpewA=="}}] ``` The `protoSchema` field in the `header` corresponds to the content of a `.proto` file that the proto compiler uses at build time. Use the `protoSchema` field to deserialize the `protobufBytes` into `PROTOBUF` messages. Provide the `--basic` and `--user` options if basic HTTPS authentication is enabled on the cluster, as shown in the following command. 
```bash curl -X "POST" "https://localhost:8088/ksql" \ -H "Accept: application/vnd.ksql.v1+json" \ --basic --user ":" \ -d $'{ "ksql": "LIST STREAMS;", "streamsProperties": {} }' ``` ## Can ksqlDB connect to an Apache Kafka cluster over TLS and authenticate using SASL? Yes. Internally, ksqlDB uses standard Kafka consumers and producers. The procedure to securely connect ksqlDB to Kafka is the same as connecting any app to Kafka. For more information, see [Configure Kafka Authentication](operate-and-deploy/installation/security.md#ksqldb-installation-security-configure-kafka-auth). ## Important Sizing Factors This section describes the important factors to consider when scoping out your ksqlDB deployment. **Throughput**: In general, higher throughput requires more resources. **Query Types**: Your realized throughput will largely be a function of the type of queries you run. You can think of ksqlDB queries as falling into these categories: - Project/Filter, e.g. `SELECT ... FROM ... WHERE ...` - Joins - Aggregations, e.g. `SUM, COUNT, TOPK, TOPKDISTINCT` A project/filter query reads records from an input stream or table, may filter the records according to some predicate, and performs stateless transformations on the columns before writing out records to a sink stream or table. Project/filter queries require the fewest resources. For a single project/filter query running on an instance provisioned as recommended above, you can expect to realize from ~40 MB/second up to the rate supported by your network. The throughput depends largely on the average message size and complexity. Processing small messages with many columns is CPU intensive and will saturate your CPU. Processing large messages with fewer columns requires less CPU, and ksqlDB will start saturating the network for such workloads. Stream-table joins read from and write to Kafka Streams state stores and require around twice the CPU of project/filter. Though Kafka Streams state stores are stored on disk, we recommend that you provision sufficient memory to keep the working set memory-resident to avoid expensive disk I/O. So expect around half the throughput and expect to provision higher-memory instances. Aggregations read from and may write to a state store for every record. They consume around twice the CPU of joins. The CPU required increases if the aggregation uses a window, as the state store must be updated for every window. **Number of Queries**: The available resources on a server are shared across all queries. So expect that the processing throughput per server will decrease proportionally with the number of queries it is executing (see the notes on vertically and horizontally scaling a ksqlDB cluster in this document to add more processing capacity in such situations). Furthermore, SQL queries run as Kafka Streams applications. Each query starts its own Kafka Streams worker threads, and uses its own consumers and producers. This adds a little bit of CPU overhead per query. You should avoid running a large number of queries on one ksqlDB cluster. Instead, use interactive mode to play with your data and develop sets of queries that function together. Then, run these in their own headless cluster. Check out the [Recommendations and Best Practices](#recommendations-and-best-practices) section for more details. **Data Schema**: ksqlDB handles mapping serialized Kafka records to columns in a stream or table’s schema. In general, more complex schemas with a higher ratio of columns to bytes of data require more CPU to process.
**Number of Partitions**: Kafka Streams creates one RocksDB state store instance for aggregations and joins for every topic partition processed by a given ksqlDB server. Each RocksDB state store instance has a memory overhead of 50 MB for its cache plus the data actually stored. **Key Space**: For aggregations and joins, Kafka Streams/RocksDB tries to keep the working set of a state store in memory to avoid I/O operations. If there are many keys, this requires more memory. It also makes reads and writes to the state store more expensive. Note that the size of the data in a state store is not limited by memory (RAM) but only by available disk space on a ksqlDB server. ## Next steps - See ksqlDB in action with the [ksqlDB Quick Start](../quickstart.md#ksqldb-quick-start). - Learn more with the [ksqlDB Tutorials and Examples](../tutorials/overview.md#ksql-tutorials). - Take the developer courses: [Introduction to ksqlDB](https://developer.confluent.io/learn-kafka/ksqldb/intro/) and [ksqlDB Architecture](https://developer.confluent.io/learn-kafka/inside-ksqldb/streaming-architecture/). ### Write the Kafka consumer code Now we can write the code that triggers side effects when anomalies are found. Add the following Java file at `src/main/java/io/ksqldb/tutorial/EmailSender.java`. This is a simple program that consumes events from Kafka and sends an email with SendGrid for each one it finds. There are a few constants to fill in, including a SendGrid API key. You can get one by signing up for SendGrid. ```java package io.ksqldb.tutorial; import org.apache.kafka.clients.consumer.ConsumerConfig; import org.apache.kafka.clients.consumer.ConsumerRecord; import org.apache.kafka.clients.consumer.ConsumerRecords; import org.apache.kafka.clients.consumer.KafkaConsumer; import org.apache.kafka.common.serialization.StringDeserializer; import io.confluent.kafka.serializers.KafkaAvroDeserializer; import io.confluent.kafka.serializers.KafkaAvroDeserializerConfig; import io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig; import com.sendgrid.SendGrid; import com.sendgrid.Request; import com.sendgrid.Response; import com.sendgrid.Method; import com.sendgrid.helpers.mail.Mail; import com.sendgrid.helpers.mail.objects.Email; import com.sendgrid.helpers.mail.objects.Content; import java.time.Duration; import java.time.Instant; import java.time.ZoneId; import java.time.format.DateTimeFormatter; import java.time.format.FormatStyle; import java.util.Collections; import java.util.Properties; import java.util.Locale; import java.io.IOException; public class EmailSender { // Matches the broker port specified in the Docker Compose file. private final static String BOOTSTRAP_SERVERS = "localhost:29092"; // Matches the Schema Registry port specified in the Docker Compose file. private final static String SCHEMA_REGISTRY_URL = "http://localhost:8081"; // Matches the topic name specified in the ksqlDB CREATE TABLE statement. private final static String TOPIC = "possible_anomalies"; // For you to fill in: which address SendGrid should send from. private final static String FROM_EMAIL = "<< FILL ME IN >>"; // For you to fill in: the SendGrid API key to use their service. 
private final static String SENDGRID_API_KEY = "<< FILL ME IN >>"; private final static SendGrid sg = new SendGrid(SENDGRID_API_KEY); private final static DateTimeFormatter formatter = DateTimeFormatter.ofLocalizedDateTime(FormatStyle.SHORT) .withLocale(Locale.US) .withZone(ZoneId.systemDefault()); public static void main(final String[] args) throws IOException { final Properties props = new Properties(); props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS); props.put(ConsumerConfig.GROUP_ID_CONFIG, "email-sender"); props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true"); props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000"); props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); props.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, SCHEMA_REGISTRY_URL); props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class); props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class); props.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, true); try (final KafkaConsumer<String, PossibleAnomaly> consumer = new KafkaConsumer<>(props)) { consumer.subscribe(Collections.singletonList(TOPIC)); while (true) { final ConsumerRecords<String, PossibleAnomaly> records = consumer.poll(Duration.ofMillis(100)); for (final ConsumerRecord<String, PossibleAnomaly> record : records) { final PossibleAnomaly value = record.value(); if (value != null) { sendEmail(value); } } } } } private static void sendEmail(PossibleAnomaly anomaly) throws IOException { Email from = new Email(FROM_EMAIL); Email to = new Email(anomaly.getEmailAddress().toString()); String subject = makeSubject(anomaly); Content content = new Content("text/plain", makeContent(anomaly)); Mail mail = new Mail(from, subject, to, content); Request request = new Request(); try { request.setMethod(Method.POST); request.setEndpoint("mail/send"); request.setBody(mail.build()); Response response = sg.api(request); System.out.println("Attempted to send email!\n"); System.out.println("Status code: " + response.getStatusCode()); System.out.println("Body: " + response.getBody()); System.out.println("Headers: " + response.getHeaders()); System.out.println("======================"); } catch (IOException ex) { throw ex; } } private static String makeSubject(PossibleAnomaly anomaly) { return "Suspicious activity detected for card " + anomaly.getCardNumber(); } private static String makeContent(PossibleAnomaly anomaly) { return String.format("Found suspicious activity for card number %s. %s transactions were made for a total of %s between %s and %s", anomaly.getCardNumber(), anomaly.getNAttempts(), anomaly.getTotalAmount(), formatter.format(Instant.ofEpochMilli(anomaly.getStartBoundary())), formatter.format(Instant.ofEpochMilli(anomaly.getEndBoundary()))); } } ``` #### Restore mirroring after a failover with truncate-and-restore If you want to restore mirroring after a `promote` or a `failover`, you can use the `truncate-and-restore` command. After failing over or promoting a mirror topic, you can run `truncate-and-restore` on the original primary topic, which makes it a mirror topic that fetches from the newly stopped mirror topic. This command also truncates and deletes any divergent records that were produced to the original primary cluster after the point of failover. This means that there could be some loss of data if your clients are not set up to reprocess data. To learn more, see [Convert a mirror topic to a normal topic](mirror-topics-cp.md#convert-mirror-topic-to-normal-topic).
`truncate-and-restore` is available only on [“bidirectional” links](mirror-topics-cp.md#bidirectional-linking-cp), and only in KRaft mode. To learn more about running Kafka in KRaft mode, see [KRaft Overview for Confluent Platform](../../kafka-metadata/kraft.md#kraft-overview), [KRaft Configuration for Confluent Platform](../../kafka-metadata/config-kraft.md#configure-kraft), and the [Platform Quick Start](../../get-started/platform-quickstart.md#cp-quickstart-step-1). Also, the [basic Cluster Linking tutorial](topic-data-sharing.md#tutorial-topic-data-sharing) includes a full walkthrough of how to run Cluster Linking in KRaft mode. #### IMPORTANT As of Confluent Platform 8.0, ZooKeeper is no longer available for new deployments. Confluent recommends KRaft mode for new deployments. To learn more about running Kafka in KRaft mode, see the [KRaft Overview](/platform/current/kafka-metadata/kraft.html#kraft-overview) and the KRaft steps in the [Platform Quick Start](/platform/current/get-started/platform-quickstart.html). To learn about migrating from older versions, see [Migrate from ZooKeeper to KRaft on Confluent Platform](/platform/current/installation/migrate-zk-kraft.html). This tutorial provides examples for KRaft mode only. Earlier versions of this documentation provide examples for both KRaft and ZooKeeper. For KRaft, the examples show a *combined mode* configuration, where for each cluster the broker and controller run on the same server. Currently, combined mode is not intended for production use but is shown here to simplify the tutorial. If you want to run controllers and brokers on separate servers, use KRaft in isolated mode. To learn more, see [KRaft Overview](/platform/current/kafka-metadata/kraft.html#kraft-overview) and [KRaft mode](https://docs.confluent.io/platform/current/installation/installing_cp/zip-tar.html#kraft-mode) under [Configure Confluent Platform for production](https://docs.confluent.io/platform/current/installation/installing_cp/zip-tar.html#configure-cp-for-production). ### Create the Confluent Cloud to Confluent Platform link 1. Create another user API key for this cluster link on your Confluent Cloud cluster. ```bash confluent api-key create --resource $CC_CLUSTER_ID ``` You use the same cluster that served as the destination in previous steps as the source cluster in the following steps; therefore, you create a different API key and secret for the same cluster to serve in this new role. 2. Keep the resulting API key and secret in a safe place. This tutorial refers to these as `<CC-API-KEY>` and `<CC-API-SECRET>`. You will add these to a configuration file in the next step. #### IMPORTANT If you are setting this up in production, you should use a service account API key instead of a user-associated key. To do this, you would create a service account for your cluster link, give the service account the requisite ACLs, then create an API key for the service account. It’s best practice for each cluster link to have its own API key and service account. A guide on [how to set up privileges to access Confluent Cloud clusters with a service account](https://docs.confluent.io/cloud/current/multi-cloud/cluster-linking/topic-data-sharing.html#set-up-privileges-for-the-cluster-link-to-access-topics-on-the-source-cluster) is provided in the topic data sharing tutorial. 3. Use `confluent kafka cluster describe` to get the Confluent Cloud cluster Endpoint URL. ```bash confluent kafka cluster describe $CC_CLUSTER_ID ``` This Endpoint URL will be referred to as `<CC-ENDPOINT>` in the following steps. 4.
Save your API key and secret along with the following configuration entries in a file called `$CONFLUENT_CONFIG/clusterlink-cloud-to-CP.config` that the Confluent Platform commands will use to authenticate into Confluent Cloud: ```bash $CONFLUENT_CONFIG/clusterlink-cloud-to-CP.config ``` The configuration entries you need in this file are as follows: ```bash bootstrap.servers=<CC-ENDPOINT> security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='<CC-API-KEY>' password='<CC-API-SECRET>'; ``` 5. Create the cluster link to Confluent Platform. If you want to follow this example exactly, name the cluster link `from-cloud-link`, but you can name it whatever you like. You will use the cluster link name to create and manipulate mirror topics. You cannot rename a cluster link once it’s created. The following command creates the cluster link on an unsecured Confluent Platform cluster. If you have security set up on your Confluent Platform cluster, you must pass security credentials to this command with `--command-config` as shown in [Setting Properties on a Cluster Link](https://docs.confluent.io/platform/current/multi-dc-deployments/cluster-linking/configs.html#setting-properties-on-a-cluster-link). ```bash kafka-cluster-links --bootstrap-server localhost:9092 \ --create --link from-cloud-link \ --config-file $CONFLUENT_CONFIG/clusterlink-cloud-to-CP.config \ --cluster-id $CC_CLUSTER_ID --command-config $CONFLUENT_CONFIG/CP-command.config ``` Your output should resemble the following: ```bash Cluster link 'from-cloud-link' creation successfully completed. ``` 6. Check that the link exists with the `kafka-cluster-links --list` command, as follows. ```bash kafka-cluster-links --list --bootstrap-server localhost:9092 --command-config $CONFLUENT_CONFIG/CP-command.config ``` Your output should resemble the following, showing the previous `from-on-prem-link` you created along with the new `from-cloud-link`: ```none Link name: 'from-on-prem-link', link ID: '7eb4304e-b513-41d2-903e-147dea62a01c', remote cluster ID: 'lkc-1vgo6', local cluster ID: 'G1pnOMOxSjWYIX8xuR2cfQ' Link name: 'from-cloud-link', link ID: 'b1a56076-4d6f-45e0-9013-ff305abd0e54', remote cluster ID: 'lkc-1vgo6', local cluster ID: 'G1pnOMOxSjWYIX8xuR2cfQ' ``` ### KRaft and ZooKeeper - As of Confluent Platform 8.0, ZooKeeper is no longer available for new deployments. Confluent recommends KRaft mode for new deployments. To learn more about running Kafka in KRaft mode, see [KRaft Overview for Confluent Platform](../../kafka-metadata/kraft.md#kraft-overview) and the KRaft steps in the [Quick Start for Confluent Platform](../../get-started/platform-quickstart.md#quickstart). To learn about migrating from older versions, see [Migrate from ZooKeeper to KRaft on Confluent Platform](../../installation/migrate-zk-kraft.md#migrate-zk-kraft). - Specifically, in relation to this migration to KRaft, `password.encoder.secret` is not required for KRaft mode, but is required when [migrating from ZooKeeper to KRaft](../../installation/migrate-zk-kraft.md#migrate-zk-kraft). Use of this parameter for Cluster Linking, when needed for older versions on ZooKeeper, is shown in [Tutorial: Link Confluent Platform and Confluent Cloud Clusters](hybrid-cp.md#cluster-link-hybrid-cp). To learn more about how this is handled in Confluent Platform 8.0 and later, see [Update password configurations dynamically](../../kafka/dynamic-config.md#dynamic-config-passwords-upgrade).
- This documentation provides examples for KRaft mode only. Earlier versions of this documentation provide examples for both KRaft and ZooKeeper. - Some examples in the various tutorials show a *combined mode* configuration, where for each cluster the broker and controller run on the same server. Currently, combined mode is not intended for production use but is shown here to simplify the tutorial. If you want to run controllers and brokers on separate servers, use KRaft in isolated mode. To learn more, see [KRaft Overview for Confluent Platform](../../kafka-metadata/kraft.md#kraft-overview) and [KRaft Configuration for Confluent Platform](../../kafka-metadata/config-kraft.md#configure-kraft). ## Known Issues, Limitations, and Best Practices * While the use of Single Message Transformations (SMTs) in Replicator is supported, it is not a best practice. The use of Apache Flink® or Kafka Streams is considered best practice because these are more scalable and easier to debug. * Replicator should not be used for serialization changes. In these cases, the recommended method is to use ksqlDB. To learn more, see the documentation on [ksqlDB](../../ksqldb/overview.md#ksql-home) and the tutorial on [How to convert a stream’s serialization format](https://developer.confluent.io/tutorials/changing-serialization-format/ksql.html) on the Confluent Developer site. * When running Replicator version 5.3.0 or later, set `connect.protocol=eager`, because there is a known issue where using the default of `connect.protocol=compatible` or `connect.protocol=sessioned` can cause task rebalancing issues and duplicate records. * If you encounter `RecordTooLargeException` when you use compressed records, set the record batch size for the Replicator producer to the highest possible value. When Replicator decompresses records while consuming from the source cluster, it checks the size of the uncompressed batch on the producer before recompressing them and may throw `RecordTooLargeException`. Setting the record batch size mitigates the exception, and compression proceeds as expected when the record is sent to the destination cluster. * The Replicator latency metric is calculated by subtracting the time the record was produced to the source from the time it was replicated on the destination. This works well in the real-time case, when records are actively being produced to the source cluster. However, if you are replicating old data, you will see very large latency values due to the old record timestamps. In the historical data case, the latency does not indicate how long Replicator is taking to replicate data; it indicates how much time has passed since the message that Replicator is currently replicating was originally produced. As Replicator proceeds over historical data, the latency metric should decrease quickly. * There’s an issue with the Replicator lag metric where the value `NaN` is reported if no lag sample was recorded in a given time window. This can happen if you have limited production in the source cluster, or if Replicator is not flushing data to the destination cluster fast enough to record enough samples in the given time window. In either case, the JMX metrics report `NaN` for the Replicator lag metric. `NaN` does not necessarily mean that the lag is 0; it means that there aren’t enough samples in the given time window to report lag.
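As a concrete illustration of the `connect.protocol` recommendation above, the Connect worker that runs Replicator can carry a fragment such as the following sketch; the broker address is a placeholder, and the override policy line simply mirrors the full worker example shown later in this section.

```properties
# Connect worker fragment for a Replicator deployment (placeholder broker address)
bootstrap.servers=destination-cluster:9092
# Work around the known rebalancing issue described above
connect.protocol=eager
# Needed only if the Replicator connector sets producer.override.* properties
connector.client.config.override.policy=All
```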
## MirrorMaker MirrorMaker is a stand-alone tool for copying data between two Kafka clusters. To learn more, see [Mirroring data between clusters](https://kafka.apache.org/documentation/#basic_ops_mirror_maker) in the Kafka documentation. MirrorMaker 2 is supported as a stand-alone executable, but is not supported as a connector. Confluent Replicator is a more complete solution that handles topic configuration and data, and integrates with Kafka Connect and Confluent Control Center to improve availability, scalability and ease of use. To learn more, try out the Quick Start [Tutorial: Replicate Data Across Kafka Clusters in Confluent Platform](replicator-quickstart.md#replicator-quickstart) and see [Migrate from Kafka MirrorMaker to Replicator in Confluent Platform](migrate-replicator.md#migrate-replicator). # Replicate Topics Across Kafka Clusters in Confluent Platform * [Overview](index.md) * [Example: Active-active Multi-Datacenter](replicator-docker-tutorial.md) * [Tutorial: Replicate Data Across Clusters](replicator-quickstart.md) * [Tutorial: Run as an Executable or Connector](replicator-run.md) * [Configure](configuration_options.md) * [Verify Configuration](replicator-verifier.md) * [Tune](replicator-tuning.md) * [Monitor](replicator-monitoring.md) * [Configure for Cross-Cluster Failover](replicator-failover.md) * [Migrate from MirrorMaker to Replicator](migrate-replicator.md) * [Replicator Schema Translation Example for Confluent Platform](replicator-schema-translation.md) ## Related content * For a practical guide to designing and configuring multiple Apache Kafka clusters to be resilient in case of a disaster scenario, see the [Disaster Recovery white paper](https://www.confluent.io/white-paper/disaster-recovery-for-multi-datacenter-apache-kafka-deployments/). This white paper provides a plan for failover, failback, and ultimately successful recovery. * For an overview of using Confluent Platform for data replication, see [Overview of Multi-Datacenter Deployment Solutions on Confluent Platform](../index.md#multi-dc). * For a quick start on how to configure Replicator and set up your own multi-cluster deployment, see [Tutorial: Replicate Data Across Kafka Clusters in Confluent Platform](replicator-quickstart.md#replicator-quickstart). * For an overview of Replicator see [Replicate Multi-Datacenter Topics Across Kafka Clusters in Confluent Platform](index.md#replicator-detail). * For an introduction to using Confluent Platform to create stretch clusters with followers, observers, and replica placement, see [Configure Multi-Region Clusters in Confluent Platform](../multi-region.md#bmrr). ### Convert Replicator connector configurations to Replicator executable configurations Replicator connect configuration can be converted to a Replicator executable configuration. One of the key differences between the two is that the Connect configuration has two configuration files (a worker properties file and a connector properties or JSON file) while Replicator executable has three configuration files (a consumer, a producer, and a replication properties file). It’s helpful to think about this in the following way: * The consumer configuration file contains all the properties you need to configure the consumer embedded within Replicator that consumes from the source cluster. This would include any special configurations you want to use to tune the source consumer, in addition to the necessary security and connection details needed for the consumer to connect to the source cluster. 
* The producer configuration file contains all the properties you need to configure the producer embedded within Replicator that produces to the destination cluster. This would include any special configurations you want to use to tune the destination producer, in addition to the necessary security and connection details needed for the producer to connect to the destination cluster. * The replication configuration file contains all the properties you need to configure the actual Replicator that does the work of taking the data from the source consumer and passing it to the destination producer. This would include all Connect-specific configurations needed for Replicator as well as any necessary Replicator configurations. If you have the following worker properties: ```none config.storage.replication.factor=3 offset.storage.replication.factor=3 status.storage.replication.factor=3 connect.protocol=eager connector.client.config.override.policy=All bootstrap.servers=destination-cluster:9092 ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username=\"destUser\" password=\"destPassword\"; ``` And the following Replicator JSON: ```json { "connector.class":"io.confluent.connect.replicator.ReplicatorSourceConnector", "tasks.max":4, "topic.whitelist":"test-topic", "topic.rename.format":"${topic}.replica", "confluent.license":"XYZ", "name": "replicator", "header.converter": "io.confluent.connect.replicator.util.ByteArrayConverter", "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter", "value.converter": "io.confluent.connect.replicator.util.ByteArrayConverter", "src.consumer.max.poll.records":"10000", "producer.override.linger.ms":"10", "producer.override.compression.type":"lz4", "src.kafka.bootstrap.servers": "source-cluster:9092", "src.kafka.ssl.endpoint.identification.algorithm": "https", "src.kafka.security.protocol": "SASL_SSL", "src.kafka.sasl.mechanism": "PLAIN", "src.kafka.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"sourceUser\" password=\"sourcePassword\";", "dest.kafka.bootstrap.servers": "destination-cluster:9092", "dest.kafka.ssl.endpoint.identification.algorithm": "https", "dest.kafka.security.protocol": "SASL_SSL", "dest.kafka.sasl.mechanism": "PLAIN", "dest.kafka.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"destUser\" password=\"destPassword\";" } ``` You can convert the configuration shown above in these two configuration files to the following Consumer, Producer, and Replication configurations needed to use the Replicator executable: **Consumer Configurations**: ```none bootstrap.servers=source-cluster:9092 ssl.endpoint.identification.algorithm=https security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username=\"sourceUser\" password=\"sourcePassword\"; max.poll.records=10000 ``` For the consumer configurations, strip the `src.kafka.` and `src.consumer.` prefixes and simply list the actual configuration you want for the source consumer. The Replicator executable will know that because this has been placed in the consumer configuration, it needs to apply these configurations to the source consumer that will poll the source cluster.
**Producer Configurations**: ```none bootstrap.servers=destination-cluster:9092 ssl.endpoint.identification.algorithm=https security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username=\"destUser\" password=\"destPassword\"; linger.ms=10 compression.type=lz4 ``` For the producer configurations, strip the `dest.kafka.` and `producer.override.` prefixes and simply list the actual configuration you want for the destination producer. The Replicator executable will know that because this has been placed in the producer configuration, it needs to apply these configurations to the destination producer that will write to the destination cluster. **Replication Configurations**: ```none config.storage.replication.factor=3 offset.storage.replication.factor=3 status.storage.replication.factor=3 connect.protocol=eager tasks.max=4 topic.whitelist=test-topic topic.rename.format=${topic}.replica confluent.license=XYZ name=replicator header.converter=io.confluent.connect.replicator.util.ByteArrayConverter key.converter=io.confluent.connect.replicator.util.ByteArrayConverter value.converter=io.confluent.connect.replicator.util.ByteArrayConverter ``` For the replication configurations, only include the configurations that are important for Replicator or Connect. It’s important to note that you don’t need the `connector.client.config.override.policy` configuration anymore, as the Replicator executable directly passes in the producer configurations specified in the configuration file. This makes it easier to think about configuring the important consumers and producers for replication, rather than incorporating an extra Connect configuration. ### Run Replicator on the source cluster Replicator should be run on the destination cluster if possible. If this is not practical, it is possible to run Replicator on the source cluster from Confluent Platform 5.4.0 onwards. Make the following changes to run Replicator in this way: * `connector.client.config.override.policy` to be set to `All` in the Connect worker configuration or in `--replication.config` if using Replicator Executable. * `bootstrap.servers` in the Connect worker configuration should point to the source cluster (for Replicator Executable specify this in `--producer.config`) * any client configurations (security etc.) for the source cluster should be provided in the Connect worker configuration (for Replicator Executable specify these in `--producer.config`) * `producer.override.bootstrap.servers` in the connector configuration should point to the destination cluster (for Replicator Executable specify this in `--replication.config`) * any client configurations (security etc.)
for the destination cluster should be provided in the connector configuration with prefix `producer.override.` (for Replicator Executable specify these in `--replication.config`) * configurations with the prefix `src.kafka.` and `dest.kafka` should be provided as usual An example configuration for Replicator running as a connector on the source cluster can be seen below: ```bash { "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector", "name": "replicator", "producer.override.ssl.endpoint.identification.algorithm": "https", "producer.override.sasl.mechanism": "PLAIN", "producer.override.request.timeout.ms": 20000, "producer.override.bootstrap.servers": "destination-cluster:9092", "producer.override.retry.backoff.ms": 500, "producer.override.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"someUser\" password=\"somePassword\";", "producer.override.security.protocol": "SASL_SSL", "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter", "value.converter": "io.confluent.connect.replicator.util.ByteArrayConverter", "topic.whitelist": "someTopic", "src.kafka.bootstrap.servers": "source-cluster:9092", "dest.kafka.bootstrap.servers": "destination-cluster:9092", "dest.kafka.ssl.endpoint.identification.algorithm": "https", "dest.kafka.security.protocol": "SASL_SSL", "dest.kafka.sasl.mechanism": "PLAIN", "dest.kafka.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"someUser\" password=\"somePassword\";" } ``` In this configuration Replicator is producing between clusters rather than consuming and the default producer configurations are not optimal for this. Consider adjusting the following configurations to increase the throughput of the producer flow: * `producer.override.linger.ms=500` * `producer.override.batch.size=600000` These values are provided as a starting point only and should be further tuned to your environment and use case. For more detail on running Replicator on the source cluster when the destination is Confluent Cloud, see [Confluent Replicator to Confluent Cloud Configurations](/cloud/current/get-started/examples/ccloud/docs/replicator-to-cloud-configuration-types.html). ## Configuration Options `schema.registry.url` : Comma-separated list of URLs for Schema Registry instances that can be used to register or look up schemas. * Type: list * Default: “” * Importance: high `auto.register.schemas` : Specify if the Serializer should attempt to register the Schema with Schema Registry. * Type: boolean * Default: true * Importance: medium `use.latest.version` : Only applies when `auto.register.schemas` is set to `false`. If `auto.register.schemas` is set to `false` and `use.latest.version` is set to `true`, then instead of deriving a schema for the object passed to the client for serialization, Schema Registry will use the latest version of the schema in the subject for serialization. The property `use.latest.version` can be set on producers or consumers to serialize or deserialize messages per the latest version. 
* Type: boolean * Default: false * Importance: medium #### NOTE To learn more, see how to use schema references to combine [multiple event types in the same topic](fundamentals/serdes-develop/index.md#multiple-event-types-same-topic-sr) with [Avro](fundamentals/serdes-develop/serdes-avro.md#multiple-event-types-same-topic-avro), [JSON Schema](fundamentals/serdes-develop/serdes-json.md#multiple-event-types-same-topic-json), or [Protobuf](fundamentals/serdes-develop/serdes-protobuf.md#multiple-event-types-same-topic-protobuf). `latest.compatibility.strict` : Only applies when `use.latest.version` is set to `true`. If `latest.compatibility.strict` is `true` (the default), then when using `use.latest.version=true` during serialization, a check is performed to verify that the latest subject version is backward compatible with the schema of the object being serialized. If the check fails, then an error results. If the check succeeds, then serialization is performed. If `latest.compatibility.strict` is `false`, then the latest subject version is used for serialization, without any compatibility check. Serialization may fail in this case. Relaxing the compatibility requirement (by setting `latest.compatibility.strict` to `false`) may be useful, for example, when implementing [Kafka Connect converters](../connect/index.md#connect-converters) and [schema references](fundamentals/serdes-develop/index.md#referenced-schemas). * Type: boolean * Default: true * Importance: medium #### NOTE To learn more about this setting, see [Schema Evolution and Compatibility for Schema Registry on Confluent Platform](fundamentals/schema-evolution.md#schema-evolution-and-compatibility). `max.schemas.per.subject` : Maximum number of schemas to create or cache locally. * Type: int * Default: 1000 * Importance: low `key.subject.name.strategy` : Determines how to construct the subject name under which the key schema is registered with Schema Registry. For additional information, see Schema Registry [Subject name strategy](fundamentals/serdes-develop/index.md#sr-schemas-subject-name-strategy). Any implementation of `io.confluent.kafka.serializers.subject.strategy.SubjectNameStrategy` can be specified. By default, `-key` is used as the subject. Specifying an implementation of `io.confluent.kafka.serializers.subject.SubjectNameStrategy` is deprecated as of `4.1.3` and if used may have some performance degradation. * Type: class * Default: class io.confluent.kafka.serializers.subject.TopicNameStrategy * Importance: medium `value.subject.name.strategy` : Determines how to construct the subject name under which the value schema is registered with Schema Registry. For additional information, see Schema Registry [Subject name strategy](fundamentals/serdes-develop/index.md#sr-schemas-subject-name-strategy). Any implementation of `io.confluent.kafka.serializers.subject.strategy.SubjectNameStrategy` can be specified. By default, `-value` is used as the subject. Specifying an implementation of `io.confluent.kafka.serializers.subject.SubjectNameStrategy` is deprecated as of `4.1.3` and if used may have some performance degradation. * Type: class * Default: class io.confluent.kafka.serializers.subject.TopicNameStrategy * Importance: medium `basic.auth.credentials.source` : Specify how to pick the credentials for the Basic authentication header. The supported values are URL, USER_INFO and SASL_INHERIT. 
* Type: string * Default: “URL” * Importance: medium `basic.auth.user.info` : Specify the user info for the Basic authentication in the form of {username}:{password}. schema.registry.basic.auth.user.info is a deprecated alias for this configuration. * Type: password * Default: “” * Importance: medium The following Schema Registry dedicated properties, configurable on the client, are available on Confluent Platform version 5.4.0 (and later). To learn more, see the information on configuring clients in [Additional configurations for HTTPS](security/index.md#sr-https-additional). `schema.registry.ssl.truststore.location` : The location of the trust store file. For example, `schema.registry.kafkastore.ssl.truststore.location=/etc/kafka/secrets/kafka.client.truststore.jks` * Type: string * Default: “” * Importance: medium `schema.registry.ssl.truststore.password` : The password for the trust store file. If a password is not set, access to the truststore is still available but integrity checking is disabled. * Type: password * Default: “” * Importance: medium `schema.registry.ssl.keystore.location` : The location of the key store file. This is optional for the client and can be used for two-way authentication for the client. For example, `schema.registry.kafkastore.ssl.keystore.location=/etc/kafka/secrets/kafka.schemaregistry.keystore.jks`. * Type: string * Default: “” * Importance: medium `schema.registry.ssl.keystore.password` : The store password for the key store file. This is optional for the client and only needed if `ssl.keystore.location` is configured. * Type: password * Default: “” * Importance: medium `schema.registry.ssl.key.password` : The password of the private key in the key store file. This is optional for the client. * Type: password * Default: “” * Importance: medium ### GET /schemas/ids/{int: id} Get the schema string identified by the input ID. * **Parameters:** * **id** (*int*) – the globally unique identifier of the schema * **format** (*string*) – Desired output format, dependent on schema type. For AVRO schemas, valid values are: `""` (default) or `resolved`. For PROTOBUF schemas, valid values are: `""` (default), `ignore_extensions`, or `serialized`. (The parameter does not apply to JSON schemas.) * **subject** (*string*) – Add `?subject=` at the end of this request to look for the subject in all contexts starting with the default context, and return the schema with the id from that context. To learn more about contexts, see the [exporters](#schemaregistry-api-exporters) API reference and the quick start and concepts guides for [Schema Linking on Confluent Platform](../schema-linking-cp.md#schema-linking-cp-overview) and [Schema Linking on Confluent Cloud](/cloud/current/sr/schema-linking.html). 
* **Response JSON Object:** * **schema** (*string*) – Schema string identified by the ID * **Status Codes:** * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – * Error code 40403 – Schema not found * [500 Internal Server Error](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.1) – * Error code 50001 – Error in the backend datastore **Example request**: ```http GET /schemas/ids/1 HTTP/1.1 Host: schemaregistry.example.com Accept: application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json ``` **Example response**: ```http HTTP/1.1 200 OK Content-Type: application/vnd.schemaregistry.v1+json { "schema": "{\"type\": \"string\"}" } ``` ### GET /schemas/ids/{int: id}/schema Retrieves only the schema identified by the input ID. * **Parameters:** * **id** (*int*) – the globally unique identifier of the schema * **format** (*string*) – Desired output format, dependent on schema type. For AVRO schemas, valid values are: `""` (default) or `resolved`. For PROTOBUF schemas, valid values are: `""` (default), `ignore_extensions`, or `serialized`. (The parameter does not apply to JSON schemas.) * **subject** (*string*) – Add `?subject=` at the end of this request to look for the subject in all contexts starting with the default context, and return the schema with the ID from that context. To learn more about contexts, see the [exporters](#schemaregistry-api-exporters) API reference and the quick start and concepts guides for [Schema Linking on Confluent Platform](../schema-linking-cp.md#schema-linking-cp-overview) and [Schema Linking on Confluent Cloud](/cloud/current/sr/schema-linking.html). * **Response JSON Object:** * **schema** (*string*) – Schema identified by the ID * **Status Codes:** * [404 Not Found](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5) – * Error code 40403 – Schema not found * [500 Internal Server Error](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.1) – * Error code 50001 – Error in the backend datastore **Example request**: ```http GET /schemas/ids/1/schema HTTP/1.1 Host: schemaregistry.example.com Accept: application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json ``` **Example response**: ```http HTTP/1.1 200 OK Content-Type: application/vnd.schemaregistry.v1+json "string" ``` ## Kafka producers and consumers for development and testing The Confluent and open source Apache Kafka® scripts for basic actions on Kafka clusters and topics live in `$CONFLUENT_HOME/bin`. A full reference for Confluent premium command line tools and utilities is provided in [CLI Tools for Confluent Platform](/platform/current/installation/cli-reference.html). These include Confluent-provided producers and consumers that you can run locally against a self-managed, locally installed Confluent Platform instance, against the [Confluent Platform demo](/platform/current/tutorials/cp-demo/docs/overview.html), or against Confluent Cloud clusters. In `$CONFLUENT_HOME/bin`, you will find: - `kafka-avro-console-consumer` - `kafka-avro-console-producer` - `kafka-protobuf-console-consumer` - `kafka-protobuf-console-producer` - `kafka-json-schema-console-consumer` - `kafka-json-schema-console-producer` These are provided in the same location as the original, generic `kafka-console-consumer` and `kafka-console-producer`; unlike the generic tools, the schema-aware producers expect a schema to be supplied for message values (the Avro console producer, for example, expects an Avro schema).
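To illustrate, a minimal invocation of the Avro console producer against a local broker and Schema Registry might look like the following sketch; the topic name and inline schema are illustrative only (the schema reuses the `Payment` example from this documentation), and on older releases you may need `--broker-list` instead of `--bootstrap-server`.

```bash
kafka-avro-console-producer \
  --bootstrap-server localhost:9092 \
  --topic transactions \
  --property schema.registry.url=http://localhost:8081 \
  --property value.schema='{"type":"record","name":"Payment","fields":[{"name":"id","type":"string"},{"name":"amount","type":"double"}]}'
# Then type one JSON record per line, for example:
# {"id": "1", "amount": 10.0}
```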
A reference for the open source utilities is provided in [Kafka Command-Line Interface (CLI) Tools](/kafka/operations-tools/kafka-tools.html). ### Topics and Schemas Schemas are associated with Kafka topics, organized under subjects in Schema Registry. (See [Terminology](../schema_registry_onprem_tutorial.md#schema-registry-terminology).) The quick start below describes how to migrate Schema Registry and the schemas it contains, but not Kafka topics. For a continuous migration (extend to cloud), you need only do a schema migration, since your topics continue to live in the primary, self-managed cluster. For a one-time migration (lift and shift), you must follow schema migration with topic migration, using [Replicator](../../multi-dc-deployments/replicator/index.md#replicator-detail) to migrate your topics to the Confluent Cloud cluster, as mentioned in [Related Content](#sr-next-steps-topics) after the quick start. The property `topic.rename.format` is described in [Destination Topics](../../multi-dc-deployments/replicator/configuration_options.md#rep-destination-topics) under [Replicator Configuration Reference for Confluent Platform](../../multi-dc-deployments/replicator/configuration_options.md#replicator-config-options). ### Single Datacenter Setup Within a single datacenter or location, a multi-node, multi-broker cluster provides Kafka data replication across the nodes. Producers write and consumers read data to/from topic partition leaders. Leaders replicate data to followers so that messages are copied to more than one broker. You can configure parameters on producers and consumers to optimize your single cluster deployment for various goals, including message durability and high availability. Kafka [producers can set the acks configuration parameter](../installation/configuration/producer-configs.md#cp-config-producer) to control when a write is considered successful. For example, setting producers to `acks=all` requires other brokers in the cluster acknowledge receiving the data before the leader broker responds to the producer. If a leader broker fails, the Kafka cluster recovers when a follower broker is elected leader and client applications can continue to write and read messages through the new leader. ##### ACLs and Security In a multi-DC setup with ACLs enabled, the schemas ACL topic must be replicated. In the case of an outage, the ACLs will be cached along with the schemas. Schema Registry will continue to run READs with ACLs if the primary Kafka cluster goes down. - For an overview of security strategies and protocols for Schema Registry, see [Secure Schema Registry for Confluent Platform](security/index.md#schemaregistry-security). - To learn how to configure ACLs on roles related to Schema Registry, see [Schema Registry ACL Authorizer for Confluent Platform](../confluent-security-plugins/schema-registry/authorization/sracl_authorizer.md#confluentsecurityplugins-sracl-authorizer). - To learn how to define Kafka topic based ACLs, see [Schema Registry Topic ACL Authorizer for Confluent Platform](../confluent-security-plugins/schema-registry/authorization/topicacl_authorizer.md#confluentsecurityplugins-topicacl-authorizer). - To learn about using role-based authorization with Schema Registry, see [Configure Role-Based Access Control for Schema Registry in Confluent Platform](security/rbac-schema-registry.md#schemaregistry-rbac). 
- To learn more about Replicator security, see [Security and ACL Configurations](../multi-dc-deployments/replicator/index.md#replicator-security-overview) in the Replicator documentation. ## Configuring Security for Schema ID Validation In general, Schema Registry initiates the connection to the brokers. Schema ID Validation is unique in that the broker(s) initiate the connection to Schema Registry. They do so in order to retrieve schemas from the registry, and verify that the messages they receive from producers match schemas associated with particular topics. With Schema ID Validation enabled, the sequence of tasks looks something like this: 1. A broker receives a message from a producer, and sees that it’s directed to a topic that has a schema associated. 2. The broker initiates a connection to Schema Registry. 3. The broker asks for the schema associated with the topic (by schema ID). 4. Schema Registry receives the request, finds the requested schema in its schema storage, and returns it to the broker. 5. The broker validates the schema ID. Therefore, to set up security on a cluster that has broker-side Schema ID Validation enabled on topics, you must configure settings on the Kafka broker to support this broker-initiated connection to Schema Registry. For multiple brokers, each broker must be configured. For example, for mTLS, ideally you would have a different certificate for each broker. Note that Schema Registry’s internal Kafka client to Kafka brokers is not relevant at all to the connection between broker-side Schema ID Validation and Schema Registry’s HTTP listeners. The security settings below do not reflect anything about the Schema Registry internal client-to-broker connection. The broker configurations below include `confluent.schema.registry.url`, which tells the broker how to connect to Schema Registry. You may already have configured this on your brokers, as a [prerequisite for using Schema Validation](#sv-set-sr-url-on-brokers). The rest of the settings shown are specific to security configurations. ## Troubleshoot error “Schema Registry is not set up” If you get an error message on Control Center when you try to access a topic schema (”Schema Registry is not set up”), first make sure that Schema Registry is running. Then verify that the Schema Registry `listeners` configuration matches the Control Center `confluent.controlcenter.schema.registry.url` configuration. Also check the HTTPS configuration parameters. ![image](images/c3-SR-not-set-up.png) For more information, see [A schema for message values has not been set for this topic](https://docs.confluent.io/control-center/current/installation/troubleshooting.html#c3-schema-registry-not-set-up), and start-up procedures for [Quick Start for Confluent Platform](../get-started/platform-quickstart.md#quickstart), or [Install Confluent Platform On-Premises](../installation/overview.md#installation), depending on which one of these you are using to run Confluent Platform. #### IMPORTANT As of Confluent Platform 7.5, ZooKeeper is deprecated for new deployments. Confluent recommends KRaft mode for new deployments. 
To learn more about running Kafka in KRaft mode, see [KRaft Overview for Confluent Platform](../kafka-metadata/kraft.md#kraft-overview), [KRaft Configuration for Confluent Platform](../kafka-metadata/config-kraft.md#configure-kraft), the [Platform Quick Start](../get-started/platform-quickstart.md#cp-quickstart-step-1), and [Settings for other Kafka and Confluent Platform components](../kafka-metadata/config-kraft.md#config-cp-components-kraft). The following example provides KRaft (*combined mode*) configurations. Another example of running multi-cluster Schema Registry in KRaft mode is shown in the [Schema Linking Quick Start for Confluent Platform](schema-linking-cp.md#schema-linking-cp-overview). Note that KRaft combined mode is for local experimentation only and is not supported by Confluent. ### Auto Schema Registration By default, client applications automatically register new schemas. If they produce new messages to a new topic, then they will automatically try to register new schemas. This is convenient in development environments, but in production environments it’s recommended that client applications do not automatically register new schemas. Best practice is to register schemas outside of the client application to control when schemas are registered with Schema Registry and how they evolve. Within the application, you can disable automatic schema registration by setting the configuration parameter `auto.register.schemas=false`, as shown in the following example. ```java props.put(AbstractKafkaAvroSerDeConfig.AUTO_REGISTER_SCHEMAS, false); ``` To manually register the schema outside of the application, you can use Control Center. First, create a new topic called `test` in the same way that you created a new topic called `transactions` earlier in the tutorial. Then from the **Schema** tab, click **Set a schema** to define the new schema. Specify values for: * `namespace`: a fully qualified name that avoids schema naming conflicts * `type`: [Avro data type](https://avro.apache.org/docs/1.8.1/spec.html#schemas), one of `record`, `enum`, `union`, `array`, `map`, `fixed` * `name`: unique schema name in this namespace * `fields`: one or more simple or complex data types for a `record`. The first field in this record is called `id`, and it is of type `string`. The second field in this record is called `amount`, and it is of type `double`. If you were to define the same schema as used earlier, you would enter the following in the schema editor: ```json { "type": "record", "name": "Payment", "namespace": "io.confluent.examples.clients.basicavro", "fields": [ { "name": "id", "type": "string" }, { "name": "amount", "type": "double" } ] } ``` If you prefer to connect directly to the REST endpoint in Schema Registry, then to define a schema for a new subject for the topic `test`, run the command below.
```bash curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \ --data '{"schema": "{\"type\":\"record\",\"name\":\"Payment\",\"namespace\":\"io.confluent.examples.clients.basicavro\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"},{\"name\":\"amount\",\"type\":\"double\"}]}"}' \ http://localhost:8081/subjects/test-value/versions ``` The command registers the schema and returns its ID; in this sample output, the new schema has the ID `1`: ```bash {"id":1} ``` ## Related content - Blog post: [Ensure Data Quality and Data Evolvability with a Secured Schema Registry](https://www.confluent.io/blog/ensure-data-quality-and-evolvability-with-secured-schema-registry/) - [Access Control (RBAC) for Schema Linking Exporters](../schema-linking-cp.md#cp-schema-linking-rbac) - [Configure Metadata Service (MDS) in Confluent Platform](../../kafka/configure-mds/index.md#rbac-mds-config) - [Use Role-Based Access Control (RBAC) for Authorization in Confluent Platform](../../security/authorization/rbac/overview.md#rbac-overview) - [Confluent CLI](https://docs.confluent.io/confluent-cli/current/installing.html) - [Role-Based Access Control for Confluent Platform Quick Start](../../security/authorization/rbac/rbac-cli-quickstart.md#rbac-cli-quickstart) - [Use Predefined RBAC Roles in Confluent Platform](../../security/authorization/rbac/rbac-predefined-roles.md#rbac-predefined-roles) - [Schema Registry Security Plugin for Confluent Platform](../../confluent-security-plugins/schema-registry/introduction.md#confluentsecurityplugins-schema-registry-security-plugin) - [Operation and Resource Support for Schema Registry in Confluent Platform](../../confluent-security-plugins/schema-registry/authorization/index.md#confluentsecurityplugins-schema-registry-authorization) ### Clients The new Producer and Consumer clients support security for Kafka versions 0.9.0 and higher. If you are using the Kafka Streams API, see how to configure the equivalent [SSL](/platform/current/clients/javadocs/javadoc/org/apache/kafka/common/config/SslConfigs.html) and [SASL](/platform/current/clients/javadocs/javadoc/org/apache/kafka/common/config/SaslConfigs.html) parameters. 1. Configure the following properties in a client properties file `client.properties`. ```bash sasl.mechanism=GSSAPI # Configure SASL_SSL if TLS/SSL encryption is enabled, otherwise configure SASL_PLAINTEXT security.protocol=SASL_SSL ``` 2. Configure a service name that matches the primary name of the Kafka server configured in the broker JAAS file. ```bash sasl.kerberos.service.name=kafka ``` 3. Configure the JAAS configuration property with a unique principal (typically the same name as the user running the client) and a keytab (secret key) for each client. ```bash sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \ useKeyTab=true \ storeKey=true \ keyTab="/etc/security/keytabs/kafka_client.keytab" \ principal="kafkaclient1@EXAMPLE.COM"; ``` 4. For command-line utilities like `kafka-console-consumer` or `kafka-console-producer`, `kinit` can be used along with `useTicketCache=true`.
```bash sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \ useTicketCache=true; ``` # Configure Clients for SASL/OAUTHBEARER authentication in Confluent Platform To configure Confluent Platform and Kafka clients to use SASL/OAUTHBEARER authentication with TLS encryption when connecting to Confluent Server brokers, add the following properties to your client’s `properties` file, replacing the placeholders with your actual values: ```none sasl.mechanism=OAUTHBEARER security.protocol=SASL_SSL ssl.truststore.location= ssl.truststore.password= sasl.login.callback.handler.class=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginCallbackHandler sasl.login.connect.timeout.ms=15000 # optional sasl.oauthbearer.token.endpoint.url= sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \ clientId="" \ clientSecret="" \ scope=""; # optional ``` The `scope` parameter is optional: it defines the level of access the client is requesting, but it is required if your identity provider does not have a default scope or your groups claim is linked to a scope. For Kafka Java clients that support SASL/OAUTHBEARER, allow specific IdP endpoints by setting the following configuration property: ```properties org.apache.kafka.sasl.oauthbearer.allowed.urls=,,... ``` This property specifies a comma-separated list of allowed IdP JWKS (JSON Web Key Set) and token endpoint URLs. Use \* (asterisk) as the value to allow any endpoint. ```properties org.apache.kafka.sasl.oauthbearer.allowed.urls=* ``` You should consult the specific Kafka client and IdP documentation for the exact interpretation and security implications of such a broad setting. Java applications should set this property as a JVM system property when launching the application: ```bash -Dorg.apache.kafka.sasl.oauthbearer.allowed.urls=,,... ``` Other clients (for example, Python, Go, and .NET) that are built on librdkafka use different property names and configuration mechanisms, so refer to the specific client library documentation for the equivalent OAUTHBEARER configuration properties. For details on the client configuration properties used in this example, see [Client Configuration Properties for Confluent Platform](../../../../clients/client-configs.md#client-producer-consumer-config-recs-cp).
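For Java applications, the same OAUTHBEARER settings can also be passed programmatically when constructing a client. The following is a minimal, illustrative sketch only (not taken from the Confluent documentation): the bootstrap server, truststore path and password, token endpoint, client credentials, scope, and topic name are placeholder assumptions that you would replace with your own values.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OAuthBearerProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder connection details -- replace with your broker and IdP values.
        props.put("bootstrap.servers", "broker1.example.com:9093");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "OAUTHBEARER");
        props.put("ssl.truststore.location", "/var/ssl/private/client.truststore.jks");
        props.put("ssl.truststore.password", "changeme");
        props.put("sasl.login.callback.handler.class",
            "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginCallbackHandler");
        props.put("sasl.oauthbearer.token.endpoint.url", "https://idp.example.com/oauth2/token");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required "
                + "clientId=\"my-client-id\" clientSecret=\"my-client-secret\" scope=\"kafka\";");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // Produce a single test record to verify that the OAuth login succeeds.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "hello"));
            producer.flush();
        }
    }
}
```

As described above, launch the JVM with `-Dorg.apache.kafka.sasl.oauthbearer.allowed.urls=...` listing your IdP token and JWKS endpoints so that the login callback handler is allowed to contact them.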
### Configure JavaScript clients Configure your Node.js client with UAMI-specific properties using the confluent-kafka-javascript client: ```javascript const { Kafka } = require('@confluentinc/kafka-javascript').KafkaJS; const bootstrapServers = ''; // Azure IMDS API version - use 2025-04-07 or later const azureIMDSApiVersion = '2025-04-07'; const bootstrapEndpoint = ''; const uamiClientId = ''; const azureIMDSQueryParams = `api-version=${azureIMDSApiVersion}&resource=${bootstrapEndpoint}&client_id=${uamiClientId}`; const logicalCluster = ''; const identityPoolId = ''; const kafka = new Kafka({ 'bootstrap.servers': bootstrapServers, 'security.protocol': 'SASL_SSL', 'sasl.mechanisms': 'OAUTHBEARER', 'sasl.oauthbearer.method': 'oidc', 'sasl.oauthbearer.metadata.authentication.type': 'azure_imds', 'sasl.oauthbearer.config': `query=${azureIMDSQueryParams}`, 'sasl.oauthbearer.extensions': `logicalCluster=${logicalCluster},identityPoolId=${identityPoolId}` }); const producer = kafka.producer(); await producer.connect(); ``` ## Related content - [Use Centralized ACLs with MDS for Authorization in Confluent Platform](../rbac/authorization-acl-with-mds.md#authorization-acl-with-mds) - [Schema Registry ACL Authorizer for Confluent Platform](../../../confluent-security-plugins/schema-registry/authorization/sracl_authorizer.md#confluentsecurityplugins-sracl-authorizer) - [Confluent Replicator to Confluent Cloud ACL Configurations](/cloud/current/get-started/examples/ccloud/docs/replicator-to-cloud-configuration-types.html) - [Configure Authorization of ksqlDB with Kafka ACLs](../../../ksqldb/operate-and-deploy/installation/security.md#ksqldb-installation-security-auth-with-acls) - [Required ACL setting for secure Kafka clusters](../../../streams/developer-guide/security.md#streams-developer-guide-security-acls) - [Cluster Linking Authorization (ACLs)](../../../multi-dc-deployments/cluster-linking/security.md#cluster-link-acls) - [Configure Control Center to work with Kafka ACLs on Confluent Platform](/control-center/current/security/config-c3-for-kafka-acls.html) - [Confluent CLI confluent iam acl](https://docs.confluent.io/confluent-cli/current/command-reference/iam/acl/index.html#confluent-iam-acl) - [Access Control Lists (ACLs) for Confluent Cloud](/cloud/current/access-management/acl.html) ### Authentication and group-based authorization using an LDAP server A Kerberos-enabled LDAP server (for example, Active Directory or Apache Directory Server) may be used for authentication as well as group-based authorization if users and groups are managed by this server. The instructions below use `SASL/GSSAPI` for authentication using AD or DS and obtain group membership of the users from the same server. The example is based on the assumption that you have the following three user principals and keytabs for these principals: * `kafka/localhost@EXAMPLE.COM`: Service principal for brokers * `alice@EXAMPLE.COM`: Client principal, member of group `Kafka Developers` * `ldap@EXAMPLE.COM` : Principal used by LDAP Authorizer Note that the user principal used for authorization is the local name (for example, `kafka`, `alice`) by default and these short principals are used to determine group membership. Brokers may be configured with custom `principal.builder.class` or `sasl.kerberos.principal.to.local.rules` to override this behavior. The attributes used for mapping users to groups may also be customized to match your LDAP server. 
If you have already started the broker using `SASL/SCRAM-SHA-256` following the instructions above, stop the server first. The instructions below are based on the assumption that you have already updated configuration for brokers, producers, and consumers as described earlier. Configure listeners to use `GSSAPI` by updating the following properties in your broker configuration file (for example, `etc/kafka/server.properties`). ```bash sasl.enabled.mechanisms=GSSAPI sasl.mechanism.inter.broker.protocol=GSSAPI sasl.kerberos.service.name=kafka listener.name.sasl_plaintext.gssapi.sasl.jaas.config= \ com.sun.security.auth.module.Krb5LoginModule required \ keyTab="/tmp/keytabs/kafka.keytab" \ principal="kafka/localhost@EXAMPLE.COM" \ debug="true" \ storeKey="true" \ useKeyTab="true"; ``` Add or update the following properties in your producer and consumer configuration files (for example, `etc/kafka/producer.properties` and `etc/kafka/consumer.properties`). ```bash sasl.mechanism=GSSAPI sasl.kerberos.service.name=kafka sasl.jaas.config= com.sun.security.auth.module.Krb5LoginModule required \ keyTab="/tmp/keytabs/alice.keytab" \ principal="alice@EXAMPLE.COM" \ debug="true" \ storeKey="true" \ useKeyTab="true"; ``` Restart the broker, and run the producer and consumer as described earlier. Producers and consumers are now authenticated using your Kerberos server. Group information is also obtained from the same server using LDAP. ## RBAC benefits RBAC helps you: * Manage security access across Confluent Platform, including Kafka, ksqlDB, Connect, Schema Registry, Confluent Control Center, and Confluent Platform for Apache Flink®, by using granular permissions to control user and group access. For example, with RBAC you can specify permissions for each connector or Flink job in a cluster, making it easier to get multiple workloads up and running. * Manage authorization at scale. Administrators can centrally manage the assignment of predefined roles, and also delegate the responsibility of managing access and permissions to the different departments or business units who are the true owners and most familiar with those resources. * Centrally manage authentication and authorization for multiple clusters, which includes MDS, Kafka clusters, Connect, ksqlDB, Schema Registry clusters, Confluent Platform for Apache Flink applications, and a single Confluent Control Center. ## Connect To configure [Connect RBAC](../../../connect/rbac/connect-rbac-getting-started.md#connect-rbac-getting-started) role bindings using the REST API: 1. Get the MDS token: ```none curl --cacert --key --cert -u : -s https://:8090/security/1.0/authenticate ``` 2. Grant the Security Admin role to a Connect cluster: ```none curl --cacert --key --cert -X POST https://:8090/security/1.0/principals/User:/roles/SecurityAdmin -H "accept: application/json" -H "Authorization: Bearer " -H "Content-Type: application/json" -d '{"clusters":{"kafka-cluster":"","connect-cluster":""}}' ``` 3. Grant the Connect user the ResourceOwner role on the group that Connect nodes use to coordinate across the cluster: ```none curl --cacert --key --cert -X POST https://:8090/security/1.0/principals/User:/roles/ResourceOwner/bindings -H "accept: application/json" -H "Authorization: Bearer " -H "Content-Type: application/json" -d '{"scope":{"clusters":{"kafka-cluster":""}},"resourcePatterns":[{"resourceType":"Group","name":"connect-cluster","patternType":"LITERAL"}]}' ``` 4.
Grant the ResourceOwner role on the configuration storage topic: ```none curl --cacert --key --cert -X POST https://:8090/security/1.0/principals/User:/roles/ResourceOwner/bindings -H "accept: application/json" -H "Authorization: Bearer " -H "Content-Type: application/json" -d '{"scope":{"clusters":{"kafka-cluster":""}},"resourcePatterns":[{"resourceType":"Topic","name":"connect-configs","patternType":"LITERAL"}]}' ``` 5. Grant the ResourceOwner role on the offset storage topic: ```none curl --cacert --key --cert -X POST https://:8090/security/1.0/principals/User:/roles/ResourceOwner/bindings -H "accept: application/json" -H "Authorization: Bearer " -H "Content-Type: application/json" -d '{"scope":{"clusters":{"kafka-cluster":""}},"resourcePatterns":[{"resourceType":"Topic","name":"connect-offsets","patternType":"LITERAL"}]}' ``` 6. Grant the ResourceOwner role on the status storage topic: ```none curl --cacert --key --cert -X POST https://:8090/security/1.0/principals/User:/roles/ResourceOwner/bindings -H "accept: application/json" -H "Authorization: Bearer " -H "Content-Type: application/json" -d '{"scope":{"clusters":{"kafka-cluster":""}},"resourcePatterns":[{"resourceType":"Topic","name":"connect-status","patternType":"LITERAL"}]}' ``` ### Verify the audit log retention setting These procedures only affect your retention policy. It is recommended that you make minor changes only. 1. Use the Confluent CLI to modify the audit log configuration and update the `retention_ms` of one or more destination topics: ```bash # Capture the current configuration from MDS confluent audit-log config describe > /tmp/audit-log-config.json # View what was captured cat /tmp/audit-log-config.json { "destinations": { "bootstrap_servers": [ "logs1.example.com:9092", "logs2.example.com:9092" ], "topics": { "confluent-audit-log-events": { "retention_ms": 7776000000 } } }, "default_topics": { "allowed": "confluent-audit-log-events", "denied": "confluent-audit-log-events" } } # Make a small change vim /tmp/audit-log-config.json # e.g. - change 7776000000 to 7776000001 # Post the change back to MDS confluent audit-log config update < /tmp/audit-log-config.json ``` 2. Verify that the topic’s `retention.ms` setting reflects the new value on the destination cluster: ```bash cat /tmp/destination-cluster-admin-client.properties bootstrap.servers= security.protocol=SASL_SSL sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="" \ password=""; ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN ssl.truststore.location= ssl.truststore.password= kafka-topics --bootstrap-server \ --command-config /tmp/destination-cluster-admin-client.properties \ --describe --topic confluent-audit-log-events Topic: confluent-audit-log-events PartitionCount: 6 ReplicationFactor: 3 Configs: min.insync.replicas=2,cleanup.policy=delete,retention.ms=7776000001 Topic: confluent-audit-log-events Partition: 0 Leader: 2 Replicas: 2,1,0 Isr: 2,0,1 Topic: confluent-audit-log-events Partition: 1 Leader: 1 Replicas: 1,0,2 Isr: 2,0,1 Topic: confluent-audit-log-events Partition: 2 Leader: 0 Replicas: 0,2,1 Isr: 2,0,1 Topic: confluent-audit-log-events Partition: 3 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1 Topic: confluent-audit-log-events Partition: 4 Leader: 1 Replicas: 1,2,0 Isr: 2,0,1 Topic: confluent-audit-log-events Partition: 5 Leader: 0 Replicas: 0,1,2 Isr: 2,0,1 ``` 3.
Alter the `retention.ms` value of one of the destination topics directly on the destination cluster: ```bash kafka-topics --bootstrap-server :9092 \ --command-config /tmp/destination-cluster-admin-client.properties \ --alter --topic confluent-audit-log-events \ --config retention.ms=7776000002 ``` 4. Verify that the audit log configuration shows the new `retention_ms` setting: ```bash confluent audit-log config describe { "destinations": { "bootstrap_servers": [ "logs1.example.com:9092", "logs2.example.com:9092" ], "topics": { "confluent-audit-log-events": { "retention_ms": 7776000002 } } }, "default_topics": { "allowed": "confluent-audit-log-events", "denied": "confluent-audit-log-events" } } ``` If this troubleshooting procedure doesn’t work (for example, if audit logging is not configured properly, you will get an error when you attempt to run the `describe` command), check to ensure that the connection and credentials in your MDS broker properties (the properties prefixed by `confluent.security.event.logger.destination.admin.`) are working. Also verify that you’ve granted sufficient permissions to the admin client principal on the destination cluster. Note that the minimum role binding should grant the ResourceOwner role on topics with the prefix `confluent-audit-log-events` on the destination cluster. Finally, confirm that the destination cluster is reachable and listening for connections from the MDS cluster’s network address. ### Verify the audit log configuration is synchronized to registered clusters Use the following command to verify that the audit log configuration is synchronized to registered clusters: ```none kafka-configs --bootstrap-server :9092 \ --command-config /tmp/managed-cluster-admin-client.properties \ --entity-type brokers \ --entity-default \ --describe \ | grep confluent.security.event.router.config ``` You should see the same JSON audit log configuration you get when you run `confluent audit-log config describe`. It is possible that `retention_ms` values may differ if the audit topics have been altered directly on the destination cluster, in which case the metadata in the JSON may also be different. Everything else should be the same. If this verification fails, check the following for the MDS cluster registry: - The clusters expose an auth token listener (`listener.name..sasl.enabled.mechanisms=OAUTHBEARER`) - The clusters’ TLS keys are verifiable by certificates in the MDS server’s trust store. Also look for error status messages when making an audit log API update request to MDS. ## Identity and access management Controlling who can access your Confluent cluster and what they can do is foundational to security. Confluent Platform offers several built-in features to help you enforce this: - The role-based access control (RBAC) system lets you assign roles like “ClusterAdmin” or “DeveloperRead” to users and service accounts. You can scope permissions to individual clusters, topics, or consumer groups. - For environments not using RBAC, you can use Apache Kafka® Access Control Lists (ACLs) to control producer and consumer access at the topic or group level. ACLs also provide compatibility with existing Kafka security setups. - Supported OAuth 2.0 integrations with identity providers like Okta, Keycloak, and Entra ID allow centralized user management as well as single sign-on (SSO) for Confluent Cloud. - Confluent supports TLS for secure communication and can enforce mutual authentication (mTLS) between clients and brokers.
By issuing client certificates, you can authenticate both ends of every connection. ### Step 2 - Start the producer To start the producer, run the `kafka-avro-console-producer` command for the KMS provider that you want to use, where `` is the bootstrap URL for your Confluent Platform cluster and `` is the URL for your Schema Registry instance. ```shell ./bin/kafka-avro-console-producer --bootstrap-server \ --property schema.registry.url= \ --topic test \ --producer.config config.properties \ --property basic.auth.credentials.source=USER_INFO \ --property basic.auth.user.info=${SR_API_KEY}:${SR_API_SECRET} \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string","confluent:tags":["PII"]}]}' \ --property value.rule.set='{ "domainRules": [ { "name": "encryptPII", "type": "ENCRYPT", "tags":["PII"], "params": { "encrypt.kek.name": "aws-kek1", "encrypt.kms.key.id": "arn:aws:kms:us-east-1:xxxx:key/xxxx", "encrypt.kms.type": "aws-kms" }, "onFailure": "ERROR,NONE"}]}' ``` ### Overview All components and clients in `cp-demo` make full use of Confluent Platform’s extensive [security features](../../security/overview.md#security). - [Role-Based Access Control (RBAC)](../../security/authorization/rbac/overview.md#rbac-overview) for authorization. Give principals access to resources using role-bindings. #### NOTE RBAC is powered by the [Metadata Service (MDS)](../../kafka/configure-mds/index.md#rbac-mds-config) which uses Confluent Server Authorizer to connect to an OpenLDAP directory service. This enables group-based authorization for scalable access management. - [SSL](../../security/authentication/mutual-tls/overview.md#kafka-ssl-authentication) for encryption and mTLS for authentication. The example [automatically generates](https://github.com/confluentinc/cp-demo/tree/latest/scripts/security/certs-create.sh) SSL certificates and creates keystores, truststores, and secures them with a password. - [HTTPS for Control Center](https://docs.confluent.io/platform/current/control-center/installation/configuration.html#https-settings). - [HTTPS for Schema Registry](../../schema-registry/security/index.md#schemaregistry-security). - [HTTPS for Connect](../../connect/security.md#connect-security). You can see each component’s security configuration in the example’s [docker-compose.yml](https://github.com/confluentinc/cp-demo/tree/latest/docker-compose.yml) file. ### Embedded REST Proxy For the next few steps, use the REST Proxy that is embedded on the Kafka brokers. Only [REST Proxy API v3](../../kafka-rest/api.md#rest-proxy-v3) is supported. 1. Create a role binding for the client to be granted `ResourceOwner` role for the topic `dev_users`. Get the Kafka cluster ID: ```none KAFKA_CLUSTER_ID=$(curl -s https://localhost:8091/v1/metadata/id --tlsv1.2 --cacert scripts/security/snakeoil-ca-1.crt | jq -r ".id") ``` Create the role binding: ```text # Create the role binding for the topic ``dev_users`` docker compose exec tools bash -c "confluent iam rbac role-binding create \ --principal User:appSA \ --role ResourceOwner \ --resource Topic:dev_users \ --kafka-cluster-id $KAFKA_CLUSTER_ID" ``` 2. Create the topic `dev_users` with embedded REST Proxy. 
Get the Kafka cluster ID: ```none KAFKA_CLUSTER_ID=$(curl -s https://localhost:8091/v1/metadata/id --tlsv1.2 --cacert scripts/security/snakeoil-ca-1.crt | jq -r ".id") ``` Use `curl` to create the topic: ```text docker exec restproxy curl -s -X POST \ -H "Content-Type: application/json" \ -H "accept: application/json" \ -d "{\"topic_name\":\"dev_users\",\"partitions_count\":64,\"replication_factor\":2,\"configs\":[{\"name\":\"cleanup.policy\",\"value\":\"compact\"},{\"name\":\"compression.type\",\"value\":\"gzip\"}]}" \ --cert /etc/kafka/secrets/mds.certificate.pem \ --key /etc/kafka/secrets/mds.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ -u appSA:appSA \ "https://kafka1:8091/kafka/v3/clusters/$KAFKA_CLUSTER_ID/topics" | jq ``` 3. List topics with embedded REST Proxy to find the newly created `dev_users`. Get the Kafka cluster ID: ```none KAFKA_CLUSTER_ID=$(curl -s https://localhost:8091/v1/metadata/id --tlsv1.2 --cacert scripts/security/snakeoil-ca-1.crt | jq -r ".id") ``` Use `curl` to list the topics: ```text docker exec restproxy curl -s -X GET \ -H "Content-Type: application/json" \ -H "accept: application/json" \ --cert /etc/kafka/secrets/mds.certificate.pem \ --key /etc/kafka/secrets/mds.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ -u appSA:appSA \ https://kafka1:8091/kafka/v3/clusters/$KAFKA_CLUSTER_ID/topics | jq '.data[].topic_name' ``` Your output should resemble the following. Output may vary, depending on other topics you may have created, but at least you should see the topic `dev_users` created in the previous step. ```text "_confluent-monitoring" "dev_users" "users" "wikipedia-activity-monitor-KSTREAM-AGGREGATE-STATE-STORE-0000000003-changelog" "wikipedia-activity-monitor-KSTREAM-AGGREGATE-STATE-STORE-0000000003-repartition" "wikipedia.failed" "wikipedia.parsed" "wikipedia.parsed.count-by-domain" "wikipedia.parsed.replica" ``` ## Use case The use case for this application is a Kafka event streaming application that processes real-time edits to real Wikipedia pages. The following image shows the application topology: ![image](tutorials/cp-demo/images/cp-demo-overview-with-ccloud.svg) The full event streaming platform based on Confluent Platform is described as follows: 1. Wikimedia’s [EventStreams](https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams) publishes a continuous stream of real-time edits happening to real wiki pages. 2. A Kafka source connector [kafka-connect-sse](https://www.confluent.io/hub/cjmatta/kafka-connect-sse) streams the server-sent events (SSE) from [https://stream.wikimedia.org/v2/stream/recentchange](https://stream.wikimedia.org/v2/stream/recentchange), and a custom Connect transform [kafka-connect-json-schema](https://www.confluent.io/hub/jcustenborder/kafka-connect-json-schema) extracts the JSON from these messages, which are then written to a Kafka cluster. 3. Data processing is done with [ksqlDB](../../ksqldb/overview.md#ksql-home) and a [Kafka Streams](../../streams/overview.md#kafka-streams) application (a simplified sketch of such a Streams topology follows below). 4. A Kafka sink connector [kafka-connect-elasticsearch](https://www.confluent.io/hub/confluentinc/kafka-connect-elasticsearch) streams the data out of Kafka, where it is materialized in [Elasticsearch](https://www.elastic.co/products/elasticsearch) for analysis by [Kibana](https://www.elastic.co/products/kibana). All data is in Avro format and uses Confluent Schema Registry, and [Confluent Control Center](https://www.confluent.io/product/control-center/) manages and monitors the deployment.
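As an illustration of step 3 above, the following is a simplified Kafka Streams sketch of an activity monitor that counts edits per domain. It is not the actual cp-demo application: the real demo consumes Avro records and is configured with the security settings shown earlier, whereas this sketch assumes String-serialized values and a hypothetical `extractDomain` helper. Only the `wikipedia.parsed` and `wikipedia.parsed.count-by-domain` topic names come from the demo.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class WikipediaActivityMonitorSketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wikipedia-activity-monitor");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // plus the security settings used by the demo
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Read parsed edit events, re-key each record by its wiki domain, and count edits per domain.
        KStream<String, String> edits = builder.stream("wikipedia.parsed");
        KTable<String, Long> countsByDomain = edits
            .groupBy((key, value) -> extractDomain(value))
            .count();

        countsByDomain.toStream()
            .to("wikipedia.parsed.count-by-domain", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }

    // Hypothetical helper: the real demo records are Avro and carry the domain as a field.
    private static String extractDomain(String value) {
        return value; // placeholder for parsing the domain out of the edit event
    }
}
```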
#### Requirements * [An OIDC-compliant identity provider (IdP)](https://docs.confluent.io/platform/current/security/authentication/sso-for-c3/configure-sso-using-oidc.html#step-1-establish-a-trust-relationship-between-cp-and-identity-provider). * Port 8090 must be opened on the Kafka brokers and accessible by all hosts. * Set up one principal in OIDC for the MDS admin user to bootstrap roles and permissions for the Confluent Platform component principals. It is recommended that you create a user named `superuser`. * Set up one principal per Confluent Platform component in your OIDC server. These users are used by the Confluent Platform components to authenticate to MDS and access their respective resources. In the examples below, the following component users are used: * Confluent Server: `kafka_broker` * Schema Registry: `schema_registry` * Connect: `kafka_connect` * ksqlDB: `ksql` * REST Proxy: `kafka_rest` * Confluent Server REST API: `kafka_broker` * Control Center: `control_center` * Set up Confluent Platform with [OAuth/OIDC authentication](ansible-authenticate.md#ansible-oauth). ## Metrics API The [Confluent Cloud Metrics](../monitoring/metrics-api.md#metrics-api) API provides programmatic access to actionable metrics for your Confluent Cloud deployment, including server-side metrics for the Confluent-managed services. However, the Metrics API does not allow you to get client-side metrics. To retrieve client-side metrics, see [Producers](#ccloud-monitoring-producers) and [Consumers](#ccloud-monitoring-consumers). The Metrics API, enabled by default, aggregates metrics at the topic and cluster level. Any authorized user can gain access to the metrics that allow you to monitor overall usage and performance. To get started with the Metrics API, see the [Confluent Cloud Metrics](../monitoring/metrics-api.md#metrics-api) documentation. You can use the Metrics API to query metrics at the following granularities (other resolutions are available if needed): - Bytes produced per minute grouped by topic - Bytes consumed per minute grouped by topic - Max retained bytes per hour over two hours for a given topic - Max retained bytes per hour over two hours for a given cluster You can retrieve the metrics easily over the internet using HTTPS, capturing them at regular intervals to get a time series and an operational view of cluster performance. You can integrate the metrics into cloud provider monitoring tools like [Azure Monitor](https://azure.microsoft.com/en-us/services/monitor/#product-overview), [Google Cloud’s operations suite](https://cloud.google.com/products/operations) (formerly Stackdriver), or [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/), or into existing monitoring systems like [Prometheus](https://prometheus.io/) and [Datadog](https://www.datadoghq.com/), and then plot them in a time series graph to see usage over time. When writing your own application to use the Metrics API, see the [full API specification](https://api.telemetry.confluent.cloud/docs) to use advanced features. ## Schema Management and Evolution There is an implicit contract that Kafka producers write data with a schema that can be read by Kafka consumers, even as producers and consumers evolve their schemas. Kafka applications depend on these schemas and expect that any changes made to schemas are compatible, so that the applications can continue to run. This is where Confluent Schema Registry helps: It provides centralized schema management and compatibility checks as schemas evolve.
If your application is using Schema Registry, you can simulate a Schema Registry instance in your unit testing. Use Confluent Schema Registry’s [MockSchemaRegistryClient](https://github.com/confluentinc/schema-registry/blob/master/client/src/main/java/io/confluent/kafka/schemaregistry/client/SchemaRegistryClient.java) to register and retrieve schemas that enable you to serialize and deserialize data. For a `MockSchemaRegistryClient` example, see this [Kafka Tutorial](https://developer.confluent.io/tutorials/create-stateful-aggregation-minmax/kstreams.html). As you start building examples of streaming applications and tests using the Kafka Streams API along with Schema Registry, for integration testing, use any of the tools described in [Integration Testing](#ccloud-testing-integration). After your applications are running in production, schemas may evolve but still need to be compatible for all applications that rely on both old and new versions of a schema. Confluent Schema Registry allows for [schema evolution and provides compatibility checks](/cloud/current/sr/fundamentals/schema-evolution.html) to ensure that the contract between producers and consumers is not broken. This allows producers and consumers to update independently as well as evolve their schemas independently, with assurances that they can read new and legacy data. Confluent provides a [Schema Registry Maven plugin](/cloud/current/sr/develop/maven-plugin.html), which checks the compatibility of a new schema against previously registered schemas. For an example of this plugin, see the [Java example client pom.xml](https://github.com/confluentinc/examples/blob/latest/clients/cloud/java/pom.xml). # Manage Kafka Cluster Configuration Settings in Confluent Cloud This topic describes the default Apache Kafka® cluster configuration settings in Confluent Cloud. For a complete description of all Kafka configurations, see [Confluent Platform Configuration Reference](/platform/current/installation/configuration/index.html). Considerations: - You cannot edit cluster settings on Basic, Standard, Enterprise, and Freight clusters in Confluent Cloud, but many configuration settings are available at the topic level instead (see the example after the comparison table below). For more information, see [Manage Topics in Confluent Cloud](../topics/overview.md#cloud-topics-manage). - You can change some configuration settings on Dedicated clusters using the Confluent CLI or REST API. See [Change cluster settings for Dedicated clusters](#custom-settings-dedicated). - The default maximum timeout for registered consumers is different for Confluent Cloud Kafka clusters than for Confluent Platform clusters and cannot be changed. The `group.max.session.timeout.ms` default is 1200000 ms (20 minutes). ## Cluster limit comparison Use the table below to compare cluster limits across cluster types. For Enterprise clusters, the following table shows the current maximum (10 eCKU). If you’re participating in the 32 eCKU Limited Availability for Enterprise clusters, your cluster limits are higher.
| Dimension | [Basic](#basic-cluster) | [Standard](#standard-cluster) | [Enterprise](#enterprise-cluster) | [Dedicated](#dedicated-cluster) | [Freight](#freight-cluster) | |---------------------------------------------------------------------------|---------------------------|---------------------------------|-----------------------------------------------------------------------|-----------------------------------|-------------------------------| | [Maximum eCKU/CKU](#min-max-ecku) | 50 | 10 | 10 (current maximum)/ 32 (Limited Availability) | 152 | 152 | | Ingress (MBps) \* | 250 | 250 | 600 | 9,120 | 9,120 | | Egress (MBps) \* | 750 | 750 | 1800 | 27,360 | 27,360 | | Partitions (pre-replication) \* | 1500 | 2500 | 30,000 | 100,000 | 50,000 | | Number of partitions you can compact \* | 1500 | 2500 | 3,600 | 100,000 | None | | Total client connections \* | 1000 | 10,000 | 180,000 | 2,736,000 | 2,736,000 | | Connection attempts (per second) \* | 80 | 800 | 5,000 | 76,000 | 76,000 | | Requests (per second) \* | 15,000 | 15,000 | 75,000 | 2,280,000 | 2,280,000 | | Message size (MB) | 8 | 8 | 20 | 20 | 20 | | Client version (minimum) | 0.11.0 | 0.11.0 | 0.11.0 | 0.11.0 | 0.11.0 | | Request size (MB) | 100 | 100 | 100 | 100 | 100 | | Fetch bytes (MB) | 55 | 55 | 55 | 55 | 55 | | API keys | 50 | 100 | 500 | 2,000 | 500 | | Partition creation and deletion (per five minute period) | 250 | 500 | 500 | 5,000 | 500 | | Connector tasks per Kafka cluster | 250 \*† | 250 | 250 | 250 | N/A | | ACLs | 1,000 | 1,000 | 4,000 | 10,000 | 10,000 | | Kafka REST Produce v3 - Max throughput (MBps): | 10 | 10 | 10 | 7,600 | 10 | | Kafka REST Produce v3 - Max connection requests (per second): | 25 | 25 | 25 | 45,600 | 25 | | Kafka REST Produce v3 - Max streamed requests (per second): | 1000 | 1000 | 1000 | 456,000 | 1000 | | Kafka REST Produce v3 - Max message size for Kafka REST Produce API (MB): | 8 | 8 | 8 | 20 | 8 | | Kafka REST Admin v3 - Max connection requests (per second): | 25 | 25 | 25 | 45,600 | 25 | \* Limit based on Elastic Confluent Unit for Kafka (eCKU). You only pay for the capacity you use up to the limit. For more information, see [Elastic Confluent Unit for Kafka](../billing/overview.md#e-cku-definition). † Limit based on a Dedicated Kafka cluster with 152 CKU. For more information, see [CKU purchase limits](#cku-limits-per-cluster) and [Confluent Unit for Kafka](../billing/overview.md#cku-definition). \*† Basic clusters are limited to one task per connector. You can deploy 250 connectors to a Basic cluster but each connector can only have one task. If you need more than one task, upgrade your cluster. The capabilities provided in this topic are for planning purposes, and are not a guarantee of performance, which varies depending on each unique configuration. 
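As noted in the considerations above, even on cluster types where cluster-level settings cannot be edited, you can still inspect a topic’s effective topic-level settings with a standard Kafka `AdminClient`. The following is an illustrative sketch only, not taken from the Confluent documentation: the bootstrap endpoint, the API key and secret placeholders, and the `orders` topic name are assumptions you would replace with your own values.

```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class DescribeTopicConfigsSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder Confluent Cloud connection settings; a cluster API key and secret are used as SASL/PLAIN credentials.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "pkc-xxxxx.us-east-1.aws.confluent.cloud:9092");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"<api-key>\" password=\"<api-secret>\";");

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "orders");
            Config config = admin.describeConfigs(Collections.singleton(topic)).all().get().get(topic);
            // Print effective topic-level settings such as retention.ms, cleanup.policy, and max.message.bytes.
            config.entries().forEach(entry -> System.out.println(entry.name() + " = " + entry.value()));
        }
    }
}
```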
### Resources you can manage in code - [API keys](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/apikey) - [Connectors](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/connector) - [Confluent Cloud Environments](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/environment/) - [Kafka ACLs](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/kafkaacl/) - [Kafka clusters](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/kafkacluster/) - [Kafka topics](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/kafkatopic/) - [Networks](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/network/) - [Peering networks](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/peering/) - [Private Link Access](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/privatelinkaccess/) - [Role bindings](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/rolebinding/) - [Service accounts](https://www.pulumi.com/registry/packages/confluentcloud/api-docs/serviceaccount/) [Get started with Pulumi](https://www.pulumi.com/docs/get-started/) and install the [Confluent Cloud provider for Pulumi](https://www.pulumi.com/registry/packages/confluentcloud/). ### Configuration 1. Add the following details: - Select the output record value format (data going to the Kafka topic): AVRO, JSON, or JSON_SR (JSON Schema). [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro or JSON Schema). For additional information, see [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits). - **Amazon CloudWatch Logs Endpoint URL**: The URL to use as the endpoint for connecting to Amazon CloudWatch for Logs. For example, `https://logs.us-east-1.amazonaws.com`. - **Amazon CloudWatch Logs Group Name**: The name of the log group on Amazon CloudWatch under which the desired log streams are contained. ### **Show advanced configurations** - **Schema context**: Select a schema context to use for this connector, if using a schema-based data format. This property defaults to the **Default** context, which configures the connector to use the default schema set up for Schema Registry in your Confluent Cloud environment. A schema context allows you to use separate schemas (like schema sub-registries) tied to topics in different Kafka clusters that share the same Schema Registry environment. For example, if you select a non-default context, a **Source** connector registers schemas only in that schema context, and a **Sink** connector reads only from that schema context. For more information about setting up a schema context, see [What are schema contexts and when should you use them?](../sr/faqs-cc.md#faq-schema-contexts). - **CloudWatch Log Stream Name(s)**: List of the log streams on Amazon CloudWatch where you want to track log records. If the field is left empty, all log streams under the log group are tracked. - **AWS Poll Interval in Milliseconds**: Time in milliseconds (ms) the connector waits between polling the endpoint for updates. The default value is `1000` ms (1 second). **Auto-restart policy** - **Enable Connector Auto-restart**: Control the auto-restart behavior of the connector and its task in the event of user-actionable errors. Defaults to `true`, enabling the connector to automatically restart in case of user-actionable errors.
Set this property to `false` to disable auto-restart for failed connectors. In such cases, you would need to manually restart the connector. **Transforms** - **Single Message Transforms**: To add a new SMT, see [Add transforms](single-message-transforms.md#cc-single-message-transforms-ui). For more information about unsupported SMTs, see [Unsupported transformations](single-message-transforms.md#cc-single-message-transforms-unsupported-transforms). For all property values and definitions, see [Configuration Properties](#cc-amazon-cloudwatch-logs-source-config-properties). 2. Click **Continue**. ## Features The Amazon SQS Source connector provides the following features: * **Topics created automatically**: The connector can automatically create Kafka topics. * **At least once delivery**: The connector guarantees that records are delivered at least once to the Kafka topic. * **Supports multiple tasks**: The connector supports running one or more tasks. More tasks may improve performance. * **Automatic retries**: The connector will retry all requests (that can be retried) when the Amazon SQS service is unavailable. This value defaults to three retries. * **Supported data formats**: The connector supports Avro, JSON Schema (JSON-SR), Protobuf, and JSON (schemaless) output formats. [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) must be enabled to use a Schema Registry-based format (for example, Avro, JSON Schema, or Protobuf). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information. * **Provider integration support**: The connector supports IAM role-based authorization using Confluent Provider Integration. For more information about provider integration setup, see the [IAM roles authentication](#cc-amazon-sqs-source-setup-connection). For more information and examples to use with the Confluent Cloud API for Connect, see the [Confluent Cloud API for Connect Usage Examples](connect-api-section.md#ccloud-connect-api) section. # Stream Processing with Confluent Cloud for Apache Flink Apache Flink® is a powerful, scalable stream processing framework for running complex, stateful, low-latency streaming applications on large volumes of data. Flink excels at complex, high-performance, mission-critical streaming workloads and is used by many companies for production stream processing applications. Flink is the de facto industry standard for stream processing. Confluent Cloud for Apache Flink provides a cloud-native, serverless service for Flink that enables simple, scalable, and secure stream processing that integrates seamlessly with Apache Kafka®. Your Kafka topics appear automatically as queryable Flink tables, with schemas and metadata attached by Confluent Cloud. Confluent Cloud for Apache Flink supports creating stream-processing applications by using Flink SQL, the [Flink Table API](reference/table-api.md#flink-table-api) (Java and Python), and custom [user-defined functions](concepts/user-defined-functions.md#flink-sql-udfs). To run Flink on-premises with Confluent Platform, see [Confluent Platform for Apache Flink](/platform/current/flink/overview.html). 
- [What is Confluent Cloud for Apache Flink?](#ccloud-flink-overview-what-is-flink) - [Cloud native](#ccloud-flink-overview-cloud-native) - [Complete](#ccloud-flink-overview-complete) - [Everywhere](#ccloud-flink-overview-everywhere) - [Program Flink with SQL, Java, and Python](#ccloud-flink-overview-program-flink) - [Confluent for VS Code](#ccloud-flink-overview-vs-code) # Manage Topics in Confluent Cloud An Apache Kafka® [topic](../_glossary.md#term-topic) is a category or feed that stores messages. [Producers](../_glossary.md#term-producer) send messages and write data to topics, and [consumers](../_glossary.md#term-consumer) read messages from topics. Topics are grouped by cluster within [environments](../security/access-control/hierarchy/cloud-environments.md#cloud-environments). You can apply [schemas](../_glossary.md#term-schema) to topics. This page provides steps to create, edit, and delete Kafka topics in Confluent Cloud using the Cloud Console or the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/command-reference/kafka/topic/index.html). You can also list, create, or delete topics with [REST APIs](https://docs.confluent.io/cloud/current/api.html#tag/Topic-(v3)). If you have more than 1000 topics, Cloud Console may not display metrics for all the topics. For complete monitoring of all topics, use the Metrics API. For more information, see [Confluent Cloud Metrics](../monitoring/metrics-api.md#metrics-api). # Configure a service account The Unified Stream Manager (USM) Agent requires a service account to securely authenticate with Confluent Cloud and collect metadata from your Confluent Platform cluster. The service account must have both the `USMAgent` and `DataSteward` roles. Additionally, it requires the following API keys: * An API key with `Schema Registry` scope. * An API key with `Cloud resource management` scope. Use a separate service account for each logically separate Confluent Platform environment that connects to Confluent Cloud through USM. For example, if you have development and production environments, use a separate service account for each. You have two options for configuration: creating a new service account dedicated to this purpose or using an existing service account. * When you create a new account in the wizard, the `USMAgent` and `DataSteward` roles and the API keys with `Schema Registry` and `Cloud resource management` scopes are assigned automatically with the necessary permissions. * If you choose to use an existing account, you must manually verify that it has the `USMAgent` and `DataSteward` roles assigned. If these roles are not assigned, the registration fails. To add role bindings to a principal, see [Add role bindings to a principal](https://docs.confluent.io/cloud/current/security/access-control/rbac/manage-role-bindings.html#add-role-bindings-to-a-principal). Also, verify that the service account has the required `Schema Registry` and `Cloud resource management` API keys for the USM Agent to use. For details, see [Add an API key](../../security/authenticate/workload-identities/service-accounts/api-keys/manage-api-keys.md#create-api-key). ### Confluent Platform versions - For the compatible Confluent Platform versions for this version of the Confluent CLI, see the [compatibility table](https://docs.confluent.io/platform/current/installation/versions-interoperability.html#confluent-cli). - The Confluent CLI for Confluent Platform requires that you have the [Confluent REST Proxy server for Apache Kafka](/platform/current/kafka-rest/index.html) running.
The Confluent REST Proxy server uses the [APIs](/platform/current/kafka-rest/api.html) to mediate between the Confluent CLI and your clusters. This is not required for the Apache Kafka® tools or “scripts” that come with Kafka and ship with Confluent Platform. These alternatives to the Confluent CLI for Confluent Platform do not require the Confluent REST Proxy service to be running. Therefore, the Confluent Platform tutorials in the documentation sometimes feature Kafka scripts rather than the Confluent CLI commands in order to simplify setup for getting started tasks. For example, the basic Cluster Linking tutorial for Confluent Platform that describes how to [Share data across topics](/platform/current/multi-dc-deployments/cluster-linking/topic-data-sharing.html) uses the Kafka scripts throughout (such as `kafka-cluster-links --list` to [list mirror topics](/platform/current/multi-dc-deployments/cluster-linking/topic-data-sharing.html#list-mirror-topics) rather than [confluent kafka topic list](/platform/current/command-reference/kafka/topic/confluent_kafka_topic_list.html) or [confluent kafka link list](/platform/current/command-reference/kafka/link/confluent_kafka_link_list.html)). In such scenarios, running the Confluent CLI commands would fail if you did not have the REST Proxy server running. (This is also not an issue for the Confluent CLI on Confluent Cloud, which is fully managed and integrates with the [Confluent Cloud APIs](https://docs.confluent.io/cloud/current/api.html) under the hood.) ### Prerequisites - [Confluent Platform](/platform/current/installation/installing_cp/index.html) is installed and services are running by using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) commands. - Kafka and Schema Registry are running locally on the default ports. Note that this quick start assumes that you are using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) commands; however, [standalone installations](/platform/current/installation/installing_cp/index.html) are also supported. By default, ZooKeeper, Kafka, Schema Registry, Kafka Connect REST API, and Kafka Connect are started with the `confluent local start` command. Note that as of Confluent Platform 7.5, ZooKeeper is deprecated for new deployments. Confluent recommends KRaft mode for new deployments. ## Prerequisites - [Confluent Platform](/platform/current/installation/installing_cp/index.html) is installed and services are running by using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) commands. This quick start assumes that you are using the Confluent CLI. By default, ZooKeeper, Kafka, Schema Registry, Kafka Connect REST API, and Kafka Connect are started with the `confluent local start` command. For more information, see [Confluent Platform](/platform/current/installation/installing_cp/index.html). Note that as of Confluent Platform 7.5, ZooKeeper is deprecated for new deployments. Confluent recommends KRaft mode for new deployments. - [Kudu](https://kudu.apache.org/releases/) and [Impala](https://impala.apache.org/downloads.html) are installed and configured properly ([Using Kudu with Impala](https://kudu.apache.org/docs/kudu_impala_integration.html)). For DECIMAL type support, you need at least Kudu 1.7.0 and Impala 3.0. - Verify that the [Impala JDBC driver](https://www.cloudera.com/downloads/connectors/impala/jdbc/2-6-15.html) is available on the Kafka Connect process’s `CLASSPATH`.
- Kafka and Schema Registry are running locally on the default ports. ### Prerequisites - [Confluent Platform](/platform/current/installation/installing_cp/index.html) is installed and services are running by using the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) commands. This quick start assumes that you are using the Confluent CLI. By default, ZooKeeper, Kafka, Schema Registry, Kafka Connect REST API, and Kafka Connect are started with the `confluent local start` command. For more information, see [Confluent Platform](/platform/current/installation/installing_cp/index.html). Note that as of Confluent Platform 7.5, ZooKeeper is deprecated for new deployments. Confluent recommends KRaft mode for new deployments. - [Kudu](https://kudu.apache.org/releases/) and [Impala](https://impala.apache.org/downloads.html) are installed and configured properly ([Using Kudu with Impala](https://kudu.apache.org/docs/kudu_impala_integration.html)). For DECIMAL type support, you need at least Kudu 1.7.0 and Impala 3.0. - Verify that the [Impala JDBC driver](https://www.cloudera.com/downloads/connectors/impala/jdbc/2-6-15.html) is available on the Kafka Connect process’s `CLASSPATH`. - Kafka and Schema Registry are running locally on the default ports. #### IMPORTANT Starting with Confluent Platform version 8.0, ZooKeeper is no longer part of Confluent Platform. * A multi-region cluster deployment across three Kubernetes regions, where each cluster hosts CFK, Kafka brokers, and ZooKeeper servers: ![image](images/co-mrc-3clusters.png) * A multi-region cluster deployment across three Kubernetes regions, where each cluster hosts CFK and ZooKeeper servers, and two of the clusters host Kafka brokers: ![image](images/co-mrc-2.5clusters.png) You can set up an MRC with the following communication methods among ZooKeeper, Kafka, Connect, and Schema Registry deployed across regions: * Use internal listeners among ZooKeeper, Kafka, Connect, and Schema Registry across regions. You set up DNS resolution that allows each region in the MRC configuration to resolve internal pods in other regions. Internal listeners are used among the MRC components (ZooKeeper, Kafka, Connect, and Schema Registry). * Use external access among ZooKeeper, Kafka, Connect, and Schema Registry across regions. Without the required networking configuration, CFK redirects internal communication among the MRC components (ZooKeeper, Kafka, Connect, and Schema Registry) to use endpoints that can be accessed externally by each region. For MRC, if there are other components that depend on Kafka, you need to configure an external listener for Kafka. If you want to reduce the number of load balancers, you can use an alternative method for external access, such as Ingress. The supported security features work in multi-region cluster deployments. For specific configurations, see [Configure Security for Confluent Platform with Confluent for Kubernetes](co-security-overview.md#co-security-overview). ## Prerequisites - Confluent for VS Code: Follow the steps in [Installation](overview.md#vscode-installation). - A running Kafka cluster. With Confluent for VS Code, you can connect to any Kafka API-compatible cluster and any Confluent Schema Registry API-compatible server. ## Security considerations If you are running Self-Balancing with security configured, you must [configure authentication for REST endpoints on the brokers](configuration-options.md#sbc-rest-endpoint-configs-secure-setup).
Without these configurations in the broker properties files, Control Center will not have access to Self-Balancing in a secure environment. If you are using role-based access control (RBAC), the user interacting with Self-Balancing on Control Center must have the [RBAC role](../../security/authorization/rbac/rbac-predefined-roles.md#rbac-predefined-roles) `SystemAdmin` on the Kafka cluster to be able to add or remove brokers, and to perform other Self-Balancing related tasks. For information on setting up security on Confluent Platform, see the sections on [authentication methods](../../security/authentication/overview.md#authentication-overview), [role-based access control](../../security/authorization/rbac/overview.md#rbac-overview), [RBAC and ACLs](../../security/authorization/rbac/overview.md#rbac-and-acls), and [Enable Security for a KRaft-Based Cluster in Confluent Platform](../../security/security_tutorial.md#security-tutorial). For information about setting up security on Confluent Platform, see [Security Overview](../../security/overview.md#security), [Enable Security for a KRaft-Based Cluster in Confluent Platform](../../security/security_tutorial.md#security-tutorial), and the overviews on [authentication methods](../../security/authentication/overview.md#authentication-overview). Also, the [Scripted Confluent Platform Demo](../../tutorials/cp-demo/index.md#cp-demo) shows various types of security enabled on an example deployment. ### Self-Balancing options do not show up on Control Center When Self-Balancing Clusters are enabled, status and configuration options are available on the Control Center **Cluster Settings** > **Self-balancing** tab. If, instead, this tab displays a message about Confluent Platform version requirements and configuring HTTP servers on the brokers, this indicates that something is missing from your configurations or that you are not running the required version of Confluent Platform. Also, if you are running Self-Balancing with security enabled, you may get an error message such as **Error 504 Gateway Timeout**, which indicates that you also must configure authentication for REST endpoints in your broker files, as described below. **Solution:** Verify that you have the following settings and update your configuration as needed. - In the Kafka broker files, [confluent.balancer.enable](configuration-options.md#sbc-config-enable) must be set to `true` to enable Self-Balancing. - In the Control Center properties file, `confluent.controlcenter.streams.cprest.url` must specify the associated URL for each broker in the cluster as REST endpoints for `controlcenter.cluster`, as described in [Required Configurations for Control Center](configuration-options.md#sbc-configs-c3). - Security is not a requirement for Self-Balancing, but if security is enabled, you must also [configure authentication for REST endpoints on the brokers](configuration-options.md#sbc-rest-endpoint-configs-secure-setup). In this case, you would use `confluent.metadata.server.listeners` (which enables the [Metadata Service](../../kafka/configure-mds/index.md#rbac-mds-config)) instead of `confluent.http.server.listeners` to listen for API requests. To learn more, see [Security considerations](#sbc-security-considerations). ## Architecture Kafka Connect has three major models in its design: * **Connector model**: A connector is defined by specifying a `Connector` class and configuration options to control what data is copied and how to format it.
Each `Connector` instance is responsible for defining and updating a set of `Tasks` that actually copy the data. Kafka Connect manages the `Tasks`; the `Connector` is only responsible for generating the set of `Tasks` and indicating to the framework when they need to be updated. `Source` and `Sink` `Connectors`/`Tasks` are distinguished in the API to ensure the simplest possible API for both. * **Worker model**: A Kafka Connect cluster consists of a set of `Worker` processes that are containers that execute `Connectors` and `Tasks`. `Workers` automatically coordinate with each other to distribute work and provide scalability and fault tolerance. The `Workers` will distribute work among any available processes, but are not responsible for management of the processes; any process management strategy can be used for `Workers` (for example, cluster management tools like YARN or Mesos, configuration management tools like Chef or Puppet, or direct management of process lifecycles). * **Data model**: Connectors copy streams of messages from a partitioned input stream to a partitioned output stream, where at least one of the input or output is *always* Kafka. Each of these streams is an ordered set of messages where each message has an associated offset. The format and semantics of these offsets are defined by the Connector to support integration with a wide variety of systems; however, achieving certain delivery semantics in the face of faults requires that offsets be unique within a stream and that streams can seek to arbitrary offsets. The message contents are represented by `Connectors` in a serialization-agnostic format, and Kafka Connect supports pluggable `Converters` for storing this data in a variety of serialization formats. Schemas are built-in, allowing important metadata about the format of messages to be propagated through complex data pipelines. However, schema-free data can also be used when a schema is simply unavailable. The connector model addresses three key user requirements. First, Kafka Connect performs **broad copying by default** by having users define jobs at the level of `Connectors`, which then break the job into smaller `Tasks`. This two-level scheme strongly encourages connectors to use configurations that copy broad swaths of data, since they should have enough inputs to break the job into smaller tasks. It also provides one point of **parallelism** by requiring `Connectors` to immediately consider how their job can be broken down into subtasks, and select an appropriate granularity to do so. Finally, by specializing source and sink interfaces, Kafka Connect provides an **accessible connector API** that makes it very easy to implement connectors for a variety of systems. The worker model allows Kafka Connect to **scale to the application**. It can run scaled down to a single worker process that also acts as its own coordinator, or in clustered mode where connectors and tasks are dynamically scheduled on workers. However, it assumes very little about the *process management* of the workers, so it can easily run on a variety of cluster managers or using traditional service supervision. This architecture allows scaling up and down, but Kafka Connect’s implementation also adds utilities to support both modes well. The REST interface for managing and monitoring jobs makes it easy to run Kafka Connect as an organization-wide service that runs jobs for many users.
Command line utilities specialized for ad hoc jobs make it easy to get up and running in a development environment, for testing, or in production environments where an agent-based approach is required. The data model addresses the remaining requirements. Many of the benefits come from coupling tightly with Kafka. Kafka serves as a natural buffer for both **streaming and batch** systems, removing much of the burden of managing data and ensuring delivery from connector developers. Additionally, by always requiring Kafka as one of the endpoints, the larger data pipeline can leverage the many tools that integrate well with Kafka. This allows Kafka Connect to **focus only on copying data** because a variety of stream processing tools are available to further process the data, which keeps Kafka Connect simple, both conceptually and in its implementation. This differs greatly from other systems where ETL must occur before hitting a sink. In contrast, Kafka Connect can bookend an ETL process, leaving any transformation to tools specifically designed for that purpose. Finally, Kafka includes partitions in its core abstraction, providing another point of **parallelism**. # Kafka Connectors Self-Managed Connectors for Confluent Platform You can use self-managed Apache Kafka® connectors to move data in and out of Kafka. The self-managed connectors are for use with Confluent Platform. For more information on fully-managed connectors, see [Confluent Cloud](https://docs.confluent.io/cloud/current/connectors/index.html). Popular connectors [![image](connect/images/logo/jdbc.png)](https://docs.confluent.io/kafka-connectors/jdbc/current/) **JDBC Source and Sink** The Kafka Connect JDBC Source connector imports data from any relational database with a JDBC driver into a Kafka topic. The Kafka Connect JDBC Sink connector exports data from Kafka topics to any relational database with a JDBC driver. [![image](connect/images/logo/jms.jpg)](https://docs.confluent.io/kafka-connectors/jms-source/current/overview.html) **JMS Source** The Kafka Connect JMS Source connector is used to move messages from any JMS-compliant broker into Kafka. [![image](connect/images/logo/connect-logo.svg)](https://docs.confluent.io/kafka-connectors/elasticsearch/current/overview.html) **Elasticsearch Service Sink** The Kafka Connect Elasticsearch Service Sink connector moves data from Kafka to Elasticsearch. It writes data from a topic in Kafka to an index in Elasticsearch. [![image](connect/images/logo/s3.png)](https://docs.confluent.io/kafka-connectors/s3-sink/current/overview.html) **Amazon S3 Sink** The Kafka Connect Amazon S3 Sink connector exports data from Kafka topics to S3 objects in either Avro, JSON, or Bytes formats. [![image](connect/images/logo/hdfs.png)](https://docs.confluent.io/kafka-connectors/hdfs/current/overview.html) **HDFS 2 Sink** The Kafka Connect HDFS 2 Sink connector allows you to export data from Apache Kafka topics to HDFS 2.x files in a variety of formats. The connector integrates with Hive to make data immediately available for querying with HiveQL. [![image](connect/images/logo/replicator.png)](https://docs.confluent.io/platform/current/multi-dc-deployments/replicator/) **Replicator** Replicator allows you to easily and reliably replicate topics from one Kafka cluster to another. Managing connectors Supported connectors Confluent supports many self-managed connectors that import and export data from some of the most commonly used data systems. Practically all connectors are available from Confluent Hub.
Preview connectors Confluent introduces preview connectors to gain early feedback from users. Preview connectors are only suitable for evaluation and non-production purposes. Installing connectors Install the connectors by using the Confluent Hub client (recommended) or install them manually by downloading the plugin file. Configuring connectors Connector configurations are key-value mappings. In distributed mode, they are included in the JSON payload sent over the REST API request that creates (or modifies) the connector. Licensing connectors With a Developer License, you can use Confluent Platform commercial connectors on an unlimited basis in Connect clusters that use a single-broker Apache Kafka cluster. A 30-day trial period is available when using a multi-broker cluster. Monitoring connectors You can manage and monitor Connect, connectors, and clients using JMX and the REST interface. Adding connectors or software The Kafka Connect Base image contains Kafka Connect and all of its dependencies. To add new connectors to this image, you need to build a new Docker image that has the new connectors installed. Upgrading a connector plugin Upgrading a connector is similar to upgrading any other Apache Kafka client application. Refer to the documentation for individual connector plugins if you have a need for rolling upgrades. Manually installing Community connectors If a connector is not available on Confluent Hub, you can use the JARs to directly install the connectors into your Apache Kafka installation. Kafka Connect Kafka Connect, an open source component of Apache Kafka, is a framework for connecting Kafka with external systems such as databases, key-value stores, search indexes, and file systems. All connectors ### Configure CSFLE without sharing KEK If you do not want to share your Key Encryption Key (KEK) with Confluent, follow the steps below: - Define the [schema for the topic](https://docs.confluent.io/cloud/current/sr/schemas-manage.html#cloud-schema-create) and add [tags](https://docs.confluent.io/cloud/current/sr/schemas-manage.html#cloud-schema-tagging) to the fields in the schema that you want to encrypt. - Create [encryption keys](https://docs.confluent.io/cloud/current/security/encrypt/csfle/manage-csfle.html#add-encryption-key-csfle) for each KMS. - Add [encryption rules](https://docs.confluent.io/cloud/current/security/encrypt/csfle/manage-csfle.html#add-encryption-rule-csfle) that specify the encryption key you want to use to encrypt the tags. - Grant DeveloperWrite permission for the encryption key and DeveloperRead permission for the Schema Registry API keys. - Add the following parameters in the connector configuration: ### AWS CSFLE Rule Executor For AWS, pass the following configuration parameters: | Parameter | Description | |------------------------------------------------------|--------------------------------| | `rule.executors._default_.param.access.key.id=?` | The AWS access key identifier. | | `rule.executors._default_.param.secret.access.key=?` | The AWS secret access key. | ### Azure CSFLE Rule Executor For Azure, pass the following configuration parameters: | Parameter | Description | |------------------------------------------------|------------------------------| | `rule.executors._default_.param.tenant.id` | The Azure tenant identifier. | | `rule.executors._default_.param.client.id` | The Azure client identifier. | | `rule.executors._default_.param.client.secret` | The Azure client secret. |
### Google Cloud CSFLE Rule Executor For Google Cloud, pass the following configuration parameters: | Parameter | Description | |-------------------------------------------------|--------------------------------------------------------| | `rule.executors._default_.param.account.type` | The Google Cloud account type. | | `rule.executors._default_.param.client.id` | The Google Cloud client identifier. | | `rule.executors._default_.param.client.email` | The Google Cloud client email address. | | `rule.executors._default_.param.private.key.id` | The Google Cloud private key identifier. | | `rule.executors._default_.param.private.key` | The Google Cloud private key. | ### HashiCorp Vault CSFLE Rule Executor For HashiCorp Vault, pass the following configuration parameters: | Parameter | Description | |--------------------------------------------|----------------------------------------------------------| | `rule.executors._default_.param.token.id` | The token identifier for HashiCorp Vault. | | `rule.executors._default_.param.namespace` | The namespace for HashiCorp Vault Enterprise (optional). | For more information, see [CSFLE without sharing access to your Key Encryption Keys (KEKs)](https://docs.confluent.io/cloud/current/security/encrypt/csfle/overview.html#csfle-with-shared-confluent-access-to-kek). ## Prerequisites - You must [download](https://www.confluent.io/download/#confluent-platform) self-managed Confluent Platform for your environment. - If your environment already includes, or will include, Active Directory (LDAP service), it must be configured as well. The configurations on this page are based on Microsoft Active Directory (AD). You must update these configurations to match your LDAP service. Nested LDAP groups are not supported. - Brokers running MDS must be configured with a separate listener for inter-broker communication. If required, you can configure the broker users as `super.users`, but they cannot rely on access to resources using role-based or group-based access. The broker user must be configured as a super user or granted access using [ACLs](../../security/authorization/acls/overview.md#kafka-authorization). - Brokers will accept requests on the inter-broker listener port before the metadata for RBAC authorization has been initialized. However, requests on other ports are only accepted after the required metadata has been initialized, including any available LDAP metadata. Broker initialization only completes after all relevant metadata has been obtained and cached. When starting multiple brokers in an MDS cluster with a replication factor of 3 (default) for a metadata topic, at least three brokers must be started simultaneously to enable initialization to complete on the brokers. Note that there is a timeout/retry limitation for this initialization, which you can specify in `confluent.authorizer.init.timeout.ms`. For details, refer to [Configure Confluent Server Authorizer in Confluent Platform](../../security/csa-introduction.md#confluent-server-authorizer). - REST Proxy services that integrate with AD/LDAP using MDS will use the user login name as the user principal for authorization decisions. By default, this is also the principal used by brokers for users authenticating using SASL/GSSAPI (Kerberos).
If your broker configuration overrides `principal.builder.class` or `sasl.kerberos.principal.to.local.rules` to create a different principal, the user principal used by brokers may be different from the principal used by other Confluent Platform components. In this case, you should configure ACLs and role bindings for your customized principal for broker resources. ## Configuration Options for TLS Encryption between REST Proxy and Apache Kafka Brokers Note that all the TLS configurations (for REST Proxy to Broker communication) are prefixed with `client.`. If you want the configuration to apply just to admins, consumers, or producers, you can replace the prefix with `admin.`, `consumer.`, or `producer.`, respectively. In addition to these configurations, make sure the `bootstrap.servers` configuration is set with SSL://host:port end-points, or you’ll accidentally open a TLS connection to a non-TLS port. Keep in mind that authenticated and encrypted connections to Kafka brokers will only work when Kafka is running with appropriate security configuration. For details, see [Kafka Security](../../../security/overview.md#security). `client.security.protocol` : Protocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL. * Type: string * Default: PLAINTEXT * Importance: high `client.ssl.key.password` : The password of the private key in the key store file. This is optional for client. * Type: password * Default: null * Importance: high `client.ssl.keystore.location` : The location of the key store file. This is optional for client and can be used for two-way client authentication. * Type: string * Default: null * Importance: high `client.ssl.keystore.password` : The store password for the key store file. This is optional for client and only needed if ssl.keystore.location is configured. * Type: password * Default: null * Importance: high `client.ssl.truststore.location` : The location of the trust store file. * Type: string * Default: null * Importance: high `client.ssl.truststore.password` : The password for the trust store file. * Type: string * Default: null * Importance: high `client.ssl.enabled.protocols` : The comma-separated list of protocols enabled for TLS connections. The default value is `TLSv1.2,TLSv1.3` when running with Java 11 or later, `TLSv1.2` otherwise. With the default value for Java 11 (`TLSv1.2,TLSv1.3`), Kafka clients and brokers prefer TLSv1.3 if both support it, and fall back to TLSv1.2 otherwise (assuming both support at least TLSv1.2). * Type: list * Default: `TLSv1.2,TLSv1.3` * Importance: medium `client.ssl.keystore.type` : The file format of the key store file. This is optional for client. * Type: string * Default: JKS * Importance: medium `client.ssl.protocol` : The TLS protocol used to generate the SSLContext. The default is `TLSv1.3` when running with Java 11 or newer, `TLSv1.2` otherwise. This value should be fine for most use cases. Allowed values in recent JVMs are `TLSv1.2` and `TLSv1.3`. `TLS`, `TLSv1.1`, `SSL`, `SSLv2` and `SSLv3` might be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities. With the default value for this configuration and `ssl.enabled.protocols`, clients downgrade to `TLSv1.2` if the server does not support `TLSv1.3`. If this configuration is set to `TLSv1.2`, clients do not use `TLSv1.3`, even if it is one of the values in `ssl.enabled.protocols` and the server only supports `TLSv1.3`.
* Type: string * Default: `TLSv1.3` * Importance: medium `client.ssl.provider` : The name of the security provider used for TLS connections. Default value is the default security provider of the JVM. * Type: string * Default: null * Importance: medium `client.ssl.truststore.type` : The file format of the trust store file. * Type: string * Default: JKS * Importance: medium `client.ssl.cipher.suites` : A list of cipher suites. This is a named combination of authentication, encryption, MAC, and key exchange algorithms used to negotiate the security settings for a network connection using the TLS network protocol. By default, all the available cipher suites are supported. * Type: list * Default: null * Importance: low `client.ssl.endpoint.identification.algorithm` : The endpoint identification algorithm to validate the server hostname using the server certificate. * Type: string * Default: null * Importance: low `client.ssl.keymanager.algorithm` : The algorithm used by key manager factory for TLS connections. Default value is the key manager factory algorithm configured for the Java Virtual Machine. * Type: string * Default: SunX509 * Importance: low `client.ssl.secure.random.implementation` : The SecureRandom PRNG implementation to use for TLS cryptography operations. * Type: string * Default: null * Importance: low `client.ssl.trustmanager.algorithm` : The algorithm used by trust manager factory for TLS connections. Default value is the trust manager factory algorithm configured for the Java Virtual Machine. * Type: string * Default: PKIX * Importance: low ### RBAC REST Proxy workflow Here is a summary of the RBAC REST Proxy security workflow: 1. A user makes a REST API call to REST Proxy using LDAP credentials for HTTP Basic Authentication. 2. REST Proxy authenticates the user with the MDS by acquiring a token for the authenticated user. 3. The generated token is used to impersonate the user request and authenticate between Kafka clients and the Kafka cluster. For Kafka clients, the `SASL_PLAINTEXT`/`SASL_SSL` security protocol is used and the proprietary callback handler passes the token to the Kafka cluster. Similarly, when communicating with Schema Registry, the authentication token is passed to the Schema Registry client using a proprietary implementation of the `BearerAuthCredentialProvider` interface. 4. If the user does not have the requisite role or ACL permission for the requested resource (for example, topic, group, or cluster), then the REST API call fails and returns an error with the HTTP 403 status code. ![image](images/rbac-rest-proxy-security.png) ## Securing interactive deployments Securing the interactive ksqlDB installation involves securing the HTTP endpoints that the ksqlDB server is listening on. As well as accepting connections and requests from clients, a multi-node ksqlDB cluster also requires inter-node communication. You can choose to configure the external client and internal inter-node communication separately or over a single listener: - [Securing single listener setup](#ksqldb-installation-security-securing-single-listener): Ideal for single-node installations, or where the inter-node communication is over the same network interface as client communication (a minimal configuration sketch follows this list). - [Securing dual listener setup](#ksqldb-installation-security-securing-dual-listener): Useful where inter-node communication is over a different network interface or requires different authentication or encryption configuration.
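For reference only, the following is a minimal sketch of a single-listener, TLS-encrypted setup in the ksqlDB server properties file. The keystore and truststore locations and the passwords are placeholders, and you should confirm the exact property names against the ksqlDB security documentation for your version before using them.

```properties
# Single HTTPS listener shared by external clients and inter-node requests
listeners=https://0.0.0.0:8088

# Server key material presented on the listener (placeholder paths and passwords)
ssl.keystore.location=/var/private/ssl/ksql.server.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit

# Trust store used to verify peer certificates
ssl.truststore.location=/var/private/ssl/ksql.server.truststore.jks
ssl.truststore.password=changeit
```

A dual-listener deployment, described next, additionally dedicates a separate listener (typically with mutual TLS) to inter-node traffic.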
## Securing dual-listener setup Using dual listeners for ksqlDB is appropriate when the client and inter-node communication utilize different authentication and security configurations. This is most likely the case when ksqlDB is deployed as an IaaS service. The supported setups are SSL-mutual auth for the internal communication combined with SSL encryption and authentication for the external client: - [Configuring internal for SSL-mutual authentication](#ksqldb-installation-security-configuring-internal-for-ssl-mutual-authentication): Creates secure and authenticated connections for inter-node communication, but leaves the external client API unsecured. This is most appropriate when clients are trusted, but the internal APIs are protected from use. - [Configuring internal for SSL-mutual authentication and external for HTTP-BASIC authentication](#ksqldb-installation-security-configuring-internal-for-ssl-mutual-and-external-for-http-basic): Creates secure and authenticated connections for inter-node communication and uses basic authentication for the external client API. This is most likely to be used with SSL above. - [Configuring internal for SSL-mutual authentication and external for SSL encryption](#ksqldb-installation-security-configuring-internal-for-ssl-mutual-external-for-ssl-encryption): Creates secure and authenticated connections for inter-node communication and uses SSL for the external client API. This is most likely to be used with authentication below. ### Configuration Options These properties are available to specify for the cluster link. If you disable a feature that has filters (ACL sync, consumer offset sync, auto create mirror topics) after having it enabled initially, then any existing filters will be cleared (deleted) from the cluster link. `acl.filters` : JSON string that lists the ACLs to migrate. Define the ACLs in a file, `acl.filters.json`, and pass the file name as an argument to `--acl-filters-json-file`. See [Migrating ACLs from Source to Destination Cluster](security.md#cluster-link-acls-migrate) for examples of how to define the ACLs in the JSON file. * Type: string * Default: “” #### NOTE Populate `acl.filters` by passing a JSON file on the command line that specifies the ACLs as described in [Migrating ACLs from Source to Destination Cluster](security.md#cluster-link-acls-migrate). `acl.sync.enable` : Whether or not to migrate ACLs. To learn more, see [Migrating ACLs from Source to Destination Cluster](security.md#cluster-link-acls-migrate). * Type: boolean * Default: false `acl.sync.ms` : How often to refresh the ACLs, in milliseconds (if ACL migration is enabled). The default is 5000 milliseconds (5 seconds). * Type: int * Default: 5000 `auto.create.mirror.topics.enable` : Whether or not to auto-create mirror topics based on topics on the source cluster. When set to “true”, mirror topics will be auto-created. Setting this option to “false” disables mirror topic creation and clears any existing filters. For details on this option, see [auto-create mirror topics](/platform/current/multi-dc-deployments/cluster-linking/mirror-topics-cp.html#auto-create-mirror-topics). `auto.create.mirror.topics.filters` : A JSON object with one property, `topicFilters`, that contains an array of filters to apply to indicate which topics should be mirrored. For details on this option, see [auto-create mirror topics](/platform/current/multi-dc-deployments/cluster-linking/mirror-topics-cp.html#auto-create-mirror-topics). 
`cluster.link.prefix` : A prefix that is applied to the names of the mirror topics. The same prefix is applied to consumer groups when [consumer.group.prefix.enable](#consumer-group-prefix) is set to `true`. To learn more, see “Prefixing Mirror Topics and Consumer Group Names” in [Mirror Topics](mirror-topics-cp.md#mirror-topics-concepts). #### NOTE The prefix cannot be changed after the cluster link is created. * Type: string * Default: null `cluster.link.paused` : Whether or not the cluster link is paused. The default is false (not paused). * Type: boolean * Default: false `cluster.link.retry.timeout.ms` : The number of milliseconds after which failures are no longer retried and partitions are marked as failed. If the source topic is deleted and re-created within this timeout, the link may contain records from the old as well as the new topic. * Type: int * Default: 300000 (5 minutes) `availability.check.ms` : How often the cluster link checks to see if the source cluster is available. The frequency with which the cluster link checks is specified in milliseconds. * Type: int * Default: 60000 (1 minute) A cluster link regularly checks whether the source cluster is still available for mirroring data by performing a `DescribeCluster` operation (bounded by `default.api.timeout.ms`). If the source cluster becomes unavailable (for example, because of an outage or disaster), then the cluster link signals this by updating its status and the status of its mirror topics. `availability.check.ms` works in tandem with [availability.check.consecutive.failure.threshold](#cluster-link-availability-check-consecutive-failure-threshold). `availability.check.consecutive.failure.threshold` : The number of consecutive failed availability checks the source cluster is allowed before the cluster link status becomes `SOURCE_UNAVAILABLE`. * Type: int * Default: 5 If, for example, the default (5) is used, the source cluster is determined to be unavailable after 5 failed checks in a row. If [availability.check.ms](#cluster-link-availability-check-ms) and `default.api.timeout.ms` are also set to their defaults of 1 minute and there are 5 failed checks, then the cluster link will show as `SOURCE_UNAVAILABLE` after 5 \* (1+1) mins = 10 minutes. That is, source unavailability is detected after `availability.check.consecutive.failure.threshold` \* (`default.api.timeout.ms` + `availability.check.ms`), taking into account the `DescribeCluster` operation performed as part of [availability.check.ms](#cluster-link-availability-check-ms). `connections.max.idle.ms` : Idle connections timeout. The server socket processor threads close any connections that are idle longer than this. * Type: int * Default: 600000 `connection.mode` : Used only for source-initiated links. Set this to INBOUND on the destination cluster’s link (which you create first). Set this to OUTBOUND on the source cluster’s link (which you create second). You must use this in combination with `link.mode`. This property should only be set for source-initiated cluster links. * Type: string * Default: OUTBOUND `consumer.offset.group.filters` : JSON to denote the list of consumer groups to be migrated. To learn more, see [Migrating consumer groups from source to destination cluster](commands.md#cluster-link-migrate-consumer-groups). * Type: string * Default: “” #### NOTE Consumer group filters should only include groups that are not being used on the destination.
This will help ensure that the system does not overwrite offsets committed by other consumers on the destination. The system attempts to work around filters containing groups that are also used on the destination, but in these cases there are no guarantees; offsets may be overwritten. For mirror topic “promotion” to work, the system must be able to roll back offsets, which cannot be done if the group is being used by destination consumers. `consumer.offset.sync.enable` : Whether or not to migrate consumer offsets from the source cluster. If you set this up and run Cluster Linking, then later disable it, the filters will be cleared (deleted) from the cluster link. * Type: boolean * Default: false `consumer.offset.sync.ms` : How often to sync consumer offsets, in milliseconds, if enabled. * Type: int * Default: 30000 `consumer.group.prefix.enable` : When set to `true`, the prefix specified for the [cluster link prefix](#cluster-link-prefix) is also applied to the names of consumer groups. The cluster link prefix must be specified in order for the consumer group prefix to be applied. To learn more, see “Prefixing Mirror Topics and Consumer Group Names” in [Mirror Topics](mirror-topics-cp.md#mirror-topics-concepts). * Type: boolean * Default: false #### NOTE Consumer group prefixing cannot be enabled for bidirectional links. `num.cluster.link.fetchers` : Number of fetcher threads used to replicate messages from source brokers in cluster links. * Type: int * Default: 1 `topic.config.sync.ms` : How often to refresh the topic configs, in milliseconds. * Type: int * Default: 5000 `topic.config.sync.include` : The list of topic configs to sync from the source topic. By default, certain topic configurations are synced from the source topic to the mirror topic to ensure consistency. This parameter allows you to explicitly specify which topic configurations should be synced, giving you control over which properties are copied from source to destination. To learn more, see [Override default syncing to specify independent mirror topic behavior](mirror-topics-cp.md#override-default-mirror-topic-syncs-cp) in [Mirror Topics](mirror-topics-cp.md#mirror-topics-concepts). * Type: string * Default: (all default sync configs are included) `link.fetcher.flow.control` : Maximum lag between the high watermark and the log end offset after which Cluster Linking will stop fetching. This is to synchronize the Cluster Linking fetch rate and the in-sync replica (ISR) fetch rate to avoid being under the minimum ISR. Setting this value specifies the flow control approach. * Type: int * Default: 0 The following values select the flow control approach: - `>=0`: Lag approach. - `-1`: Under min ISR approach. `-1` means the maximum lag is not enforced. Cluster Linking fetch will stop when the partition is under the minimum ISR. - `-2`: Under-replicated partition approach. `-2` specifies that Cluster Linking fetch will stop when the partition is under-replicated. If a broker goes down on the destination cluster due to an outage or planned failover (for example, proactively shutting down a broker), mirror topics will lag source topics on under-replicated partitions at the destination. To minimize or resolve mirror topic lag in these scenarios, set `link.fetcher.flow.control=-1`. `local.listener.name` : For a source-initiated link, an alternative listener to be used by the cluster link on the source cluster.
For more, see [Understanding Listeners in Cluster Linking](#cluster-link-listeners). `link.mode` : Used only for source-initiated links. Set this to DESTINATION on the destination cluster’s link (which you create first). Set this to SOURCE on the source cluster’s link (which you create second). For [bidirectional mode](#bidirectional-cluster-linking), set this to BIDIRECTIONAL on both clusters. You must use this in combination with `connection.mode`. This property should only be set for source-initiated cluster links. * Type: string * Default: DESTINATION `mirror.start.offset.spec` : Whether to get the full history of a mirrored topic (`earliest`), exclude the history and get only the `latest` version, or to get the history of the topic starting at a given timestamp. * Type: string * Default: earliest - If set to a value of `earliest` (the default), new mirror topics get the full history of their associated topics. - If set to a value of `latest`, new mirror topics will exclude the history and only replicate messages sent after the mirror topic is created. - If set to a timestamp in ISO 8601 format (`YYYY-MM-DDTHH:mm:SS.sss`), new mirror topics get the history of the topics starting from the timestamp. When a mirror topic is created, it reads the value of this configuration and begins replication accordingly. If the setting is changed, it does not affect existing mirror topics; new mirror topics use the new value when they’re created. If some mirror topics need to start from earliest and some need to start from latest, there are two options: - Change the value of the cluster link’s `mirror.start.offset.spec` to the desired starting position before creating the mirror topic, or - Use two distinct cluster links, each with their own value for `mirror.start.offset.spec`, and create mirror topics on the appropriate cluster link as desired. #### Default security config for bidirectional connectivity By default, a cluster link in bidirectional mode is configured similarly to the default configuration for two cluster links. ![image](multi-dc-deployments/cluster-linking/images/cluster-link-bidirectional-security.png) Each cluster requires: - The ability to connect (outbound) to the other cluster. (If this is not possible, see [Advanced options for bidirectional Cluster Linking](#cluster-linking-bidirectional-advanced).) - A user to create a cluster link object on it with: - An authentication configuration (such as API key or OAuth) for a principal on its remote cluster with ACLs or RBAC role bindings giving permission to read topic data and metadata. - The `Describe:Cluster` ACL - The `DescribeConfigs:Cluster` ACL if consumer offset sync is enabled (which is recommended) - The required ACLs or RBAC role bindings for a cluster link, as described in [Authorization (ACLs)](security.md#cluster-link-acls) (the rows for a cluster link on a source cluster). - `link.mode=BIDIRECTIONAL` ### ACLs Overview Replicator supports communication with secure Kafka over TLS/SSL for both the source and destination clusters. Replicator also supports TLS/SSL or SASL for authentication. Differing security configurations can be used on the source and destination clusters. All properties documented here are additive (i.e. you can apply both TLS/SSL Encryption and SASL Plain authentication properties) except for `security.protocol`.
The following table can be used to determine the correct value for `security.protocol`: | Encryption | Authentication | security.protocol | |--------------|------------------|---------------------| | TLS/SSL | None | SSL | | TLS/SSL | TLS/SSL | SSL | | TLS/SSL | SASL | SASL_SSL | | Plaintext | SASL | SASL_PLAINTEXT | You can configure Replicator connections to source and destination Kafka with: - [TLS/SSL Encryption](../../security/protect-data/encrypt-tls.md#encryption-ssl-replicator). You can use different TLS/SSL configurations on the source and destination clusters. - [SSL Authentication](../../security/authentication/mutual-tls/overview.md#authentication-ssl-replicator) - [SASL/SCRAM](../../security/authentication/sasl/scram/overview.md#sasl-scram-replicator) - [SASL/GSSAPI](../../security/authentication/sasl/gssapi/overview.md#sasl-gssapi-replicator) - [SASL/PLAIN](../../security/authentication/sasl/plain/overview.md#sasl-plain-replicator) To configure security on the source cluster, see the connector configurations for [Source Kafka: Security](configuration_options.md#source-security-config). To configure security on the destination cluster, see the connector configurations [Destination Kafka: Security](configuration_options.md#destination-security-config) and the general security configuration for Connect workers [here](../../connect/security.md#connect-security). #### IMPORTANT - For current versions of Replicator, it is recommended that you use the previously mentioned JMX metrics to monitor Replicator lag, because they are more accurate than the consumer group lag tool. The following methodology to monitor Replicator lag is only recommended if you are using Replicator with a legacy version below 5.4.0. - Replicator latency is calculated by taking the timestamp of the record consumed on the source and subtracting that from the time at which the message offset is flushed to the destination. If old records are processed or if the time setting on the source records is not the same for producers and consumers, then the metric will spike. This is misleading and should not be construed as a latency issue, but rather is a limitation of this type of metrics calculation. You can monitor Replicator lag by using the [Consumer Group Command tool](https://kafka.apache.org/documentation/#basic_ops_consumer_lag) (`kafka-consumer-groups`). To use this functionality, you must set the Replicator `offset.topic.commit` config to `true` (the default value). Replicator does not consume using a consumer group; instead, it manually assigns partitions. When `offset.topic.commit` is true, Replicator commits consumer offsets (again manually), but these are for reference only and do not represent an active consumer group. Since Replicator only commits offsets and does not actually form a consumer group, the `kafka-consumer-groups` command output will (correctly) show no active members in the group, only the committed offsets. This is expected behavior for Replicator. To check membership information, use Connect status endpoints rather than `kafka-consumer-groups`. Replication lag is the number of messages that were produced to the origin cluster, but have not yet arrived at the destination cluster. It can also be measured as the amount of time it currently takes for a message to get replicated from origin to destination. Note that this can be higher than the latency between the two datacenters if Replicator is behind for some reason and needs time to catch up.
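For reference, the committed offsets described above can be inspected with the Consumer Group Command tool to estimate per-partition lag. The following is a minimal sketch, assuming the command is run against the cluster where Replicator commits its reference offsets and that the group name matches the one configured for your Replicator connector; the bootstrap address and group name are placeholders:

```bash
# Describe the offsets that Replicator commits for reference.
# "replicator-group" and the bootstrap address are placeholders; the group
# shows no active members, which is expected for Replicator.
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --describe --group replicator-group
```

The LAG column in the output approximates the number of messages not yet replicated for each partition.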
The main reasons to monitor replication lag are: * If there is a need to fail over from origin to destination and *if the origin cannot be restored*, all events that were produced to origin and not replicated to the target will be lost. (If the origin *can* be restored, the events will not be lost.) * Any event processing that happens at the destination will be delayed by the lag. The lag is typically just a few hundred milliseconds (depending on the network latency between the two datacenters), but it can grow larger if network partitions or configuration changes temporarily pause replication and the replicator needs to catch up. If the replication lag keeps growing, it indicates that Replicator throughput is lower than what gets produced to the origin cluster and that additional Replicator tasks or Connect Workers are necessary. For example, producers might be writing 100 MBps to the origin cluster while Replicator replicates only 50 MBps. To increase the throughput, the TCP socket buffer should be increased on the Replicator and the brokers. When Replicator is running in the destination cluster (recommended), you must also increase the following: - The TCP send socket buffer (`socket.send.buffer.bytes`) on the source cluster brokers. - The TCP receive socket buffer (`socket.receive.buffer.bytes`) on the consumers. A value of 512 KB is reasonable, but you may want to experiment with values up to 12 MB. If you are using Linux, you might need to change the default socket buffer maximum for the Kafka settings to take effect. For more information about tuning your buffers, [see this article](https://www.cyberciti.biz/faq/linux-tcp-tuning/). ## Unified Stream Manager Unified Stream Manager (USM) is now generally available with Confluent Platform 8.1. USM registers your on-premises Confluent Platform cluster with Confluent Cloud to provide a single pane of glass for your data streams. With USM, you can do the following: - Use a global policy catalog to enforce data contracts and encryption rules. - View unified data lineage across all your clusters and infrastructures. - View and troubleshoot topics and connectors across Confluent Platform and Confluent Cloud from a single, centralized interface. This release introduces the USM agent component to enable a secure, private network connection to Confluent Cloud: - All communication occurs over private networking. - The agent initiates connections from your private environment over a limited set of endpoints. - Only telemetry and resource metadata are shared with Confluent Cloud. Kafka brokers and Connect workers include updated embedded reporters that send the necessary telemetry and resource metadata. To support the global policy catalog, Confluent Platform Schema Registry provides a read-only mode and a Schema Importer. With these features, Confluent Cloud Schema Registry can serve as the source of truth, while the on-premises Confluent Platform Schema Registry acts as a read-only cache and forwards all write requests to Confluent Cloud. USM is designed to share only telemetry and metadata between Confluent Platform and Confluent Cloud. This limited data sharing lets you adopt USM without needing to accept the full Confluent Cloud data processing addendum. For more information about USM, see [Unified Stream Manager in Confluent Platform](../usm/overview.md#usm-overview). ### What is the Schema Registry endpoint and how is it surfaced on Confluent Cloud Console?
The Schema Registry endpoint (also referred to as `schema-registry-url`) is the API endpoint URL for your Confluent Cloud Schema Registry cluster in a specific environment. It’s used to make REST API calls to the Schema Registry service for operations such as: - Creating, reading, updating, and deleting schemas - Managing schema subjects and versions - Configuring compatibility settings - Performing schema validation and evolution operations This is surfaced in the Confluent Cloud Console as the Schema Registry endpoint. To find it, navigate to your Confluent Cloud environment, select a cluster, click **Schema Registry** on the left menu, and click the **Endpoints** tab. To find the Schema Registry endpoints using the Confluent CLI (see the [Confluent CLI Command reference](https://docs.confluent.io/confluent-cli/current/command-reference/overview.html)), run `confluent schema-registry cluster describe` (after selecting the appropriate environment with `confluent environment use <environment-id>`). On the Confluent CLI, the flag for the Schema Registry URL is `--schema-registry-endpoint`, as described in [confluent schema-registry cluster describe](https://docs.confluent.io/confluent-cli/current/command-reference/schema-registry/cluster/confluent_schema-registry_cluster_describe.html). You can also list endpoints using [confluent schema-registry endpoint list](https://docs.confluent.io/confluent-cli/current/command-reference/schema-registry/endpoint/confluent_schema-registry_endpoint_list.html). Note that in current versions, `confluent schema-registry cluster describe` returns only the PrivateLink Attachment private endpoints, whereas `confluent schema-registry endpoint list` lists all endpoints including the Confluent Cloud network. To learn more about working with schemas on the Cloud Console, see the [Schema Management Quick Start](/cloud/current/get-started/schema-registry.html) and [Manage Schemas on Confluent Cloud](/cloud/current/sr/schemas-manage.html). To learn about working with Schema Registry endpoints using the APIs, see the [Stream Catalog REST API Usage and Examples Guide](/cloud/current/stream-governance/stream-catalog-rest-apis.html) and [Confluent Cloud Schema Registry REST API Usage Examples](/cloud/current/sr/sr-rest-apis.html). ## Compatibility and schema evolution Apache Kafka® producers write data to Kafka topics and Kafka consumers read data from Kafka topics. There is an implicit “contract” that producers write data with a schema that can be read by consumers, even as producers and consumers evolve their schemas. Schema Registry helps ensure that this contract is met with compatibility checks. It is useful to think about schemas as APIs. Applications depend on APIs and expect that any changes made to APIs remain compatible so that applications can still run. Similarly, streaming applications depend on schemas and expect that any changes made to schemas remain compatible so that they can still run. Schema evolution requires compatibility checks to ensure that the producer-consumer contract is not broken. This is where Schema Registry helps: it provides centralized schema management and compatibility checks as schemas evolve.
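As a point of reference, producers and consumers participate in this contract by pointing their serializers and deserializers at the Schema Registry endpoint discussed above. The following is a minimal, producer-side sketch of the relevant client properties; the endpoint URL and the API key and secret are placeholders:

```properties
# Serialize record values with Avro and register/look up schemas in Schema Registry
value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer

# Schema Registry endpoint (placeholder URL) and API key credentials
schema.registry.url=https://psrc-xxxxx.us-east-2.aws.confluent.cloud
basic.auth.credentials.source=USER_INFO
basic.auth.user.info=<SR_API_KEY>:<SR_API_SECRET>
```

With these settings, schema registration requests are checked against the subject’s compatibility mode on the Schema Registry side, so an incompatible schema change is rejected before a producer can write data with it.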
To learn more about how Schema Registry manages compatibility, see the following topic in either the Confluent Cloud or Confluent Platform documentation: - Confluent Cloud documentation: [Schema Evolution and Compatibility for Schema Registry](/cloud/current/sr/fundamentals/schema-evolution.html) - Confluent Platform documentation: [Schema Evolution and Compatibility for Schema Registry](/platform/current/schema-registry/fundamentals/schema-evolution.html) ## Limitations - Currently, when using Confluent Replicator to migrate schemas, Confluent Cloud is not supported as the source cluster. Confluent Cloud can only be the destination cluster. As an alternative, you can migrate schemas using the [REST API for Schema Registry](../develop/api.md#schemaregistry-api) to achieve the desired deployments. Specifics regarding Confluent Cloud limits on schemas and managing storage space are described in the API reference in [Manage Schemas in Confluent Cloud](/cloud/current/sr/index.html). - Replicator does not support an “active-active” Schema Registry setup. It only supports migration (either one-time or continuous) from an active Schema Registry to a passive Schema Registry. - Newer versions of Replicator cannot be used to replicate data from early-version Kafka clusters to [Confluent Cloud](/cloud/current/index.html). Specifically, Replicator version 5.4.0 or later cannot be used to replicate data from clusters running Apache Kafka® v0.10.2 or earlier, or from Confluent Platform v3.2.0 or earlier, to Confluent Cloud. If you have clusters on these earlier versions, use Replicator 5.0.x to replicate to Confluent Cloud until you can upgrade. Keep in mind the following, and plan your upgrades accordingly: - Kafka Connect workers included in Confluent Platform 3.2 and later are compatible with any Kafka broker that is included in Confluent Platform 3.0 and later, as documented in [Cross-component compatibility](../../installation/versions-interoperability.md#cross-component-compatibility). - Confluent Platform 5.0.x has an end-of-support date of July 31, 2020, as documented in [Supported Versions and Interoperability for Confluent Platform](../../installation/versions-interoperability.md#interoperability-versions). ### Contexts and exporters Schema Registry introduces two new concepts to support Schema Linking: - **Contexts** - A [context](#schema-contexts) represents an independent scope in Schema Registry, and can be used to create any number of separate “sub-registries” within one Schema Registry cluster. Each schema context is an independent grouping of schema IDs and subject names, allowing the same schema ID in different contexts to represent completely different schemas. Any schema ID or subject name without an explicit context lives in the default context, denoted by a single dot `.`. An explicit context starts with a dot and can contain any parts separated by additional dots, such as `.mycontext.subcontext`. Context names work similarly to absolute Unix paths, but with dots instead of forward slashes (the default context is like the root Unix path). However, there is no relationship between two contexts that share a prefix. - **Exporters** - A [schema exporter](#schema-exporters) is a component that resides in Schema Registry for exporting schemas from one Schema Registry cluster to another. The lifecycle of a schema exporter is managed through APIs, which are used to create, pause, resume, and destroy a schema exporter.
A schema exporter is like a “mini-connector” that can perform change data capture for schemas. The Quick Start below shows you how to get started using schema exporters and contexts for Schema Linking. For in-depth descriptions of these concepts, see [Contexts](#schema-contexts) and [Exporters](#schema-exporters) # Authentication in Confluent Platform * [Overview](overview.md) * [Mutual TLS](mutual-tls/index.md) * [Overview](mutual-tls/overview.md) * [Use Principal Mapping](mutual-tls/tls-principal-mapping.md) * [OAuth/OIDC](oauth-oidc/index.md) * [Overview](oauth-oidc/overview.md) * [Claim Validation for OAuth JWT tokens](oauth-oidc/configure-oauth-jwt.md) * [OAuth/OIDC Service-to-Service Authentication](oauth-oidc/service-to-service.md) * [Configure Confluent Server Brokers](oauth-oidc/configure-cs.md) * [Configure Confluent Schema Registry](oauth-oidc/configure-sr.md) * [Configure Metadata Service](oauth-oidc/configure-mds.md) * [Configure Kafka Connect](oauth-oidc/configure-connect.md) * [Configure Confluent Control Center](oauth-oidc/configure-c3.md) * [Configure REST Proxy](oauth-oidc/configure-rest-proxy.md) * [Configure Truststores for TLS Handshake with Identity Providers](oauth-oidc/configure-truststore.md) * [Migrate from mTLS to OAuth Authentication](oauth-oidc/migrate-from-mtls-to-oauth.md) * [Use OAuth with ksqlDB](oauth-oidc/ksql-integration.md) * [Multi-Protocol Authentication](multi-protocol/index.md) * [Overview](multi-protocol/overview.md) * [Use AuthenticationHandler Class](multi-protocol/authenticationhandler.md) * [REST Proxy](rest-proxy/index.md) * [Overview](rest-proxy/overview.md) * [Principal Propagation for mTLS](rest-proxy/principal-propagation.md) * [SSO for Confluent Control Center](sso-for-c3/index.md) * [Overview](sso-for-c3/overview.md) * [Configure OIDC SSO for Control Center](sso-for-c3/configure-sso-using-oidc.md) * [Configure OIDC SSO for Confluent CLI](sso-for-c3/configure-sso-for-cli.md) * [Troubleshoot](sso-for-c3/troubleshoot.md) * [HTTP Basic Authentication](http-basic-auth/index.md) * [Overview](http-basic-auth/overview.md) * [SASL](sasl/index.md) * [Overview](sasl/overview.md) * [SASL/GSSAPI (Kerberos)](sasl/gssapi/index.md) * [SASL/OAUTHBEARER](sasl/oauthbearer/index.md) * [SASL/PLAIN](sasl/plain/index.md) * [SASL/SCRAM](sasl/scram/index.md) * [LDAP](ldap/index.md) * [Overview](ldap/overview.md) * [Configure Kafka Clients](ldap/client-authentication-ldap.md) * [Delegation Tokens](delegation-tokens/index.md) * [Overview](delegation-tokens/overview.md) # Confluent Metadata API Reference for Confluent Platform The Confluent Metadata API has many endpoints, conceptually grouped as follows: **Authentication** Authenticates users against LDAP and returns user bearer tokens that can be used with the other MDS endpoints and components in Confluent Platform (when configured to do so). **Authorization** Authorizes users to perform specific actions. Clients are not expected to use these endpoints, which are used by Confluent Platform components (such as Connect and ksqlDB) to authorize user actions. **Role Based Access Control** * Role binding CRUD * Role binding summaries (used by Confluent CLI) * High-level role binding management and rollups (used by Confluent Control Center ) **Centralized ACL control** ACL CRUD for legacy Kafka-managed and centralized MDS-based ACLs **Audit log configuration** Configuration governing which events get logged, and where those audit log events are sent. 
Works in conjunction with the Cluster Registry to push configuration changes to Kafka clusters. **Cluster registry** Tracking and naming CP components and clusters. * Manually populated and updated by Admins. * Leveraged by the Audit Log configuration. * Leveraged by RBAC APIs to allow for RoleBinding calls to use “nice names” for clusters instead of cluster IDs. ## Use CSFLE for Confluent Enterprise Client-side field level encryption (CSFLE) is available in Confluent Enterprise to help you protect sensitive data in your Confluent Enterprise and perform stream processing on encrypted data. You can use CSFLE with Confluent Enterprise without sharing access to your [Key Encryption Keys (KEKs)](../../../_glossary.md#term-key-encryption-key-KEK). Here are some key points about using CSFLE: * You must use a key management service (KMS) to manage access to your Key Encryption Keys (KEKs). * Extensive security checks and balances provided by Confluent protect your sensitive data. * No user or application in Confluent Enterprise can access your encrypted fields in plaintext. * Stream processing in Confluent Enterprise using Flink and ksqlDB is not possible because the data is encrypted and cannot be decrypted to perform operations. * Your organization manages running producers and consumers with the proper configurations to access the KEKs and encrypt or decrypt data. * You own and manage your Key Encryption Keys (KEKs) and are responsible for overseeing the entire lifecycle of the KEKs. * Sharing access to KEKs is not supported. * Confluent never directly accesses your Key Encryption Keys (KEKs). Each KEK remains securely stored in your key management service (KMS) that is owned and managed by you. Confluent interacts with two APIs that use a KEK identifier and a payload to either encrypt or decrypt the payload with the specified KEK. Confluent can only see the KEK identifier and the payloads (encrypted or decrypted DEKs). * Use the logging and auditing capabilities provided by your KMS to monitor and trace all access to KEKs to address any compliance or regulatory requirements. The steps are summarized in the diagram below. ![Steps for client-side field level encryption and access control](images/csfle-no-cmk-access.png) ## Overview This tutorial provides a step-by-step example to enable [TLS/SSL encryption](protect-data/encrypt-tls.md#kafka-ssl-encryption), [SASL authentication](authentication/overview.md#kafka-sasl-auth), and [authorization](authorization/acls/overview.md#kafka-authorization) on Confluent Platform with monitoring using Confluent Control Center. Follow the steps to walk through configuration settings for securing Apache Kafka® brokers, Kafka Connect, and Confluent Replicator, plus all the components required for monitoring, including the Confluent Metrics Reporter. When working through the tutorial, be aware of the following: * For simplicity, this tutorial uses [SASL/PLAIN (or PLAIN)](authentication/sasl/plain/overview.md#kafka-sasl-auth-plain), a simple username/password authentication mechanism typically used with TLS encryption to implement secure authentication.
* For production deployments of Confluent Platform, [SASL/GSSAPI (Kerberos)](authentication/sasl/gssapi/overview.md#kafka-sasl-auth-gssapi) or [SASL/SCRAM](authentication/sasl/scram/overview.md#kafka-sasl-auth-scram) is recommended. * Confluent Cloud uses [SASL/PLAIN (or PLAIN)](authentication/sasl/plain/overview.md#kafka-sasl-auth-plain) over TLS v1.2 encryption for authentication because it offers broad client support while providing a good level of security. The usernames and passwords used in the SASL exchange are API keys and secrets that should be securely managed using a secrets store and rotated periodically. ## Next steps To see a fully secured multi-node cluster, check out the Docker-based [Confluent Platform demo](../tutorials/cp-demo/index.md#cp-demo). It shows entire configurations, including security-related and non security-related configuration parameters, on all components in Confluent Platform, and the demo’s playbook has a security section for further learning. Read the [documentation](overview.md#security) for more details about security design and configuration on all components in Confluent Platform. While this tutorial uses the PLAIN mechanism for the SASL examples, Confluent additionally supports [GSSAPI (Kerberos)](authentication/sasl/gssapi/overview.md#kafka-sasl-auth-gssapi) and [SCRAM](authentication/sasl/scram/overview.md#kafka-sasl-auth-scram), which are more suitable for production. We welcome feedback in the [Confluent community](https://launchpass.com/confluentcommunity) security channel in Slack! ## Overview This example shows users how to build pipelines with Apache Kafka® in Confluent Platform. ![image](streams/images/pipeline.jpg) It showcases different ways to produce data to Kafka topics, with and without Kafka Connect, and various ways to serialize it for the Kafka Streams API and ksqlDB. | Example | Produce to Kafka Topic | Key | Value | Stream Processing | |-----------------------------------------|--------------------------------|--------|--------------|---------------------| | Confluent CLI Producer with String | CLI | String | String | Kafka Streams | | JDBC source connector with JSON | JDBC with SMT to add key | Long | Json | Kafka Streams | | JDBC source connector with SpecificAvro | JDBC with SMT to set namespace | null | SpecificAvro | Kafka Streams | | JDBC source connector with GenericAvro | JDBC | null | GenericAvro | Kafka Streams | | Java producer with SpecificAvro | Producer | Long | SpecificAvro | Kafka Streams | | JDBC source connector with Avro | JDBC | Long | Avro | ksqlDB | Detailed walk-thru of this example is available in the whitepaper [Kafka Serialization and Deserialization (SerDes) Examples](https://www.confluent.io/resources/kafka-streams-serialization-deserialization-code-examples) and the blog post [Building a Real-Time Streaming ETL Pipeline in 20 Minutes](https://www.confluent.io/blog/building-real-time-streaming-etl-pipeline-20-minutes/) ### Optional configuration parameters Here are the optional [Streams configuration parameters](../javadocs.md#streams-javadocs), with the level of importance indicated for each: - High: These parameters can have a significant impact on performance. Take care when deciding the values of these parameters. - Medium: These parameters can have some impact on performance. Your specific environment will determine how much tuning effort should be focused on these parameters. - Low: These parameters have a less general or less significant impact on performance. 
| Parameter Name | Importance | Description | Default Value | |-----------------------------------------------|--------------|--------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------| | acceptable.recovery.lag | Medium | The maximum acceptable lag (number of offsets to catch up) for an instance to be considered caught-up and ready for the active task. | 10,000 | | application.server | Low | A host:port pair pointing to an embedded user-defined endpoint that can be used for discovering the locations of state stores within a single Kafka Streams application. The value of this must be different for each instance of the application. | the empty string | | buffered.records.per.partition | Low | The maximum number of records to buffer per partition. | 1000 | | cache.max.bytes.buffering | Medium | Deprecated in Confluent Platform 7.4. Use `statestore.cache.max.bytes` instead. | 10485760 bytes | | client.id | Medium | An ID string to pass to the server when making requests. (This setting is passed to the consumer/producer clients used internally by Kafka Streams.) | the empty string | | commit.interval.ms | Low | The frequency with which to save the position (offsets in source topics) of tasks. - For at-least-once processing, committing means saving the position (offsets) of the processor. - For exactly-once processing, it means to commit the transaction, which includes saving the position. | 30000 ms (`at_least_once`) / 100 ms (`exactly_once_v2`) | | connections.max.idle.ms | Low | The number of milliseconds to wait before closing idle connections. | 540000 ms (9 minutes) | | default.client.supplier | Low | Client supplier class that implements the `org.apache.kafka.streams.KafkaClientSupplier` interface. | | | default.deserialization.exception.handler | Medium | Deprecated. Use `deserialization.exception.handler` instead. | See [default.deserialization.exception.handler](#streams-developer-guide-deh) | | default.dsl.store | Low | Deprecated in Confluent Platform 7.7. The default state store type used by DSL operators. | “rocksDB” | | default.key.serde | Medium | Default serializer/deserializer class for record keys, implements the `Serde` interface (see also value.serde). | `null` | | default.production.exception.handler | Medium | Exception handling class that implements the `ProductionExceptionHandler` interface. | See [default.production.exception.handler](#streams-def-prod-exc-hand) | | default.timestamp.extractor | Medium | Default timestamp extractor class that implements the `TimestampExtractor` interface. | See [Timestamp Extractor](#streams-developer-guide-timestamp-extractor) | | default.value.serde | Medium | Default serializer/deserializer class for record values, implements the `Serde` interface (see also key.serde). | `null` | | default.windowed.key.serde.inner | Medium | Deprecated in Confluent Platform 7.9. Use `windowed.inner.class.serde` instead. | `Serdes.ByteArray().getClass().getName()` | | default.windowed.value.serde.inner | Medium | Deprecated in Confluent Platform 7.9. Use `windowed.inner.class.serde` instead. | `Serdes.ByteArray().getClass().getName()` | | dsl.store.suppliers.class | Low | Defines a default state store implementation. | `BuiltInDslStoreSuppliers` |
| `BuiltInDslStoreSuppliers` | | enable.metrics.push | Medium | Push client metrics to the cluster, if the cluster has a client metrics subscription that matches this client. | | | ensure.explicit.internal.resource.naming | Medium | Enables enforcement of explicit naming for all internal resources of the topology, including internal topics. | `false` | | group.protocol | Low | The protocol used for group coordination. | `classic` | | log.summary.interval.ms | Low | Added to a window’s `maintainMs` to ensure data is not deleted from the log prematurely. Allows for clock drift. | 120000 milliseconds (2 minutes) | | max.task.idle.ms | Medium | Maximum amount of time Kafka Streams waits to fetch data to ensure in-order processing semantics. | 0 milliseconds | | max.warmup.replicas | Medium | The maximum number of warmup replicas (extra standbys beyond the configured num.standbys) that can be assigned at once. | 2 | | metadata.max.age.ms | Low | The period of time in milliseconds after which a refresh of metadata is forced. | 300000 ms (5 minutes) | | metric.reporters | Low | A list of classes to use as metrics reporters. | the empty list | | metrics.num.samples | Low | The number of samples maintained to compute metrics. | 2 | | metrics.recording.level | Low | The highest recording level for metrics. | `INFO` | | metrics.sample.window.ms | Low | The window of time a metrics sample is computed over. | 30000 milliseconds | | num.standby.replicas | High | The number of standby replicas for each task. | 0 | | num.stream.threads | Medium | The number of threads to execute stream processing. | 1 | | poll.ms | Low | The amount of time in milliseconds to block waiting for input. | 100 milliseconds | | probing.rebalance.interval.ms | Low | The maximum time to wait before triggering a rebalance to probe for warmup replicas that have sufficiently caught up. | 600000 milliseconds (10 minutes) | | processing.exception.handler | Medium | Exception handling class that implements the `ProcessingExceptionHandler` interface. | `LogAndFailProcessingExceptionHandler` | | processing.guarantee | Medium | The processing mode. Can be either `at_least_once` (default), or `exactly_once_v2` (for EOS version 2, requires Confluent Platform version 5.5.x / Kafka version 2.5.x or higher). Deprecated config options are `exactly_once` (for EOS version 1) and `exactly_once_beta` (for EOS version 2). | See [Processing Guarantee](#streams-developer-guide-processing-guarantee) | | production.exception.handler | Medium | Exception handling class that implements the `ProductionExceptionHandler` interface. For more information, see [production.exception.handler](#streams-developer-guide-production-exception-handler). | `DefaultProductionExceptionHandler` | | rack.aware.assignment.non_overlap_cost | Low | Cost associated with moving tasks from existing assignment. For more information, see [rack.aware.assignment.non_overlap_cost](#streams-developer-guide-rack-aware-assignment-non-overlap-cost). | `null` | | rack.aware.assignment.strategy | Low | The strategy used for rack-aware assignment. Values are “none” (default), “min_traffic”, and “balance_subtopology”. For more information, see [rack.aware.assignment.strategy](#streams-developer-guide-rack-aware-assignment-strategy). | `none` | | rack.aware.assignment.tags | Low | List of tag keys used to distribute standby replicas across Kafka Streams clients. | the empty list | | rack.aware.assignment.traffic_cost | Low | Cost associated with cross-rack traffic. 
| replication.factor | High | The replication factor for changelog topics and repartition topics created by the application. If your broker cluster is on version Confluent Platform 5.4.x (Kafka 2.4.x) or newer, you can set -1 to use the broker default replication factor. | 1 |
| retries | Medium | The number of retries for broker requests that return a retryable error. | 0 |
| retry.backoff.ms | Medium | The amount of time in milliseconds before a request is retried. This applies if the `retries` parameter is configured to be greater than 0. | 100 |
| rocksdb.config.setter | Medium | The RocksDB configuration. | |
| state.cleanup.delay.ms | Low | The amount of time in milliseconds to wait before deleting state when a partition has migrated. | 600000 milliseconds |
| state.dir | High | Directory location for state stores. | `/${java.io.tmpdir}/kafka-streams` |
| statestore.cache.max.bytes | Medium | Maximum number of memory bytes to be used for record caches across all threads. | 10485760 bytes |
| task.assignor.class | Medium | A task assignor class or class name implementing the `TaskAssignor` interface. | The high-availability task assignor. |
| task.timeout.ms | Medium | The maximum amount of time in ms a task might stall due to internal errors and retries until an error is raised. | 300000 milliseconds (5 minutes) |
| topology.optimization | Low | Enables/Disables topology optimization. | `NO_OPTIMIZATION` |
| upgrade.from | Medium | The version you are upgrading from during a rolling upgrade. | See [Upgrade From](#streams-developer-guide-upgrade-from) |
| windowed.inner.class.serde | Medium | Serde for the inner class of a windowed record. | |
| windowstore.changelog.additional.retention.ms | Low | Added to a window's `maintainMs` to ensure data is not deleted from the log prematurely. Allows for clock drift. | 86400000 milliseconds = 1 day |
| window.size.ms | Low | Sets window size for the deserializer in order to calculate window end times. | `null` |

##### Join co-partitioning requirements

For equi-joins, input data must be co-partitioned when joining. This ensures that input records with the same key, from both sides of the join, are delivered to the same stream task during processing. **It is your responsibility to ensure data co-partitioning when joining**. Co-partitioning is not required when performing [KTable-KTable Foreign-Key](#streams-developer-guide-dsl-joins-ktable-ktable-foreign-key) joins and [GlobalKTable](../concepts.md#streams-concepts-globalktable) joins.

The requirements for data co-partitioning are:

* The input topics of the join (left side and right side) must have the **same number of partitions**.
* All applications that *write* to the input topics must have the **same partitioning strategy** so that records with the same key are delivered to the same partition number. In other words, the keyspace of the input data must be distributed across partitions in the same manner. This means that, for example, applications that use Kafka's [Java Producer API](../../clients/overview.md#kafka-clients) must use the same partitioner (cf. the producer setting `"partitioner.class"` aka `ProducerConfig.PARTITIONER_CLASS_CONFIG`), and applications that use the Kafka Streams API must use the same `StreamPartitioner` for operations such as `KStream#to()` (see the sketch after this list).
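To make the second requirement concrete, the following sketch (the topic names, partition count, and partitioning logic are illustrative, not part of the original example) shows a producer-side configuration and a Kafka Streams `StreamPartitioner` that share the same partitioning logic, so records with the same key land in the same partition number on both input topics:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.common.utils.Utils;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.processor.StreamPartitioner;

public class CoPartitioningSketch {

    // Illustrative partitioning logic shared by both sides:
    // hash the key bytes and map the hash onto the partition count.
    static int partitionFor(String key, int numPartitions) {
        return Utils.toPositive(Utils.murmur2(key.getBytes())) % numPartitions;
    }

    // Producer side: a custom partitioner configured via partitioner.class
    // would implement org.apache.kafka.clients.producer.Partitioner and
    // delegate to partitionFor(...). (MyPartitioner is hypothetical.)
    static Properties producerConfig() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, MyPartitioner.class);
        return props;
    }

    // Streams side: use a StreamPartitioner with the same logic when writing
    // to the other input topic of the join.
    static void buildTopology(StreamsBuilder builder) {
        StreamPartitioner<String, String> partitioner =
            (topic, key, value, numPartitions) -> partitionFor(key, numPartitions);
        builder.stream("orders", Consumed.with(Serdes.String(), Serdes.String()))
               .to("orders-keyed", Produced.with(Serdes.String(), Serdes.String())
                                           .withStreamPartitioner(partitioner));
    }
}
```

If every writer keeps the default partitioner-related settings, none of this extra work is needed, as noted next.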
The good news is that, if you happen to use the default partitioner-related settings across all applications, you do not need to worry about the partitioning strategy.

Why is data co-partitioning required? Because [KStream-KStream](#streams-developer-guide-dsl-joins-kstream-kstream), [KTable-KTable](#streams-developer-guide-dsl-joins-ktable-ktable), and [KStream-KTable](#streams-developer-guide-dsl-joins-kstream-ktable) joins are performed based on the keys of records, for example, `leftRecord.key == rightRecord.key`. It is required that the input streams/tables of a join are co-partitioned by key.

There are two exceptions where co-partitioning is not required:

- For [KStream-GlobalKTable](#streams-developer-guide-dsl-joins-kstream-globalktable) joins, co-partitioning is not required because *all* partitions of the `GlobalKTable`'s underlying changelog stream are made available to each `KafkaStreams` instance, so each instance has a full copy of the changelog stream. Further, a `KeyValueMapper` allows for non-key based joins from the `KStream` to the `GlobalKTable`.
- [KTable-KTable Foreign-Key](#streams-developer-guide-dsl-joins-ktable-ktable-foreign-key) joins do not require co-partitioning. Kafka Streams internally ensures co-partitioning for Foreign-Key joins.

Kafka Streams partly verifies the co-partitioning requirement
: During the partition assignment step, that is, at runtime, Kafka Streams verifies whether the number of partitions for both sides of a join are the same. If they are not, a `TopologyBuilderException` (runtime exception) is thrown. Note that Kafka Streams can't verify whether the partitioning strategy matches between the input streams/tables of a join. You must ensure that this is the case.

Ensuring data co-partitioning
: If the inputs of a join are not co-partitioned yet, you must ensure this manually. You can follow a procedure such as the one outlined below. To avoid bottlenecks, we recommend repartitioning the topic with fewer partitions to match the larger partition number. It's also possible to repartition the topic with more partitions to match the smaller partition number. For stream-table joins, we recommend repartitioning the KStream, because repartitioning a KTable may result in a second state store. For table-table joins, consider the size of the KTables and repartition the smaller KTable.

1. Identify the input KStream/KTable in the join whose underlying Kafka topic has the smaller number of partitions. Let's call this stream/table "SMALLER", and the other side of the join "LARGER". To find the number of partitions of a Kafka topic, you can use, for example, the CLI tool `bin/kafka-topics` with the `--describe` option.
2. Within your application, re-partition the data of "SMALLER". You must ensure that, when repartitioning the data with `repartition`, the same partitioner is used as for "LARGER".
   - If "SMALLER" is a KStream: `KStream#repartition(Repartitioned.numberOfPartitions(...))`.
   - If "SMALLER" is a KTable: `KTable#toStream()#repartition(Repartitioned.numberOfPartitions(...))#toTable()`.
3. Within your application, perform the join between "LARGER" and the new stream/table (see the sketch after this procedure).
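The following is a minimal sketch of steps 2 and 3, assuming hypothetical topic names, that "SMALLER" is a KStream, and that the "LARGER" topic has 12 partitions:

```java
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.Repartitioned;
import org.apache.kafka.streams.kstream.StreamJoined;

public class RepartitionBeforeJoin {
    public static void buildTopology(StreamsBuilder builder) {
        // "LARGER": its underlying topic has 12 partitions (hypothetical).
        KStream<String, String> larger =
            builder.stream("larger-topic", Consumed.with(Serdes.String(), Serdes.String()));

        // Step 2: repartition "SMALLER" so that it also has 12 partitions.
        KStream<String, String> smaller =
            builder.stream("smaller-topic", Consumed.with(Serdes.String(), Serdes.String()))
                   .repartition(Repartitioned.<String, String>numberOfPartitions(12)
                                             .withKeySerde(Serdes.String())
                                             .withValueSerde(Serdes.String()));

        // Step 3: both sides are now co-partitioned and can be joined.
        larger.join(smaller,
                    (leftValue, rightValue) -> leftValue + "," + rightValue,
                    JoinWindows.ofTimeDifferenceWithNoGrace(Duration.ofMinutes(5)),
                    StreamJoined.with(Serdes.String(), Serdes.String(), Serdes.String()))
              .to("joined-topic", Produced.with(Serdes.String(), Serdes.String()));
    }
}
```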
# Integration with Confluent Control Center Since the 3.2 release, [Confluent Control Center](https://docs.confluent.io/control-center/current/overview.html) displays the underlying [producer metrics](../kafka/monitoring.md#kafka-monitoring-metrics-producer) and [consumer metrics](../kafka/monitoring.md#kafka-monitoring-metrics-consumer) of a Kafka Streams application, which the Kafka Streams API uses internally whenever data needs to be read from or written to Kafka topics. These metrics can be used, for example, to monitor the so-called “consumer lag” of an application, which indicates whether an application at its [current capacity and available computing resources](developer-guide/running-app.md#streams-developer-guide-execution-scaling) is able to keep up with the incoming data volume. In Control Center, all of the running instances of a Kafka Streams application appear as a single consumer group. Restore consumers of an application are displayed separately. Behind the scenes, the Streams API uses a dedicated “restore” consumer for the purposes of fault tolerance and state management. This restore consumer manually assigns and manages the topic partitions it consumes from and is not a member of the application’s consumer group. As a result, the restore consumers are displayed separately from their application. # Kafka Streams for Confluent Platform Kafka Streams is a client library for building applications and microservices, where the input and output data are stored in an Apache Kafka® cluster. It combines the simplicity of writing and deploying standard Java and Scala applications on the client side with the benefits of Kafka’s server-side [cluster](../_glossary.md#term-Kafka-cluster) technology. If your Kafka Streams applications use Confluent Cloud resources, you can monitor them with Confluent Cloud Console. For more information, see [Monitor Kafka Streams Applications in Confluent Cloud](/cloud/current/kafka-streams/monitor-kafka-streams-apps.html). Free Video Course : [The free Kafka Streams 101 course](https://developer.confluent.io/learn-kafka/kafka-streams/get-started/) shows what Kafka Streams is and how to get started with it. Quick Start Guide : [Build your first Kafka Streams application](https://developer.confluent.io/tutorials/creating-first-apache-kafka-streams-application/confluent.html) shows how to run a Java application that uses the Kafka Streams library by demonstrating a simple end-to-end data pipeline powered by Kafka. Streams Podcasts : [Streaming Audio](https://developer.confluent.io/podcast/) is a podcast from Confluent, the team that built Kafka. Confluent developer advocates and guests unpack a variety of topics surrounding Kafka, [event stream](../_glossary.md#term-event-stream) processing, and real-time data. - [Capacity Planning Your Apache Kafka Cluster](https://developer.confluent.io/podcast/capacity-planning-your-apache-kafka-cluster/) - [Real-Time Stream Processing with Kafka Streams ft. Bill Bejeck](https://developer.confluent.io/podcast/real-time-stream-processing-with-kafka-streams-ft-bill-bejeck) - [Running Hundreds of Stream Processing Applications with Apache Kafka at Wise](https://developer.confluent.io/podcast/running-hundreds-of-stream-processing-applications-with-apache-kafka-at-wise) - [Apache Kafka Fundamentals: The Concept of Streams and Tables ft. Michael Noll](https://confluent.buzzsprout.com/186154/3559354-apache-kafka-fundamentals-the-concept-of-streams-and-tables-ft-michael-noll) - [Introducing JSON and Protobuf Support ft. 
David Araujo and Tushar Thole](https://confluent.buzzsprout.com/186154/3970760-introducing-json-and-protobuf-support-ft-david-araujo-and-tushar-thole) Recommended Reading : - Blog post: [Introducing Apache Kafka 4.1](https://www.confluent.io/blog/introducing-apache-kafka-4-1/) - Blog post: [Streams and Tables in Apache Kafka: A Primer](https://www.confluent.io/blog/kafka-streams-tables-part-1-event-streaming/) - Blog post: [Introducing Kafka Streams: Stream Processing Made Simple](https://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple/) - Course: [Kafka Streams 101](https://developer.confluent.io/learn-kafka/kafka-streams/get-started/) - Course: [Kafka Storage and Processing Fundamentals](https://developer.confluent.io/learn/kafka-storage-and-processing/) Screencasts : Watch [Apache Kafka 4.1: Enhanced Stability, New OAuth Support, Scalable Queues, Broker-Side Rebalancing](https://www.youtube.com/watch?v=cr9cDJGjm2E) on YouTube. Watch the [Intro to Streams API](https://www.youtube.com/watch?v=Z3JKCLG3VP4) on YouTube. ## Deploy Confluent Replicator Starting in the 6.1.0 release, Ansible Playbooks for Confluent Platform supports deployment of [Confluent Replicator](/platform/current/multi-dc-deployments/replicator/replicator-run.html#replicator-executable). Using Ansible, you can deploy Replicator with the following security mechanisms: * SASL/PLAIN * SASL/SCRAM * Kerberos * mTLS * Plaintext (which is no auth no encryption) The general deployment model is to deploy Replicator after both the source and destination clusters have been deployed. We recommend creating an inventory file specifically for the Replicator deployment, excluding other cluster deployment-related configuration. In this section, an example file, `replicator-hosts.yml`, is used. There are two clusters in this example, the source cluster and the destination cluster. Replicator has four client connections split across the two clusters: * Replicator configuration connection to the cluster which is used for storing configuration information in topics. See [Configure Replicator configuration connection](#ansible-replicator-client-connection). * Replicator monitoring connection which is used to produce metrics to the metrics cluster. This is often the same cluster as the cluster used to store configuration information. See [Configure monitoring connection](#ansible-replicator-monitoring-connection). * Replicator consumer connection which is used to consume data from the source cluster. See [Configure consumer connection](#ansible-replicator-consumer-connection). * Replicator producer connection which is used to produce data to the destination cluster. See [Configure producer connection](#ansible-replicator-producer-connection). The following sections list the configuration properties required in the Replicator inventory file. The examples use: * SASL/PLAIN with TLS on the source cluster * Kerberos with TLS on the destination cluster After configuring the replicator, you deploy the replicator with the following command. The command uses the example inventory file, `replicator-hosts.yml`. ```bash ansible-playbook -i replicator-hosts.yml playbooks/all.yml ``` #### NOTE JWT assertion retrieval from file flow is not recommended for production environments. Use [local client assertion flow](#ansible-oauth-client-local-client-assertion) instead. To configure JWT assertion retrieval from file flow: 1. Set the [OAuth client assertion variables](#ansible-oauth-client-local-client-assertion) 2. 
Enable JWT assertion retrieval from file flow using the following variables for Confluent Platform components. Set the variable to the directory where client assertion files exist.

   ```yaml
   oauth_superuser_oauth_client_assertion_file_base_path:
   kafka_broker_oauth_client_assertion_file_base_path:
   kafka_controller_oauth_client_assertion_file_base_path:
   schema_registry_oauth_client_assertion_file_base_path:
   kafka_connect_oauth_client_assertion_file_base_path:
   ksql_oauth_client_assertion_file_base_path:
   kafka_rest_oauth_client_assertion_file_base_path:
   kafka_connect_replicator_oauth_client_assertion_file_base_path:
   kafka_connect_replicator_producer_oauth_client_assertion_file_base_path:
   kafka_connect_replicator_erp_oauth_client_assertion_file_base_path:
   kafka_connect_replicator_consumer_erp_oauth_client_assertion_file_base_path:
   ```

3. Each component acting as a client to the server component must have an individual assertion file at the base file path you set above (`_oauth_client_assertion_file_base_path:`) to prevent token reuse issues. The following is an example ksqlDB directory structure for JWT assertion retrieval from file flow:

   ```bash
   ksql_oauth_client_assertion_file_base_path/kafka_client.jwt
   ksql_oauth_client_assertion_file_base_path/schema_registry_client.jwt
   ksql_oauth_client_assertion_file_base_path/mds_client.jwt
   ksql_oauth_client_assertion_file_base_path/ksql_client.jwt
   ```

For a full list of client assertion files, see the Confluent Ansible variables file at:

```html
https://github.com/confluentinc/cp-ansible/blob/8.1.0-post/roles/variables/vars/main.yml
```

#### Settings for RBAC with mTLS

Sample inventory files for RBAC configurations are provided in the `sample_inventories` directory under the Confluent Ansible home directory:

```html
https://github.com/confluentinc/cp-ansible/blob/8.1.0-post/docs/sample_inventories/
```

Add the required variables in your inventory file to enable and configure RBAC with mTLS. The following are the most commonly used variables to enable RBAC with mTLS:

* `rbac_enabled` Set to `true` for RBAC.
* `auth_mode` Authorization mode on all Confluent Platform components. Set to `mtls` for RBAC with mTLS only.
* `mds_ssl_client_authentication` The configuration of the MDS server to enforce SSL client authentication on MDS. The MDS server uses mTLS certificates for authentication and the principal extracted from the certificates for authorization. Options are:
  * `none`: The client does not need to send a certificate. If the client sends a certificate, it is ignored.
  * `requested`: Clients may or may not send certificates. If a client does not send a certificate, LDAP or OAuth credentials/tokens must provide the principal. This option is used during upgrades.
  * `required`: The client must send certificates to the server.

  Default: `none`
* `ssl_client_authentication` Kafka broker listeners configuration to enforce SSL client authentication. Options are:
  * `none`: The client does not need to send a certificate. If the client sends a certificate, it is ignored.
  * `requested`: Clients may or may not send certificates. If a client does not send a certificate, LDAP or OAuth credentials/tokens must provide the principal. This option is used during upgrades.
  * `required`: The client must send certificates to the server.

  Default: `none`
* `_ssl_client_authentication` The component-level setting for Schema Registry, Connect, REST Proxy to enforce SSL client authentication.
  Options are:
  * `none`: The client does not need to send a certificate. If the client sends a certificate, it is ignored.
  * `requested`: Clients may or may not send certificates. If a client does not send a certificate, LDAP or OAuth credentials/tokens must provide the principal. This option is used during upgrades.
  * `required`: The client must send certificates to the server.

  Default: The value of `ssl_client_authentication`
* `erp_ssl_client_authentication` Embedded REST Proxy server's configuration to enforce SSL client authentication on Embedded REST Proxy. Options are:
  * `none`: The client does not need to send a certificate. If the client sends a certificate, it is ignored.
  * `requested`: Clients may or may not send certificates. If a client does not send a certificate, LDAP or OAuth credentials/tokens must provide the principal. This option is used during upgrades.
  * `required`: The client must send certificates to the server.

  Default: `mds_ssl_client_authentication` value
* `impersonation_super_users` Required for `auth_mode: mtls`. A list of principals allowed to get an impersonation token for other users except the impersonation-protected users (`impersonation_protected_users`). For more information, see [Enable Token-based Authentication for RBAC](https://docs.confluent.io/platform/current/security/authorization/rbac/configure-mtls-rbac.html). For example:

  ```yaml
  impersonation_super_users:
    - 'kafka_broker'
    - 'kafka_rest'
    - 'schema_registry'
    - 'kafka_connect'
  ```

  Default: None
* `impersonation_protected_users` Required for RBAC with mTLS only. A list of principals who cannot be impersonated by REST Proxy. Super users should be added here to disallow them from being impersonated. For example:

  ```yaml
  impersonation_protected_users:
    - 'super_user'
  ```

* `principal_mapping_rules` The rules to map a distinguished name from the certificates to a short principal name. Default: `DEFAULT` For example:

  ```yaml
  principal_mapping_rules:
    - "RULE:.*CN=([a-zA-Z0-9.-_]*).*$/$1/"
    - "DEFAULT"
  ```

  For details about principal mapping rules, see [Principal Mapping Rules for SSL Listeners](https://docs.confluent.io/platform/current/kafka/configure-mds/mutual-tls-auth-rbac.html#principal-mapping-rules-for-ssl-listeners-extract-a-principal-from-a-certificate).
* `rbac_super_users` Additional list of super user principals for RBAC-enabled Confluent Platform clusters. When mTLS is enabled on Kafka brokers or KRaft controllers, their certificate principals should be passed in this list. You can add certificate principals and any other super users you want in this variable, and the list is applied to both brokers and controllers. If you define this variable, Confluent Ansible does not automatically add the KRaft controller certificate principals for the Kafka brokers or the Kafka broker certificate principals for the KRaft controllers; you must explicitly add those principals to the `rbac_super_users` list. Default: None For example:

  ```yaml
  all:
    rbac_super_users:
      - User:C=US,ST=Ca,L=PaloAlto,O=CONFLUENT,OU=TEST,CN=kafka_broker
      - User:C=US,ST=Ca,L=PaloAlto,O=CONFLUENT,OU=TEST,CN=kafka_controller
      - User:CN=kafka_user1
  ```

## Cluster registry

You can use Ansible Playbooks for Confluent Platform to name your clusters within the [cluster registries](/platform/current/security/cluster-registry.html) in Confluent Platform.
Cluster registry provides a way to centrally register and identify Kafka clusters in the metadata service (MDS) to simplify the RBAC role binding process and to enable centralized audit logging. Register the Kafka clusters in the MDS cluster registry using the following variables in the inventory file of the cluster. * To register a Kafka cluster in the MDS: ```none kafka_broker_cluster_name: ``` * To register a Schema Registry cluster in the MDS: ```none schema_registry_cluster_name: ``` * To register a Kafka Connect cluster in the MDS: ```none kafka_connect_cluster_name: ``` * To register a ksqlDB cluster in the MDS: ```none ksql_cluster_name: ``` ## Add Confluent license To add a Confluent license key for Confluent Platform components, use a custom property for each Confluent Platform component in the `hosts.yml` file as following: ```yaml all: vars: kafka_broker_custom_properties: confluent.license: kafka.rest.confluent.license.topic: "_confluent-command" schema_registry_custom_properties: confluent.license: kafka_connect_custom_properties: confluent.license: control_center_next_gen_custom_properties: confluent.license: kafka_rest_custom_properties: confluent.license: ksql_custom_properties: confluent.license: ``` Note that Confluent Server (Kafka broker) contains Kafka REST Server, and this component also requires a valid license configuration. Set the `kafka.rest.confluent.license.topic` property to the `_confluent-command` topic that stores the Confluent license. To add license to a connector, use the following config in the `hosts.yaml` file: ```yaml all: vars: kafka_connect_connectors: - name: sample-connector config: confluent.license: ``` The following example adds a license key for Kafka and Schema Registry. The example creates a variable for the license key and uses the variable in the custom properties. ```yaml vars: confluent_license: asdfkjkadslkfjaslkdf kafka_broker_custom_properties: confluent.license: "{{ confluent_license }}" kafka.rest.confluent.license.topic: "_confluent-command" schema_registry_custom_properties: confluent.license: "{{ confluent_license }}" ``` For additional license configuration parameters you can set with the above custom properties, see [License Configurations for Confluent Platform](https://docs.confluent.io/platform/current/installation/configuration/license-configs.html#license-configurations-for-cp). #### Produce Records 1. 
Run the producer, passing in arguments for:

   - the local file with configuration parameters to connect to your Kafka cluster
   - the topic name

```bash
lein producer $HOME/.confluent/java.config test1
```

You should see:

```text
…
Producing record: alice {"count":0}
Producing record: alice {"count":1}
Producing record: alice {"count":2}
Producing record: alice {"count":3}
Producing record: alice {"count":4}
Produced record to topic test1 partition [0] @ offset 0
Produced record to topic test1 partition [0] @ offset 1
Produced record to topic test1 partition [0] @ offset 2
Produced record to topic test1 partition [0] @ offset 3
Produced record to topic test1 partition [0] @ offset 4
Producing record: alice {"count":5}
Producing record: alice {"count":6}
Producing record: alice {"count":7}
Producing record: alice {"count":8}
Producing record: alice {"count":9}
Produced record to topic test1 partition [0] @ offset 5
Produced record to topic test1 partition [0] @ offset 6
Produced record to topic test1 partition [0] @ offset 7
Produced record to topic test1 partition [0] @ offset 8
Produced record to topic test1 partition [0] @ offset 9
10 messages were produced to topic test1!
```

2. View the [producer code](https://github.com/confluentinc/examples/tree/latest/clients/cloud/clojure/src/io/confluent/examples/clients/clj/producer.clj)

### Produce Records

1. Build the client examples:

```text
./gradlew clean build
```

2. Run the producer, passing in arguments for:

   - the local file with configuration parameters to connect to your Kafka cluster
   - the topic name

```text
./gradlew runApp -PmainClass="io.confluent.examples.clients.cloud.ProducerExample" \
  -PconfigPath="$HOME/.confluent/java.config" \
  -Ptopic="test1"
```

3. Verify the producer sent all the messages. You should see:

```text
...
Producing record: alice {"count":0}
Producing record: alice {"count":1}
Producing record: alice {"count":2}
Producing record: alice {"count":3}
Producing record: alice {"count":4}
Producing record: alice {"count":5}
Producing record: alice {"count":6}
Producing record: alice {"count":7}
Producing record: alice {"count":8}
Producing record: alice {"count":9}
Produced record to topic test1 partition [0] @ offset 0
Produced record to topic test1 partition [0] @ offset 1
Produced record to topic test1 partition [0] @ offset 2
Produced record to topic test1 partition [0] @ offset 3
Produced record to topic test1 partition [0] @ offset 4
Produced record to topic test1 partition [0] @ offset 5
Produced record to topic test1 partition [0] @ offset 6
Produced record to topic test1 partition [0] @ offset 7
Produced record to topic test1 partition [0] @ offset 8
Produced record to topic test1 partition [0] @ offset 9
10 messages were produced to topic test1
...
```

4. View the [producer code](https://github.com/confluentinc/examples/tree/latest/clients/cloud/groovy/src/main/groovy/io/confluent/examples/clients/cloud/ProducerExample.groovy).

### Kafka Streams

1. Run the Kafka Streams application, passing in arguments for:

   - the local file with configuration parameters to connect to your Kafka cluster
   - the same topic name you used earlier

```bash
./gradlew runApp -PmainClass="io.confluent.examples.clients.cloud.StreamsExample" \
  -PconfigPath="$HOME/.confluent/java.config" \
  -Ptopic="test1"
```

2. Verify the consumer received all the messages. You should see:

```text
...
[Consumed record]: alice, 0 [Consumed record]: alice, 1 [Consumed record]: alice, 2 [Consumed record]: alice, 3 [Consumed record]: alice, 4 [Consumed record]: alice, 5 [Consumed record]: alice, 6 [Consumed record]: alice, 7 [Consumed record]: alice, 8 [Consumed record]: alice, 9 ... [Running count]: alice, 0 [Running count]: alice, 1 [Running count]: alice, 3 [Running count]: alice, 6 [Running count]: alice, 10 [Running count]: alice, 15 [Running count]: alice, 21 [Running count]: alice, 28 [Running count]: alice, 36 [Running count]: alice, 45 ... ``` 3. When you are done, press `CTRL-C`. 4. View the [Kafka Streams code](https://github.com/confluentinc/examples/tree/latest/clients/cloud/groovy/src/main/groovy/io/confluent/examples/clients/cloud/StreamsExample.groovy). ### Consume Avro Records 1. Consume from topic `test2` by doing the following: - Referencing a properties file ```bash docker-compose exec connect bash -c 'kafka-avro-console-consumer --topic test2 --bootstrap-server $CONNECT_BOOTSTRAP_SERVERS --consumer.config /tmp/ak-tools-ccloud.delta --property basic.auth.credentials.source=$CONNECT_VALUE_CONVERTER_BASIC_AUTH_CREDENTIALS_SOURCE --property schema.registry.basic.auth.user.info=$CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_BASIC_AUTH_USER_INFO --property schema.registry.url=$CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL --max-messages 5' ``` - Referencing individual properties ```bash docker-compose exec connect bash -c 'kafka-avro-console-consumer --topic test2 --bootstrap-server $CONNECT_BOOTSTRAP_SERVERS --consumer-property sasl.mechanism=PLAIN --consumer-property security.protocol=SASL_SSL --consumer-property sasl.jaas.config="$SASL_JAAS_CONFIG_PROPERTY_FORMAT" --property basic.auth.credentials.source=$CONNECT_VALUE_CONVERTER_BASIC_AUTH_CREDENTIALS_SOURCE --property schema.registry.basic.auth.user.info=$CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_BASIC_AUTH_USER_INFO --property schema.registry.url=$CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL --max-messages 5' ``` You should see the following messages: ```text {"ordertime":{"long":1494153923330},"orderid":{"int":25},"itemid":{"string":"Item_441"},"orderunits":{"double":0.9910185646928878},"address":{"io.confluent.ksql.avro_schemas.KsqlDataSourceSchema_address":{"city":{"string":"City_61"},"state":{"string":"State_41"},"zipcode":{"long":60468}}}} ``` 2. When you are done, press `CTRL-C`. 3. View the [consumer Avro code](https://github.com/confluentinc/examples/tree/latest/clients/cloud/kafka-connect-datagen/start-docker-avro.sh). ### Kafka Streams 1. Run the Kafka Streams application, passing in arguments for: - the local file with configuration parameters to connect to your Kafka cluster - the same topic name you used earlier ```bash ./gradlew runApp -PmainClass="io.confluent.examples.clients.cloud.StreamsExample" \ -PconfigPath="$HOME/.confluent/java.config" \ -Ptopic="test1" ``` 2. Verify the consumer received all the messages. You should see: ```text ... [Consumed record]: alice, 0 [Consumed record]: alice, 1 [Consumed record]: alice, 2 [Consumed record]: alice, 3 [Consumed record]: alice, 4 [Consumed record]: alice, 5 [Consumed record]: alice, 6 [Consumed record]: alice, 7 [Consumed record]: alice, 8 [Consumed record]: alice, 9 ... [Running count]: alice, 0 [Running count]: alice, 1 [Running count]: alice, 3 [Running count]: alice, 6 [Running count]: alice, 10 [Running count]: alice, 15 [Running count]: alice, 21 [Running count]: alice, 28 [Running count]: alice, 36 [Running count]: alice, 45 ... ``` 3. 
When you are done, press `CTRL-C`. 4. View the [Kafka Streams code](https://github.com/confluentinc/examples/tree/latest/clients/cloud/kotlin/src/main/kotlin/io/confluent/examples/clients/cloud/StreamsExample.kt). ## Chained Transformation You can use SMTs together to perform a more complex transformation. The following examples show how the `ValueToKey` and `ExtractField` SMTs are chained together to set the key for data coming from a [JDBC Connector](../../../../kafka-connect-jdbc/current/index.html). During the transform, `ValueToKey` copies the message `c1` field into the message key and then `ExtractField` extracts just the integer portion of that field. ```json "transforms": "createKey,extractInt", "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey", "transforms.createKey.fields": "c1", "transforms.extractInt.type": "org.apache.kafka.connect.transforms.ExtractField$Key", "transforms.extractInt.field": "c1" ``` The following shows what the message looked like before the transform. ```none "./bin/kafka-avro-console-consumer \ --bootstrap-server localhost:9092 \ --property schema.registry.url=http://localhost:8081 \ --property print.key=true \ --from-beginning \ --topic mysql-foobar null {"c1":{"int":1},"c2":{"string":"foo"},"create_ts":1501796305000,"update_ts":1501796305000} null {"c1":{"int":2},"c2":{"string":"foo"},"create_ts":1501796665000,"update_ts":1501796665000} ``` After the connector configuration is applied, new rows are inserted (piped) into the MySQL table: ```none "echo "insert into foobar (c1,c2) values (100,'bar');"|mysql --user=username --password=pw demo ``` The following is displayed in the Avro console consumer. Note that the key (the first value on the line) matches the value of c1, which was defined with the transforms. ```none 100 {"c1":{"int":100},"c2":{"string":"bar"},"create_ts":1501799535000,"update_ts":1501799535000} ``` ## Step 3: Convert the serialization format to JSON 1. Run the following statement to confirm that the current format of this table is Avro Schema Registry. ```sql SHOW CREATE TABLE gaming_player_activity_source; ``` Your output should resemble: ```text +-------------------------------------------------------------+ | SHOW CREATE TABLE | +-------------------------------------------------------------+ | CREATE TABLE `env`.`clus`.`gaming_player_activity_source` ( | | `key` VARBINARY(2147483647), | | `player_id` INT NOT NULL, | | `game_room_id` INT NOT NULL, | | `points` INT NOT NULL, | | `coordinates` VARCHAR(2147483647) NOT NULL, | | ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS | | WITH ( | | 'changelog.mode' = 'append', | | 'connector' = 'confluent', | | 'kafka.cleanup-policy' = 'delete', | | 'kafka.max-message-size' = '2097164 bytes', | | 'kafka.partitions' = '6', | | 'kafka.retention.size' = '0 bytes', | | 'kafka.retention.time' = '604800000 ms', | | 'key.format' = 'raw', | | 'scan.bounded.mode' = 'unbounded', | | 'scan.startup.mode' = 'earliest-offset', | | 'value.format' = 'avro-registry' | | ) | | | +-------------------------------------------------------------+ ``` 2. Run the following statement to create a second table that has the same schema but is configured with the value format set to JSON with Schema Registry. The key format is unchanged. 
```sql CREATE TABLE gaming_player_activity_source_json ( `key` VARBINARY(2147483647), `player_id` INT NOT NULL, `game_room_id` INT NOT NULL, `points` INT NOT NULL, `coordinates` VARCHAR(2147483647) NOT NULL ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS WITH ( 'value.format' = 'json-registry', 'key.format' = 'raw' ); ``` This statement creates a corresponding Kafka topic and Schema Registry subject named `gaming_player_activity_source_json-value` for the value. 3. Run the following SQL to create a long-running statement that continuously transforms `gaming_player_activity_source` records into `gaming_player_activity_source_json` records. ```sql INSERT INTO gaming_player_activity_source_json SELECT * FROM gaming_player_activity_source; ``` 4. Run the following statement to confirm that records are continuously appended to the target table: ```sql SELECT * FROM gaming_player_activity_source_json; ``` Your output should resemble: ```none key player_id game_room_id points coordinates x'31303834' 1084 3583 211 [51,93] x'31303037' 1007 2268 55 [98,72] x'31303230' 1020 1625 431 [01,08] x'31303934' 1094 4760 43 [80,71] x'31303539' 1059 2822 390 [33,74] ... ``` 5. Run the following statement to confirm that the format of the `gaming_player_activity_source_json` table is JSON. ```sql SHOW CREATE TABLE gaming_player_activity_source_json; ``` Your output should resemble: ```text +--------------------------------------------------------------------------------------+ | SHOW CREATE TABLE | +--------------------------------------------------------------------------------------+ | CREATE TABLE `jim-flink-test-env`.`cluster_0`.`gaming_player_activity_source_json` ( | | `key` VARBINARY(2147483647), | | `player_id` INT NOT NULL, | | `game_room_id` INT NOT NULL, | | `points` INT NOT NULL, | | `coordinates` VARCHAR(2147483647) NOT NULL | | ) DISTRIBUTED BY HASH(`key`) INTO 6 BUCKETS | | WITH ( | | 'changelog.mode' = 'append', | | 'connector' = 'confluent', | | 'kafka.cleanup-policy' = 'delete', | | 'kafka.max-message-size' = '2097164 bytes', | | 'kafka.partitions' = '6', | | 'kafka.retention.size' = '0 bytes', | | 'kafka.retention.time' = '604800000 ms', | | 'key.format' = 'raw', | | 'scan.bounded.mode' = 'unbounded', | | 'scan.startup.mode' = 'earliest-offset', | | 'value.format' = 'json-registry' | | ) | | | +--------------------------------------------------------------------------------------+ ``` ## Step 2: Apply the Transform Topic action In the previous step, you created a Flink table and populated it with a few rows. In this step, you apply the Transform Topic action to create a transformed output table. 1. Navigate to the [Environments](https://confluent.cloud/environments) page, and in the navigation menu, click **Data portal**. 2. In the **Data portal** page, click the dropdown menu and select the environment for your workspace. 3. In the **Recently created** section, find your **users** topic and click it to open the details pane. 4. In the details pane, click **Actions**, and in the Actions list, click **Transform topic** to open the dialog. 5. In the **Action details** section, set up the transformation. - **user_id** field: select the **Key field** checkbox. - **registertime** field: enter *registration_time*. - **Partition count** property: enter *3*. - **Serialization format** property: select **JSON Schema**. By default, the name of the transformed topic is `users_transform`, and you can change this as desired. 6. 
In the **Runtime configuration** section, configure how the transformation statement will run.

   - (Optional) Select the Flink compute pool to run the transformation statement. The current compute pool is selected as the default.
   - (Optional) Select **Run with a service account** for production jobs. The service account you select must have the EnvironmentAdmin role to create topics, schemas, and run Flink statements.
   - (Optional) Select **Show SQL** to view the Flink statement that does the transformation work. Your Flink SQL should resemble:

```sql
CREATE TABLE `your-env`.`your-cluster`.`users_transform`
DISTRIBUTED BY HASH (`user_id`) INTO 3 BUCKETS
WITH (
  'value.format' = 'json-registry',
  'key.format' = 'json-registry'
)
AS SELECT
  `user_id`,
  `registertime` as `registration_time`,
  `gender`,
  `regionid`
FROM `your-env`.`your-cluster`.`users`;
```

7. Click **Confirm and run** to run the transformation statement. A **Summary** page displays the result of the job submission, showing the statement name and other details.

## Step 3: Inspect the transformed topic

1. In the **Summary** page, click the **Output topic** link for the **users_transform** topic, and in the topic's details pane, click **Query** to open a Flink workspace.
2. Run the following statement to view the rows in the **users_transform** table. Note the renamed **registration_time** column.

```sql
SELECT * FROM `users_transform`;
```

   Click **Stop** to end the statement.

3. Run the following command to confirm that the `user_id` field in the transformed table is a key field.

```sql
DESCRIBE `users_transform`;
```

   Your output should resemble:

```text
+-------------------+-----------+----------+------------+
| Column Name       | Data Type | Nullable | Extras     |
+-------------------+-----------+----------+------------+
| user_id           | STRING    | NULL     | BUCKET KEY |
| registration_time | BIGINT    | NULL     |            |
| gender            | STRING    | NULL     |            |
| regionid          | STRING    | NULL     |            |
+-------------------+-----------+----------+------------+
```

4. Run the following command to confirm the serialization format and partition count on the transformed topic.

```sql
SHOW CREATE TABLE `users_transform`;
```

   Your output should resemble:

```text
CREATE TABLE `your-env`.`your-cluster`.`users_transform` (
  `user_id` VARCHAR(2147483647),
  `registration_time` BIGINT,
  `gender` VARCHAR(2147483647),
  `regionid` VARCHAR(2147483647)
) DISTRIBUTED BY HASH(`user_id`) INTO 3 BUCKETS
WITH (
  'changelog.mode' = 'append',
  'connector' = 'confluent',
  'kafka.cleanup-policy' = 'delete',
  'kafka.max-message-size' = '2097164 bytes',
  'kafka.retention.size' = '0 bytes',
  'kafka.retention.time' = '7 d',
  'key.format' = 'json-registry',
  'scan.bounded.mode' = 'unbounded',
  'scan.startup.mode' = 'earliest-offset',
  'value.format' = 'json-registry'
)
```

#### Step 3: Produce and consume with Confluent CLI

The following is an example CLI command to produce to `test-topic`:

```text
confluent kafka topic produce test-topic \
  --protocol SASL_SSL \
  --sasl-mechanism OAUTHBEARER \
  --bootstrap ":19091,:19092" \
  --ca-location scripts/security/snakeoil-ca-1.crt
```

- Specify `--protocol SASL_SSL` to use SASL_SSL/OAUTHBEARER authentication.
- Specify `--sasl-mechanism OAUTHBEARER` to enable the OAUTHBEARER mechanism.
- `--bootstrap` is the list of hosts that the producer/consumer talks to. This list should be the same as what you configured in Step 1. Hosts should be separated by commas.
- `--ca-location` is the path to the CA certificate verifying the broker's key, and it's required for SSL verification.
For more details about setting up this flag, see [this document](https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#ssl). #### IMPORTANT The principal specified above is the Kafka user, the same as specified in [Kafka Broker](sasl.md#controlcenter-sasl-broker). For each Kafka topic that Confluent Control Center creates, ACLs are created to grant the specified principal the following privileges: - CREATE - WRITE - DESCRIBE - DESCRIBE_CONFIGS - READ The following ACLs are created to grant the specified principal privileges for the consumer group related to the Confluent Control Center Streams application: - READ ACLs granting the following privileges are also created for the cluster: - DESCRIBE - DESCRIBE_CONFIGS You must export a Control Center JAAS configuration before starting Control Center. ```bash export CONTROL_CENTER_OPTS='-Djava.security.auth.login.config=' control-center-start config/control-center.properties ``` ## Migrate to JavaScript Client from KafkaJS Below is a simple produce example for users migrating from KafkaJS. ```javascript // require('kafkajs') is replaced with require('@confluentinc/kafka-javascript').KafkaJS. const { Kafka } = require("@confluentinc/kafka-javascript").KafkaJS; async function producerStart() { const kafka = new Kafka({ kafkaJS: { brokers: [''], ssl: true, sasl: { mechanism: 'plain', username: '', password: '', }, } }); const producer = kafka.producer(); await producer.connect(); console.log("Connected successfully"); const res = [] for (let i = 0; i < 50; i++) { res.push(producer.send({ topic: 'test-topic', messages: [ { value: 'v222', partition: 0 }, { value: 'v11', partition: 0, key: 'x' }, ] })); } await Promise.all(res); await producer.disconnect(); console.log("Disconnected successfully"); } producerStart(); ``` To migrate to the JavaScript Client from the KafkaJS: 1. Change the import statement, and add a `kafkaJS` block around your configs. From: ```javascript const { Kafka } = require('kafkajs'); const kafka = new Kafka({ brokers: ['kafka1:9092', 'kafka2:9092'], /* ... */ }); const producer = kafka.producer({ /* ... */, }); ``` To: ```javascript const { Kafka } = require('@confluentinc/kafka-javascript').KafkaJS; const kafka = new Kafka({ kafkaJS: { brokers: ['kafka1:9092', 'kafka2:9092'], /* ... */ } }); const producer = kafka.producer({ kafkaJS: { /* ... */, } }); ``` 2. Try running your program. In case a migration is needed, an informative error will be thrown. If you’re using Typescript, some of these changes will be caught at compile time. 3. The most common expected changes to the code are: - For the **producer**: `acks`, `compression` and `timeout` are not set per `send()`. They must be configured in the top-level configuration while creating the producer. - For the **consumer**: - `fromBeginning` is not set per `subscribe()`. It must be configured in the top-level configuration while creating the consumer. - `autoCommit` and `autoCommitInterval` are not set per `run()`. They must be configured in the top-level configuration while creating the consumer. - `autoCommitThreshold` is not supported. - `eachBatch`’s batch size never exceeds 1. - For errors: Check the `error.code` rather than the error `name` or `type`. 4. A more exhaustive list of semantic and configuration differences is [presented below](#common). 
An example migration: ```diff -const { Kafka } = require('kafkajs'); +const { Kafka } = require('@confluentinc/kafka-javascript').KafkaJS; const kafka = new Kafka({ + kafkaJS: { clientId: 'my-app', brokers: ['kafka1:9092', 'kafka2:9092'] + } }) const producerRun = async () => { - const producer = kafka.producer(); + const producer = kafka.producer({ kafkaJS: { acks: 1 } }); await producer.connect(); await producer.send({ topic: 'test-topic', - acks: 1, messages: [ { value: 'Hello confluent-kafka-javascript user!' }, ], }); }; const consumerRun = async () => { // Consuming - const consumer = kafka.consumer({ groupId: 'test-group' }); + const consumer = kafka.consumer({ kafkaJS: { groupId: 'test-group', fromBeginning: true } }); await consumer.connect(); - await consumer.subscribe({ topic: 'test-topic', fromBeginning: true }); + await consumer.subscribe({ topic: 'test-topic' }); await consumer.run({ eachMessage: async ({ topic, partition, message }) => { console.log({ partition, offset: message.offset, value: message.value.toString(), }) }, }); }; producerRun().then(consumerRun).catch(console.error); ``` ### Consumer configuration changes ```javascript const consumer = kafka.consumer({ kafkaJS: { /* producer-specific configuration changes. */ } }); ``` Each allowed config property is discussed below. If there is any change in semantics or the default values, the property and the change is **highlighted in bold**. | Property | Default Value | Comment | |--------------------------|---------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | groupId | none | A mandatory string denoting consumer group name that this consumer is a part of. | | **partitionAssigners** | [PartitionAssigners.roundRobin] | Support for range, roundRobin, and cooperativeSticky assignors is provided. Custom assignors are not supported. | | **partitionAssignors** | [PartitionAssignors.roundRobin] | Alias for partitionAssigners | | **rebalanceTimeout** | **300000** | The maximum allowed time for each member to join the group once a rebalance has begun. Note, that setting this value also changes the max poll interval. Message processing in eachMessage/eachBatch must not take more than this time. | | heartbeatInterval | 3000 | The expected time in milliseconds between heartbeats to the consumer coordinator. | | metadataMaxAge | 5 minutes | Time in milliseconds after which to refresh metadata for known topics | | allowAutoTopicCreation | `true` | Determines if a topic should be created if it doesn’t exist while consuming. | | **maxBytesPerPartition** | 1048576 (1MB) | determines how many bytes can be fetched in one request from a single partition. There is a change in semantics, this size grows dynamically if a single message larger than this is encountered, and the client does not get stuck. | | minBytes | 1 | Minimum number of bytes the broker responds with (or wait until maxWaitTimeInMs) | | maxBytes | 10485760 (10MB) | Maximum number of bytes the broker responds with. | | **retry** | object | Identical to retry in the common configuration. This takes precedence over the common config retry. | | readUncommitted | false | If `true`, consumer will read transactional messages which have not been committed. 
| | **maxInFlightRequests** | null | Maximum number of in-flight requests **per broker connection.** If not set, it is practically unbounded (same as KafkaJS). | | rackId | null | Can be set to an arbitrary string which will be used for fetch-from-follower if set up on the cluster. | | **fromBeginning** | false | If there is initial offset in offset store or the desired offset is out of range, and this is true, we consume the earliest possible offset. **This is set on a per-consumer level, not on a per subscribe level.** | | **autoCommit** | `true` | Whether to periodically auto-commit offsets to the broker while consuming. **This is set on a per-consumer level, not on a per run level.** | | **autoCommitInterval** | 5000 | Offsets are committed periodically at this interval, if autoCommit is true. **This is set on a per-consumer level, not on a per run level. The default value is changed to 5 seconds.** | | outer config | {} | The configuration outside the kafkaJS block can contain any of the keys present in the librdkafka CONFIGURATION table. | ### Consume A `Consumer` receives messages from Kafka. The following example illustrates how to create an instance and start consuming messages: ```js const consumer = new Kafka().consumer({ 'bootstrap.servers': '', 'group.id': 'test-group', // Mandatory property for a consumer - the consumer group id. }); await consumer.connect(); await consumer.subscribe({ topics: ["test-topic"] }); consumer.run({ eachMessage: async ({ topic, partition, message }) => { console.log({ topic, partition, headers: message.headers, offset: message.offset, key: message.key?.toString(), value: message.value.toString(), }); } }); // Whenever we're done consuming, maybe after user input or a signal: await consumer.disconnect(); ``` The consumer must be connected before calling `run`. The `run` method starts the consumer loop, which takes care of polling the cluster, and will call the `eachMessage` callback for every message you get from the cluster. The message may contain several other fields besides the value. For example: ```js { // Key of the message - may not be set. key: Buffer.from('key'), // Value of the message - will be set. value: Buffer.from('value'), // The timestamp set by the producer or the broker in milliseconds since the Unix epoch. timestamp: '1734008723000', // The current epoch of the leader for this partition. leaderEpoch: 2, // Size of the message in bytes. size: 6, // Offset of the message on the partition. offset: '42', // Headers that were sent along with the message. headers: { 'header-key-0': ['header-value-0', 'header-value-1'], 'header-key-1': Buffer.from('header-value'), } } ``` A message is considered to be processed successfully when `eachMessage` for that message runs to completion without throwing an error. In case an error is thrown, the message is marked unprocessed, and `eachMessage` will be called with the same message again. #### Subscribe and rebalance To consume messages, the consumer must be a part of a [consumer group](https://github.com/confluentinc/librdkafka/blob/master/INTRODUCTION.md#consumer-groups), and it must subscribe to one or more topics. The group is specified with the `group.id` property, and the `subscribe` method should be called after connecting to the cluster. The consumer does not actually join the consumer group until `run` is called. Joining a consumer group causes a rebalance within all the members of that consumer group, where each consumer is assigned a set of partitions to consume from. 
Rebalances may also be caused when a consumer leaves a group by disconnecting, or when new partitions are added to a topic. It is possible to add a callback to track rebalances: ```js const rebalance_cb = (err, assignment) => { switch (err.code) { case ErrorCodes.ERR__ASSIGN_PARTITIONS: console.log(`Assigned partitions ${JSON.stringify(assignment)}`); break; case ErrorCodes.ERR__REVOKE_PARTITIONS: console.log(`Revoked partitions ${JSON.stringify(assignment)}`); break; default: console.error(err); } }; const consumer = new Kafka().consumer({ 'bootstrap.servers': '', 'group.id': 'test-group', 'rebalance_cb': rebalance_cb, }); ``` It’s also possible to modify the assignment of partitions, or pause consumption of newly assigned partitions just after a rebalance. ```js const rebalance_cb = (err, assignment, assignmentFns) => { switch (err.code) { case ErrorCodes.ERR__ASSIGN_PARTITIONS: // Change the assignment as needed - this mostly boils down to changing the offset to start consumption from, though // you are free to do anything. if (assignment.length > 0) { assignment[0].offset = 34; } assignmentFns.assign(assignment); // Can pause consumption of new partitions just after a rebalance. break; case ErrorCodes.ERR__REVOKE_PARTITIONS: break; default: console.error(err); } }; ``` Subscriptions can be changed anytime, and the running consumer triggers a rebalance whenever that happens. The current assignment of partitions to the consumer can be checked with the `assign` method. ### Metadata To retrieve metadata from Kafka, use the `getMetadata` method with `Kafka.Producer` or `Kafka.KafkaConsumer`. When fetching metadata for a specific topic, if a topic reference does not exist, one is created using the default configuration. See the documentation on `Client.getMetadata` if you want to set configuration parameters, for example, `acks` on a topic to produce messages to. The following example illustrates how to use the `getMetadata` method. ```js const opts = { topic: 'librdtesting-01', timeout: 10000 }; producer.getMetadata(opts, (err, metadata) => { if (err) { console.error('Error getting metadata'); console.error(err); } else { console.log('Got metadata'); console.log(metadata); } }); ``` Metadata on any connection is returned in the following data structure: ```js { orig_broker_id: 1, orig_broker_name: "broker_name", brokers: [ { id: 1, host: 'localhost', port: 40 } ], topics: [ { name: 'awesome-topic', partitions: [ { id: 1, leader: 20, replicas: [1, 2], isrs: [1, 2] } ] } ] } ``` ### Property-based example Create a configuration file for the connector. This file is included with the connector in `etc/kafka-connect-appdynamics-metrics/appdynamics-metrics-sink-connector.properties`. This configuration is typically used for [standalone workers](/platform/current/connect/concepts.html#standalone-workers). 
```properties name=appdynamics-metrics-sink topics=appdynamics-metrics-topic connector.class=io.confluent.connect.appdynamics.metrics.AppDynamicsMetricsSinkConnector tasks.max=1 machine.agent.host= machine.agent.port= behavior.on.error=fail confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 reporter.bootstrap.servers=localhost:9092 reporter.result.topic.replication.factor=1 reporter.error.topic.replication.factor=1 key.converter=io.confluent.connect.avro.AvroConverter key.converter.schema.registry.url=http://localhost:8081 value.converter=io.confluent.connect.avro.AvroConverter value.converter.schema.registry.url=http://localhost:8081 ``` ### Load the Amazon Redshift Sink connector 1. Create a properties file for your Redshift Sink connector. ```text name=redshift-sink confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 connector.class=io.confluent.connect.aws.redshift.RedshiftSinkConnector tasks.max=1 topics=orders aws.redshift.domain=< Required Configuration > aws.redshift.port=< Required Configuration > aws.redshift.database=< Required Configuration > aws.redshift.user=< Required Configuration > aws.redshift.password=< Required Configuration > pk.mode=kafka auto.create=true ``` Fill in the configuration parameters of your cluster as they appear in your [Cluster Details](https://console.aws.amazon.com/redshift/home#cluster-list:). 2. Load the `redshift-sink` connector: ```bash confluent local load redshift-sink --config redshift-sink.properties ``` Your output should resemble the following: ```text { "name": "redshift-sink", "config": { "confluent.topic.bootstrap.servers": "localhost:9092", "connector.class": "io.confluent.connect.aws.redshift.RedshiftSinkConnector", "tasks.max": "1", "topics": "orders", "aws.redshift.domain": "cluster-name.cluster-id.region.redshift.amazonaws.com", "aws.redshift.port": "5439", "aws.redshift.database": "dev", "aws.redshift.user": "awsuser", "aws.redshift.password": "your-password", "auto.create": "true", "pk.mode": "kafka", "name": "redshift-sink" }, "tasks": [], "type": "sink" } ``` Note that non-CLI users can load the Redshift Sink connector by using the following command: ```text ${CONFLUENT_HOME}/bin/connect-standalone \ ${CONFLUENT_HOME}/etc/schema-registry/connect-avro-standalone.properties \ redshift-sink.properties ``` ## Quick Start The following quick start uses the `AzureBlobStorageSinkConnector` to write an Avro file from the Kafka topic named `blob_topic` to Azure Blob Storage. Also, the `AzureBlobStorageSinkConnector` should be completely stopped before starting the `AzureBlobStorageSourceConnector` to avoid creating source/sink cycle. Then, the `AzureBlobStorageSourceConnector` loads that Avro file from Azure Blob Storage to the Kafka topic named `copy_of_blob_topic`. For an example of how to get Kafka Connect connected to [Confluent Cloud](/cloud/current/index.html), see [Connect Self-Managed Kafka Connect to Confluent Cloud](/cloud/current/cp-component/connect-cloud-config.html#distributed-cluster). 1. Follow the instructions from Connect Azure Blob Storage Sink connector to set up the data to use below. 2. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). 
```bash
# run from your Confluent Platform installation directory
confluent connect plugin install confluentinc/kafka-connect-azure-blob-storage-source:latest
```

### Using a bundled schema specification

There are a few quick start schema specifications bundled with the Datagen connector. These schemas are listed in [this directory](https://github.com/confluentinc/kafka-connect-datagen/tree/master/src/main/resources). To use one of these bundled schemas, refer to [this mapping](https://github.com/confluentinc/kafka-connect-datagen/blob/master/src/main/java/io/confluent/kafka/connect/datagen/DatagenTask.java#L66-L73). In the configuration file, set the `quickstart` property to the associated name, as shown in the following example:

```text
"quickstart": "users",
```

### Install the Connector

Refer to the [Debezium tutorial](https://github.com/debezium/debezium-examples/tree/master/tutorial#using-mysql) if you want to use Docker images for setting up Kafka, ZooKeeper, and Kafka Connect. Note that as of Confluent Platform 7.5, ZooKeeper is deprecated for new deployments. Confluent recommends KRaft mode for new deployments. For the following tutorial, you need a local setup of Confluent Platform.

Navigate to your Confluent Platform installation directory and run the following command to install the connector:

```bash
confluent connect plugin install debezium/debezium-connector-mysql:0.9.4
```

Adding a new connector plugin requires restarting Connect. Use the Confluent CLI to restart Connect.

```bash
confluent local services connect stop && confluent local services connect start
Using CONFLUENT_CURRENT: /Users/username/Sandbox/confluent-snapshots/var/confluent.NuZHxXfq
Starting Zookeeper
Zookeeper is [UP]
Starting Kafka
Kafka is [UP]
Starting Schema Registry
Schema Registry is [UP]
Starting Kafka REST
Kafka REST is [UP]
Starting Connect
Connect is [UP]
```

Check if the MySQL plugin has been installed correctly and picked up by the plugin loader:

```bash
curl -sS localhost:8083/connector-plugins | jq .[].class | grep mysql
"io.debezium.connector.mysql.MySqlConnector"
```

## Required properties

`name`
: Unique name for the connector. Trying to register again with the same name will fail.

`connector.class`
: The name of the Java class for the connector. You must use a value of `io.debezium.connector.postgresql.PostgresConnector` for the PostgreSQL connector.

`tasks.max`
: The maximum number of tasks that should be created for this connector. The connector always uses a single task, so it does not use this value; the default is always acceptable.
  * Type: int
  * Default: 1

`plugin.name`
: The name of the Postgres logical decoding plugin installed on the server. When processed transactions are very large, it is possible that the JSON batch event with all changes in the transaction will not fit into the hard-coded memory buffer of 1 GB. In such cases, it is possible to switch to streaming mode, where every change in a transaction is sent as a separate message from PostgreSQL into Debezium. You can configure the streaming mode by setting `plugin.name` to `pgoutput`. For more details, see [PostgreSQL 10+ logical decoding support (pgoutput)](https://debezium.io/documentation/reference/stable/connectors/postgresql.html#postgresql-pgoutput) in the Debezium documentation.
  * Type: string
  * Importance: medium
  * Default: `decoderbufs`
  * Valid values: `decoderbufs`, `wal2json`, and `wal2json_rds`. There are two additional options supported since 0.8.0.Beta1: `wal2json_streaming` and `wal2json_rds_streaming`.
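Taken together, the required properties above, plus the connection properties described later in this reference, are what you submit when registering the connector. The following is a hedged sketch (not part of the quick start) that registers a hypothetical `pg-inventory-connector` against a local Connect worker and opts into the `pgoutput` plugin; the connector name, database coordinates, and topic prefix are placeholders to adjust for your environment:

```bash
# Illustrative only: minimal registration of the Debezium PostgreSQL connector
# via the Connect REST API on a local worker.
curl -s -X POST -H 'Content-Type: application/json' http://localhost:8083/connectors -d '{
  "name": "pg-inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "tasks.max": "1",
    "plugin.name": "pgoutput",
    "database.hostname": "localhost",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "postgres-pw",
    "database.dbname": "inventory",
    "topic.prefix": "dbserver1"
  }
}' | jq
```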
`slot.name`
: The name of the Postgres logical decoding slot created for streaming changes from a plugin and database instance. Values must conform to Postgres replication slot naming rules, which state: "Each replication slot has a name, which can contain lower-case letters, numbers, and the underscore character."
  * Type: string
  * Importance: medium
  * Default: `debezium`

`slot.drop.on.stop`
: Indicates whether to drop the logical replication slot when the connector stops in an orderly fashion. Should only be set to `true` in testing or development environments. Dropping the slot allows WAL segments to be discarded by the database. If set to `true`, the connector may not be able to resume from the WAL position where it left off.
  * Type: string
  * Importance: low
  * Default: `false`

`publication.name`
: The name of the PostgreSQL publication created for streaming changes when using `pgoutput`. This publication is created at start-up if it does not already exist, and it includes all tables. If the publication already exists, either for all tables or configured with a subset of tables, the connector uses the publication as it is defined.
  * Type: string
  * Importance: low
  * Default: `dbz_publication`

`database.hostname`
: IP address or hostname of the PostgreSQL database server.
  * Type: string
  * Importance: high

`database.port`
: Integer port number of the PostgreSQL database server.
  * Type: int
  * Importance: low
  * Default: `5432`

`database.user`
: Username to use when connecting to the PostgreSQL database server.
  * Type: string
  * Importance: high

`database.password`
: Password to use when connecting to the PostgreSQL database server.
  * Type: password
  * Importance: high

`database.dbname`
: The name of the PostgreSQL database from which to stream the changes.
  * Type: string
  * Importance: high

`topic.prefix`
: Topic prefix that provides a namespace for the particular PostgreSQL database server or cluster in which Debezium is capturing changes. The prefix should be unique across all other connectors, since it is used as a topic name prefix for all Kafka topics that receive records from this connector. Only alphanumeric characters, hyphens, dots and underscores must be used in the database server logical name. Do not change the value of this property. If you change the name value, after a restart, instead of continuing to emit events to the original topics, the connector emits subsequent events to topics whose names are based on the new value.
  * Type: string
  * Default: No default

`schema.include.list`
: An optional comma-separated list of regular expressions that match schema names to be monitored. Any schema name not included in the include list is excluded from monitoring. By default, all non-system schemas are monitored. May not be used with `schema.exclude.list`.
  * Type: list of strings
  * Importance: low

`schema.exclude.list`
: An optional comma-separated list of regular expressions that match schema names to be excluded from monitoring. Any schema name not included in the exclude list is monitored, with the exception of system schemas. May not be used with `schema.include.list`.
  * Type: list of strings
  * Importance: low
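As an illustration of how the include and exclude list properties in this section compose with the connection properties, the following hedged sketch narrows the hypothetical connector from the earlier example to two schemas and a subset of tables by updating its configuration through the Connect REST API (`PUT /connectors/<name>/config` accepts the flat configuration without the `name`/`config` wrapper). All names and regular expressions are placeholders:

```bash
# Illustrative only: restrict capture to the public and inventory schemas,
# and within those, to public.customers plus every inventory.* table.
curl -s -X PUT -H 'Content-Type: application/json' \
  http://localhost:8083/connectors/pg-inventory-connector/config -d '{
  "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
  "tasks.max": "1",
  "plugin.name": "pgoutput",
  "database.hostname": "localhost",
  "database.port": "5432",
  "database.user": "postgres",
  "database.password": "postgres-pw",
  "database.dbname": "inventory",
  "topic.prefix": "dbserver1",
  "schema.include.list": "public,inventory",
  "table.include.list": "public\\.customers,inventory\\..*"
}' | jq
```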
`table.include.list`
: An optional comma-separated list of regular expressions that match fully-qualified table identifiers for tables to be monitored. Any table not included in the include list is excluded from monitoring. Each identifier is in the form `schemaName.tableName`. By default, the connector monitors every non-system table in each monitored schema. May not be used with `table.exclude.list`.
  * Type: list of strings
  * Importance: low

`table.exclude.list`
: An optional comma-separated list of regular expressions that match fully-qualified table identifiers for tables to be excluded from monitoring. Any table not included in the exclude list is monitored. Each identifier is in the form `schemaName.tableName`. May not be used with `table.include.list`.
  * Type: list of strings
  * Importance: low

`column.include.list`
: An optional, comma-separated list of regular expressions that match the fully-qualified names of columns that should be included in change event record values. Fully-qualified names for columns are of the form `schemaName.tableName.columnName`. Do not also set the `column.exclude.list` property.
  * Type: list of strings
  * Importance: low

`column.exclude.list`
: An optional comma-separated list of regular expressions that match the fully-qualified names of columns that should be excluded from change event message values. Fully-qualified names for columns are of the form `schemaName.tableName.columnName`.
  * Type: list of strings
  * Importance: low

`skip.messages.without.change`
: Specifies whether to skip publishing messages when there is no change in included columns. This essentially filters out messages when there is no change in the columns included as per the `column.include.list` or `column.exclude.list` properties.
  * Type: boolean
  * Default: false

`time.precision.mode`
: Time, date, and timestamp values can be represented with different kinds of precision, including:
  - `adaptive`: Captures the time and timestamp values exactly as they are in the database. `adaptive` uses either millisecond, microsecond, or nanosecond precision values based on the database column type.
  - `adaptive_time_microseconds`: Captures the date, datetime, and timestamp values exactly as they are in the database, using either millisecond, microsecond, or nanosecond precision values based on the database column type, with the exception of `TIME` type fields, which are always captured as microseconds.
  - `connect`: Always represents time and timestamp values using Kafka Connect's built-in representations for Time, Date, and Timestamp. `connect` uses millisecond precision regardless of database column precision.

  For more details, see [temporal values](https://debezium.io/docs/connectors/postgresql/#temporal-values).
  * Type: string
  * Importance: high
  * Default: `adaptive`

`decimal.handling.mode`
: Specifies how the connector should handle values for `DECIMAL` and `NUMERIC` columns:
  - `precise`: Represents values precisely using `java.math.BigDecimal`, which are represented in change events in binary form.
  - `double`: Represents them using double values. `double` may result in a loss of precision but is easier to use.
  - `string`: Encodes values as formatted strings. The `string` option is easy to consume, but semantic information about the real type is lost.

  See [Decimal Values](https://debezium.io/docs/connectors/postgresql/#decimal-values).
  * Type: string
  * Importance: high
  * Default: `precise`

`hstore.handling.mode`
: Specifies how the connector should handle values for hstore columns. `map` represents values using `MAP`. `json` represents them using JSON strings. The JSON option encodes values as formatted strings, such as `key`: `val`.
For more details, see [HStore Values](https://debezium.io/docs/connectors/postgresql/#hstore-values). * Type: list of strings * Importance: low * Default: `map` `interval.handling.mode` : Specifies how the connector should handle values for interval columns. * Type: string * Default: `numeric` * Valid values: [`numeric` or `string`] `database.sslmode` : Sets whether or not to use an encrypted connection to the PostgreSQL server. The option of `disable` uses an unencrypted connection. `require` uses a secure (encrypted) connection and fails if one cannot be established. `verify-ca` is similar to `require`, but additionally verify the server TLS certificate against the configured Certificate Authority (CA) certificates. Fails if no valid matching CA certificates are found. `verify-full` is similar to `verify-ca` but additionally verify that the server certificate matches the host to which the connection is attempted. For more information, see the [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/libpq-connect.html). * Type: string * Importance: low * Default: `disable` `database.sslcert` : The path to the file containing the SSL certificate of the client. See the [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/libpq-connect.html) for more information. * Type: string * Importance: high `database.sslkey` : The path to the file containing the SSL private key of the client. See the [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/libpq-connect.html) for more information. * Type: string `database.sslpassword` : The password to access the client private key from the file specified by `database.sslkey`. See the [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/libpq-connect.html) for more information. * Type: string * Importance: low `database.sslrootcert` : The path to the file containing the root certificate(s) against which the server is validated. See the [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/libpq-connect.html) for more information. * Type: string * Importance: low `database.tcpKeepAlive` : Enable TCP keep-alive probe to verify that database connection is still alive. Enabled by default. See the [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/libpq-connect.html) for more information. * Type: string * Importance: low * Default: `true` `tombstones.on.delete` : Controls whether a tombstone event should be generated after a delete event. When `true` the delete operations are represented by a delete event and a subsequent tombstone event. When `false` only a delete event is sent. Emitting a tombstone event (the default behavior) allows Kafka to completely delete all events pertaining to the given key once the source record got deleted. * Type: string * Importance: high * Default: `true` `column.truncate.to.length.chars` : An optional, comma-separated list of regular expressions that match the fully-qualified names of character-based columns. Fully-qualified names for columns are of the form schemaName.tableName.columnName. In change event records, values in these columns are truncated if they are longer than the number of characters specified by length in the property name. You can specify multiple properties with different lengths in a single configuration. 
Length must be a positive integer, for example, `column.truncate.to.20.chars`.
  * Type: list of strings
  * Default: No default

`column.mask.with.length.chars`
: An optional, comma-separated list of regular expressions that match the fully-qualified names of character-based columns. Fully-qualified names for columns are of the form `schemaName.tableName.columnName`. In change event values, the values in the specified table columns are replaced with `length` number of asterisk (`*`) characters. You can specify multiple properties with different lengths in a single configuration. Length must be a positive integer or zero. When you specify zero, the connector replaces a value with an empty string.
  * Type: list of strings
  * Default: No default

`column.mask.hash.hashAlgorithm.with.salt.salt`; `column.mask.hash.v2.hashAlgorithm.with.salt.salt`
: An optional, comma-separated list of regular expressions that match the fully-qualified names of character-based columns. Fully-qualified names for columns are of the form `schemaName.tableName.columnName`. In the resulting change event record, the values for the specified columns are replaced with pseudonyms, which consist of the hashed value that results from applying the specified `hashAlgorithm` and `salt`. Based on the hash function that is used, referential integrity is maintained, while column values are replaced with pseudonyms. Supported hash functions are described in the [MessageDigest](https://docs.oracle.com/javase/7/docs/technotes/guides/security/StandardNames.html#MessageDigest) documentation.
  * Type: list of strings
  * Default: No default

`column.propagate.source.type`
: An optional, comma-separated list of regular expressions that match the database-specific data type name for some columns. Fully-qualified data type names are of the form `databaseName.tableName.typeName`, or `databaseName.schemaName.tableName.typeName`. For these data types, the connector adds parameters to the corresponding field schemas in emitted change records. The added parameters specify the original type and length of the column: `__debezium.source.column.type`, `__debezium.source.column.length`, and `__debezium.source.column.scale`.
  * Type: list of strings
  * Default: No default

`datatype.propagate.source.type`
: An optional, comma-separated list of regular expressions that match the database-specific data type name for some columns. Fully-qualified data type names are of the form `databaseName.tableName.typeName`, or `databaseName.schemaName.tableName.typeName`. For more details, see the [Debezium documentation](https://debezium.io/documentation/reference/1.3/connectors/postgresql.html#postgresql-property-datatype-propagate-source-type).
  * Type: list of strings
  * Default: No default

`message.key.columns`
: A list of expressions that specify the columns that the connector uses to form custom message keys for change event records that it publishes to the Kafka topics for specified tables. By default, Debezium uses the primary key column of a table as the message key for records that it emits. In place of the default, or to specify a key for tables that lack a primary key, you can configure custom message keys based on one or more columns. To establish a custom message key for a table, list the table, followed by the columns to use as the message key. Each list entry takes the following format:

```text
<fullyQualifiedTableName>:<keyColumn>,<keyColumn>
```

To base a table key on multiple column names, insert commas between the column names. Each fully-qualified table name is a regular expression in the following format:

```text
<schemaName>.<tableName>
```
The property can include entries for multiple tables. Use a semicolon to separate table entries in the list. The following example sets the message key for the tables `inventory.customers` and `purchase.orders`:

```text
inventory.customers:pk1,pk2;(.*).purchaseorders:pk3,pk4
```

For the table `inventory.customers`, the columns `pk1` and `pk2` are specified as the message key. For the `purchaseorders` tables in any schema, the columns `pk3` and `pk4` serve as the message key. There is no limit to the number of columns that you use to create custom message keys. However, it's best to use the minimum number that are required to specify a unique key.
  * Type: list
  * Default: empty string

`publication.autocreate.mode`
: Applies only when streaming changes by using the [pgoutput plug-in](https://www.postgresql.org/docs/current/sql-createpublication.html). The setting determines how creation of a publication should work.
  * Default: `all_tables`
  * Valid values: [`all_tables`, `disabled`, `filtered`]

`replica.identity.autoset.values`
: A comma-separated list of regular expressions that match fully-qualified tables and the replica identity value to be used in a table. This property determines the value for replica identity at the table level and will overwrite the existing value in the database. For more details about this property, see the [Debezium documentation](https://debezium.io/documentation/reference/2.4/connectors/postgresql.html#postgresql-replica-autoset-type).
  * Type: list of strings
  * Default: empty string

`binary.handling.mode`
: Specifies how binary (bytea) columns should be represented in change events.
  * Type: bytes or string
  * Importance: low
  * Valid values: [`bytes`, `base64`, `hex`]

`schema.name.adjustment.mode`
: Specifies how schema names should be adjusted for compatibility with the message converter used by the connector. Possible settings are: `avro`, which replaces the characters that cannot be used in the Avro type name with an underscore, and `none`, which does not apply any adjustment.
  * Type: string
  * Default: `avro`

`field.name.adjustment.mode`
: Specifies how field names should be adjusted for compatibility with the message converter used by the connector. The following are possible settings:
  - `avro`: Replaces the characters that cannot be used in the Avro type name with an underscore.
  - `none`: Does not apply any adjustment.
  - `avro_unicode`: Replaces the underscore or characters that cannot be used in the Avro type name with corresponding unicode like `_uxxxx`. Note that `_` is an escape sequence like backslash in Java.

  For more details, see [Avro naming](https://debezium.io/documentation/reference/2.2/configuration/avro.html#avro-naming).
  * Type: string
  * Default: `none`

`money.fraction.digits`
: Specifies how many decimal digits should be used when converting the Postgres money type to `java.math.BigDecimal`, which represents the values in change events. Applicable only when `decimal.handling.mode` is set to `precise`.
  * Type: int
  * Default: 2

`message.prefix.include.list`
: An optional, comma-separated list of regular expressions that match the names of the logical decoding message prefixes that you want to capture. Any logical decoding message with a prefix not included in `message.prefix.include.list` is excluded. Do not also set the `message.prefix.exclude.list` parameter when setting this property. For information about the structure of message events and about their ordering semantics, see message events.
* Type: list of strings * Default: By default, all logical decoding messages are captured. `message.prefix.exclude.list` : An optional, comma-separated list of regular expressions that match names of logical decoding message prefixes for which you do not to capture. Any logical decoding message with a prefix that is not included in `message.prefix.exclude.list` is included. Do not also set the `message.prefix.include.list` parameter when setting this property. To exclude all logical decoding messages pass `.*` into this config. * Type: list of strings * Default: No default ### Install the Connector If you want to use Docker images for setting up Kafka, ZooKeeper and Connect, refer to the [Debezium tutorial](https://github.com/debezium/debezium-examples/tree/master/tutorial#using-sql-server/). For the following tutorial, it is required to have a local setup of the Confluent Platform. Note that as of Confluent Platform 7.5, ZooKeeper is deprecated for new deployments. Confluent recommends KRaft mode for new deployments. Navigate to your Confluent Platform installation directory and run the following command to install the connector: ```bash confluent connect plugin install debezium/debezium-connector-sqlserver:latest ``` Adding a new connector plugin requires restarting Connect. Use the Confluent CLI to restart Connect. ```bash confluent local services connect stop && confluent local services connect start Using CONFLUENT_CURRENT: /Users/username/Sandbox/confluent-snapshots/var/confluent.NuZHxXfq Starting Zookeeper Zookeeper is [UP] Starting Kafka Kafka is [UP] Starting Schema Registry Schema Registry is [UP] Starting Kafka REST Kafka REST is [UP] Starting Connect Connect is [UP] ``` Check if the SQL Server plugin has been installed correctly and picked up by the plugin loader. ```bash curl -sS localhost:8083/connector-plugins | jq '.[].class' | grep SqlServer "io.debezium.connector.sqlserver.SqlServerConnector" ``` #### NOTE Default connector properties are already set for this quick start. To view the connector properties, refer to `etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties`. 1. List the available predefined connectors using the following command: ```bash confluent local list ``` Example output: ```bash Bundled Predefined Connectors (edit configuration under etc/): elasticsearch-sink file-source file-sink jdbc-source jdbc-sink hdfs-sink s3-sink ``` 2. Load the `elasticsearch-sink` connector: ```bash confluent local load elasticsearch-sink ``` Example output: ```bash { "name": "elasticsearch-sink", "config": { "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector", "tasks.max": "1", "topics": "test-elasticsearch-sink", "key.ignore": "true", "connection.url": "http://localhost:9200", "type.name": "kafka-connect", "name": "elasticsearch-sink" }, "tasks": [], "type": null } ``` 3. 
After the connector finishes ingesting data to Elasticsearch, enter the following command to check that data is available in Elasticsearch: ```bash curl -XGET 'http://localhost:9200/test-elasticsearch-sink/_search?pretty' ``` Example output: ```bash { "took" : 39, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 3, "max_score" : 1.0, "hits" : [ { "_index" : "test-elasticsearch-sink", "_type" : "kafka-connect", "_id" : "test-elasticsearch-sink+0+0", "_score" : 1.0, "_source" : { "f1" : "value1" } }, { "_index" : "test-elasticsearch-sink", "_type" : "kafka-connect", "_id" : "test-elasticsearch-sink+0+2", "_score" : 1.0, "_source" : { "f1" : "value3" } }, { "_index" : "test-elasticsearch-sink", "_type" : "kafka-connect", "_id" : "test-elasticsearch-sink+0+1", "_score" : 1.0, "_source" : { "f1" : "value2" } } ] } } ``` ## Property-based example Create a configuration file `firebase-sink.properties` with the following content. This file should be placed inside the Confluent Platform installation directory. This configuration is used typically along with [standalone workers](/platform/current/connect/concepts.html#standalone-workers). ```text name=FirebaseSinkConnector topics=artists,songs connector.class=io.confluent.connect.firebase.FirebaseSinkConnector tasks.max=1 gcp.firebase.credentials.path=file-path gcp.firebase.database.reference=database-url insert.mode=set/update/push key.converter=io.confluent.connect.avro.AvroConverter key.converter.schema.registry.url=http://localhost:8081 value.converter=io.confluent.connect.avro.AvroConverter value.converter.schema.registry.url":"http://localhost:8081 confluent.topic.bootstrap.servers=localhost:9092 confluent.topic.replication.factor=1 confluent.license= ``` Run the connector with this configuration. ```bash confluent local load FirebaseSinkConnector --config firebase-sink.properties ``` The output should resemble: ```json { "name":"FirebaseSinkConnector", "config":{ "topics":"artists,songs", "tasks.max":"1", "connector.class":"io.confluent.connect.firebase.FirebaseSinkConnector", "gcp.firebase.database.reference":"https://.firebaseio.com", "gcp.firebase.credentials.path":"file-path-to-your-gcp-service-account-json-file", "insert.mode":"update", "key.converter" : "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url":"http://localhost:8081", "value.converter" : "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url":"http://localhost:8081", "confluent.topic.bootstrap.servers":"localhost:9092", "confluent.topic.replication.factor":"1", "name":"FirebaseSinkConnector" }, "tasks":[ { "connector":"FirebaseSinkConnector", "task":0 } ], "type":"sink" } ``` Confirm that the connector is in a `RUNNING` state. ```bash confluent local status FirebaseSinkConnector ``` The output should resemble: ```bash { "name":"FirebaseSinkConnector", "connector":{ "state":"RUNNING", "worker_id":"127.0.1.1:8083" }, "tasks":[ { "id":0, "state":"RUNNING", "worker_id":"127.0.1.1:8083" } ], "type":"sink" } ``` ## Quick Start This quick start uses the Google Cloud Functions Sink connector to consume records and send them to a Google Cloud Functions function. Prerequisites : - [Confluent Platform](/platform/current/installation/index.html) - [Confluent CLI](https://docs.confluent.io/confluent-cli/current/installing.html) (requires separate installation) 1. Before starting the connector, create and deploy a basic Google Cloud Functions instance. 
- Navigate to the [Google Cloud Console](https://console.cloud.google.com). - Go to the [Cloud Functions](https://console.cloud.google.com/functions) tab. - Create a new function. - For creating an unauthenticated function select **Allow unauthenticated invocations** and go ahead. - For authenticated functions select **Require Authentication** and then click Variables, Networking and Advanced Settings to display additional settings. Click the Service account drop down and select the desired service account. - Note down the project id, the region, and the function name as they will be used later. - Further, to add an invoker account for an already deployed function, click **Add members** in the Permission tab of the functions home page. In the popup, select add member and select *Cloud Functions Invoker* Role. 2. Install the connector by running the following command from your Confluent Platform installation directory: ```bash confluent connect plugin install confluentinc/kafka-connect-gcp-functions:latest ``` 3. Start Confluent Platform. ```bash confluent local start ``` 4. Produce test data to the `functions-messages` topic in Kafka using the CLI command below. ```bash echo key1,value1 | confluent local produce functions-messages --property parse.key=true --property key.separator=, echo key2,value2 | confluent local produce functions-messages --property parse.key=true --property key.separator=, echo key3,value3 | confluent local produce functions-messages --property parse.key=true --property key.separator=, ``` 5. Create a `gcp-functions.json` file with the following contents: ```json { "name": "gcp-functions", "config": { "topics": "functions-messages", "tasks.max": "1", "connector.class": "io.confluent.connect.gcp.functions.GoogleCloudFunctionsSinkConnector", "key.converter":"org.apache.kafka.connect.storage.StringConverter", "value.converter":"org.apache.kafka.connect.storage.StringConverter", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor":1, "function.name": "", "project.id": "", "region": "", "gcf.credentials.path": "", "reporter.bootstrap.servers": "localhost:9092", "reporter.error.topic.name": "test-error", "reporter.error.topic.replication.factor": 1, "reporter.error.topic.key.format": "string", "reporter.error.topic.value.format": "string", "reporter.result.topic.name": "test-result", "reporter.result.topic.key.format": "string", "reporter.result.topic.value.format": "string", "reporter.result.topic.replication.factor": 1 } } ``` #### NOTE For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). 6. Load the Google Cloud Functions Sink connector. ```bash confluent local load gcp-functions --config gcp-functions.json ``` #### IMPORTANT Don’t use the CLI commands in production environments. 7. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status gcp-functions ``` 8. Confirm that the messages were delivered to the result topic in Kafka ```bash confluent local consume test-result --from-beginning ``` 9. Cleanup resources * Delete the connector ```bash confluent local unload gcp-functions ``` * Stop Confluent Platform ```bash confluent local stop ``` * Delete the created Google Cloud Function in the Google Cloud Platform portal. #### NOTE Before you begin: [Start](https://gemfire82.docs.pivotal.io/docs-gemfire/getting_started/15_minute_quickstart_gfsh.html) the VMware Tanzu GemFire locator and server. 
Create a cache region to store the data. Start the services using the Confluent CLI. ```bash confluent local start ``` Every service starts in order, printing a message with its status. ```bash Starting Zookeeper Zookeeper is [UP] Starting Kafka Kafka is [UP] Starting Schema Registry Schema Registry is [UP] Starting Kafka REST Kafka REST is [UP] Starting Connect Connect is [UP] Starting KSQL Server KSQL Server is [UP] Starting Control Center Control Center is [UP] ``` To import a few records with a simple schema in Kafka, start the Avro console producer as follows: ```bash ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic input_topic \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}' ``` Then, in the console producer, enter the following: ```bash {"f1": "value1"} {"f1": "value2"} {"f1": "value3"} ``` The three records entered are published to the Kafka topic `input_topic` in Avro format. # HDFS 3 Sink Connector for Confluent Platform The Kafka Connect HDFS 3 Sink connector allows you to export data from Kafka topics to HDFS 3.x files in a variety of formats and integrates with Hive to make data immediately available for querying with HiveQL. Note the following: - This connector is released separately from the HDFS 2.x connector. If you are targeting an HDFS 2.x distribution, see the [HDFS 2 Sink connector for Confluent Platform](https://docs.confluent.io/kafka-connect-hdfs/current/index.html) documentation for more details. If you are upgrading from the HDFS 2 Sink connector for Confluent Platform, update `connector.class` to `io.confluent.connect.hdfs3.Hdfs3SinkConnector` and `partitioner.class` to `io.confluent.connect.storage.partitioner.*` All HDFS 2.x configurations are applicable in this connector. - The HDFS 3 Sink connector in your Docker image can only run on a Connect pod where the template includes the `runAsUser` property as shown in the following example: ```text podTemplate: podSecurityContext: fsGroup: 1000 runAsUser: 1000 runAsNonRoot: true ``` The connector periodically polls data from Apache Kafka® and writes them to HDFS. The data from each Kafka topic is partitioned by the provided partitioner and divided into chunks. Each chunk of data is represented as an HDFS file with topic, Kafka partition, start and end offsets of this data chunk in the file name. If a partitioner is not specified in the configuration, the default partitioner which preserves the Kafka partitioning is used. The size of each data chunk is determined by the number of records written to HDFS, the time written to HDFS, and schema compatibility. The HDFS 3 Sink connector integrates with Hive and when it is enabled, the connector automatically creates an external Hive partitioned table for each Kafka topic and updates the table according to the available data in HDFS. ### Extensible data formats Out of the box, the connector supports writing data to HDFS in Avro and Parquet format. However, you can write other formats to HDFS by extending the `Format` class. You must configure the `format.class` and `partitioner.class` if you want to write other formats to HDFS or use other partitioners. 
The following example configurations show how to write Parquet format and use the field partitioner: ```properties format.class=io.confluent.connect.hdfs3.parquet.ParquetFormat partitioner.class=io.confluent.connect.storage.partitioner.FieldPartitioner ``` You must use the [AvroConverter](/kafka-connectors/self-managed/userguide.html#configuring-key-and-value-converters), `ProtobufConverter`, or `JsonSchemaConverter` with `ParquetFormat` for this connector. Attempting to use the `JsonConverter` (with or without schemas) results in a NullPointerException and a StackOverflowException. When using the field partitioner, you must specify the `partition.field.name` configuration to specify the field name of the record that is used for partitioning. Note that if the source Kafka topic is stored as plain JSON, you can’t use a formatter that requires a schema, you can only use the JSON formatter. The following example shows how to use Parquet format and the field partitioner. 1. [Produce](https://docs.confluent.io/confluent-cli/current/command-reference/local/services/kafka/confluent_local_services_kafka_produce.html) test Avro data to the `parquet_field_hdfs` topic in Kafka. ```bash ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic parquet_field_hdfs \ --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"name","type":"string"}, {"name":"address","type":"string"}, {"name" : "age", "type" : "int"}, {"name" : "is_customer", "type" : "boolean"}]}' # paste each of these messages {"name":"Peter", "address":"Mountain View", "age":27, "is_customer":true} {"name":"David", "address":"Mountain View", "age":37, "is_customer":false} {"name":"Kat", "address":"Palo Alto", "age":30, "is_customer":true} {"name":"David", "address":"San Francisco", "age":35, "is_customer":false} {"name":"Leslie", "address":"San Jose", "age":26, "is_customer":true} {"name":"Dani", "address":"Seatle", "age":32, "is_customer":false} {"name":"Kim", "address":"San Jose", "age":30, "is_customer":true} {"name":"Steph", "address":"Seatle", "age":31, "is_customer":false} ``` 2. Create a `hdfs3-parquet-field.json` file with the following contents: ```json { "name": "hdfs3-parquet-field", "config": { "connector.class": "io.confluent.connect.hdfs3.Hdfs3SinkConnector", "tasks.max": "1", "topics": "parquet_field_hdfs", "hdfs.url": "hdfs://localhost:9000", "flush.size": "3", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url":"http://localhost:8081", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "format.class":"io.confluent.connect.hdfs3.parquet.ParquetFormat", "partitioner.class":"io.confluent.connect.storage.partitioner.FieldPartitioner", "partition.field.name":"is_customer" } } ``` 3. Load the HDFS3 Sink connector. ```bash confluent local load hdfs3-parquet-field --config hdfs3-parquet-field.json ``` 4. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status hdfs3-parquet-field ``` 5. Validate that the Parquet data is in HDFS. ```bash # list files in partition called is_customer=true hadoop fs -ls /topics/parquet_field_hdfs/is_customer=true # the following should appear in the list # /topics/parquet_field_hdfs/is_customer=true/parquet_field_hdfs+0+0000000000+0000000002.parquet # /topics/parquet_field_hdfs/is_customer=true/parquet_field_hdfs+0+0000000004+0000000004.parquet ``` 6. 
Extract the contents of the file using the [parquet-tools-1.9.0.jar](https://repo1.maven.org/maven2/org/apache/parquet/parquet-tools/1.9.0/parquet-tools-1.9.0.jar). ```bash # substitute "" for the HDFS name node hostname hadoop jar parquet-tools-1.9.0.jar cat --json / hdfs:///topics/parquet_field_hdfs/is_customer=true/parquet_field_hdfs+0+0000000000+0000000002.parquet ``` 7. If you experience issues with the previous step, first copy the Parquet file from HDFS to the local filesystem and try again with java. ```bash hadoop fs -copyToLocal /topics/parquet_field_hdfs/is_customer=true/parquet_field_hdfs+0+0000000000+0000000002.parquet / /tmp/parquet_field_hdfs+0+0000000000+0000000002.parquet java -jar parquet-tools-1.9.0.jar cat --json /tmp/parquet_field_hdfs+0+0000000000+0000000002.parquet # expected output {"name":"Peter","address":"Mountain View","age":27,"is_customer":true} {"name":"Kat","address":"Palo Alto","age":30,"is_customer":true} ``` ## Quick Start In this quickstart, you copy Avro data from a single topic to a local HEAVY-AI database running on Docker. This example assumes you are running Kafka and Schema Registry locally on the default ports. It also assumes your have Docker installed and running. First, bring up HEAVY-AI database by running the following Docker command: ```bash docker run -d -p 6274:6274 omnisci/core-os-cpu:v4.7.0 ``` This starts the CPU-based community version of HEAVY-AI, and maps it to port 6274 on localhost. By default, the user name is `admin` and the password is `HyperInteractive`. The default database is `omnisci`. Start the Confluent Platform using the Confluent CLI command below. ```bash confluent local start ``` ## Quick start In this quick start, you copy data from a single Kafka topic to a measurement on a local Influx database running on Docker. This example assumes you are running Kafka and Schema Registry locally on the default ports. It also assumes you have Docker installed and running. Note that InfluxDB Docker can be replaced with any installed InfluxDB server. To get started, complete the following steps: 1. Start the Influx database by running the following Docker command: ```bash docker run -d -p 8086:8086 --name influxdb-local influxdb:1.7.7 ``` This starts the Influx database and maps it to port 8086 on `localhost`. By default, the username and password are blank. The database connection URL is `http://localhost:8086`. 2. Start the Confluent Platform using the following Confluent CLI command: ```bash confluent local start ``` ### Property-based example In this section, you will complete the steps in a property-based example. 1. Create a configuration file for the connector. This configuration is typically used with [standalone workers](/platform/current/connect/concepts.html#standalone-workers). Note that this file is included with the connector in `./etc/kafka-connect-influxdb/influxdb-sink-connector.properties` and contains the following settings: ```bash name=InfluxDBSinkConnector connector.class=io.confluent.influxdb.InfluxDBSinkConnector tasks.max=1 topics=orders influxdb.url=http://localhost:8086 influxdb.db=influxTestDB measurement.name.format=${topic} value.converter=io.confluent.connect.avro.AvroConverter value.converter.schema.registry.url=http://localhost:8081 The first few settings are common settings you specify for all connectors, except for topics which are specific to sink connectors like this one. The ``influxdb.url`` specifies the connection URL of the influxDB server. ``influxdb.db`` specifies the database bame. 
``influxdb.username`` specifies the username, and ``influxdb.password`` specifies the password of the InfluxDB server, respectively. By default the username and password are blank for the previous InfluxDB server above, so it is not added in the configuration. ``` 2. Run the connector with the following configuration: ```bash confluent local load InfluxDBSinkConnector --config etc/kafka-connect-influxdb/influxdb-sink-connector.properties ``` ### Avro tags example In this section, you will complete the steps in an Avro tags example. 1. Configure your connector configuration with the values shown in the following example: ```text name=InfluxDBSinkConnector connector.class=io.confluent.influxdb.InfluxDBSinkConnector tasks.max=1 topics=products influxdb.url=http://localhost:8086 influxdb.db=influxTestDB measurement.name.format=${topic} value.converter=io.confluent.connect.avro.AvroConverter value.converter.schema.registry.url=http://localhost:8081 ``` 2. Create Avro tags for a topic named `products` using the following producer command: ```text kafka-avro-console-producer \ --broker-list localhost:9092 \ --topic products \ --property value.schema='{"name": "myrecord","type": "record","fields": [{"name":"id","type":"int"}, {"name": "product","type": "string"}, {"name": "quantity","type": "int"},{"name": "price","type": "float"}, {"name": "tags","type": {"name": "tags","type": "record","fields": [{"name": "DEVICE","type": "string"},{"name": "location","type": "string"}]}}]}' ``` The console producer waits for input. 3. Copy and paste the following records into the terminal: ```text {"id": 1, "product": "pencil", "quantity": 100, "price": 50, "tags" : {"DEVICE": "living", "location": "home"}} {"id": 2, "product": "pen", "quantity": 200, "price": 60, "tags" : {"DEVICE": "living", "location": "home"}} ``` 4. Verify the data is in InfluxDB. ### Topic to database example If `measurement.name.format` is not present in the configuration, the connector uses the Kafka topic name as the database name and takes the measurement name from a field in the message. 1. Configure your connector configuration with the values shown in the following example: ```text name=InfluxDBSinkConnector connector.class=io.confluent.influxdb.InfluxDBSinkConnector tasks.max=1 topics=products influxdb.url=http://localhost:8086 value.converter=io.confluent.connect.avro.AvroConverter value.converter.schema.registry.url=http://localhost:8081 ``` 2. Create an Avro record for a topic named `products` using the following producer command: ```text kafka-avro-console-producer \ --broker-list localhost:9092 \ --topic products \ --property value.schema='{"name": "myrecord","type": "record","fields": [{"name":"id","type":"int"}, {"name": "measurement","type":"string"}]}' ``` The console producer waits for input. 3. Copy and paste the following records into the terminal: ```text {"id": 1, "measurement": "test"} {"id": 2, "measurement": "test2"} ``` The following query shows the measurements and points written to InfluxDB. ```text > use products; > show measurements; name: measurements name test test2 > select * from test; name: test time id ---- -- 1601464614638 1 ``` ### Custom timestamp example In this section, you will complete the steps in a custom timestamp example. 1. 
Configure your connector configuration with the values shown in the following example: ```text name=InfluxDBSinkConnector connector.class=io.confluent.influxdb.InfluxDBSinkConnector tasks.max=1 topics=products influxdb.url=http://localhost:8086 influxdb.db=influxTestDB measurement.name.format=${topic} event.time.fieldname=time value.converter=io.confluent.connect.avro.AvroConverter value.converter.schema.registry.url=http://localhost:8081 ``` 2. Create an Avro record for a topic named `products` using the following producer command: ```text kafka-avro-console-producer \ --broker-list localhost:9092 \ --topic products \ --property value.schema='{"name": "myrecord","type": "record","fields": [{"name":"id","type":"int"}, {"name": "time","type":"long"}]}' ``` The console producer waits for input. Note that the timestamp needs to be in milliseconds since the Unix Epoch (Unix time). 3. Copy and paste the following record into the terminal: ```text {"id": 1, "time": 123412341234} ``` The following shows the custom timestamp written to InfluxDB. ```text > precision ms > select * from products; name: products time id ---- -- 123412341234 1 ``` ## Quick Start In this quick start, you copy data from a single measurement from a local Influx database running on Docker into a Kafka topic. This example assumes you are running Kafka and Schema Registry locally on the default ports. It also assumes you have Docker installed and running. First, bring up the Influx database by running the following Docker command: ```bash docker run -d -p 8086:8086 --name influxdb-local influxdb:1.7.7 ``` This starts the Influx database, and maps it to port 8086 on `localhost`. By default, the user name and password are blank. The database connection URL is `http://localhost:8086`. To create sample data in the Influx database, log in to the Docker container using the following command: ```bash docker exec -it bash ``` Once you are in the Docker container, log in to InfluxDB shell: ```bash influx ``` Your output should resemble: ```bash Connected to http://localhost:8086 version 1.7.7 InfluxDB shell version: 1.7.7 ``` ### Source connector configuration 1. Start the services using the Confluent CLI: ```bash confluent local start ``` 2. Create a configuration file named `kinesis-source-config.json` with the following contents. ```text { "name": "kinesis-source", "config": { "connector.class": "io.confluent.connect.kinesis.KinesisSourceConnector", "tasks.max": "1", "kafka.topic": "kinesis_topic", "kinesis.region": "US_WEST_1", "kinesis.stream": "my_kinesis_stream", "confluent.license": "", "name": "kinesis-source", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` The important configuration parameters used here are: - **kinesis.stream.name**: The Kinesis Stream to subscribe to. - **kafka.topic**: The Kafka topic in which the messages received from Kinesis are produced. - **tasks.max**: The maximum number of tasks that should be created for this connector. Each Kinesis [shard](https://docs.aws.amazon.com/streams/latest/dev/key-concepts.html#shard) is allocated to a single task. If the number of shards specified exceeds the number of tasks, the connector throws an exception and fails. - **kinesis.region**: The region where the stream exists. Defaults to `US_EAST_1` if not specified. - You may pass your AWS credentials to the Kinesis connector through your source connector configuration. 
To pass AWS credentials in the source configuration set the **aws.access.key.id** and the **aws.secret.key.id** parameters. ```text "aws.acess.key.id": "aws.secret.key.id": ``` 3. Run the following command to start the Kinesis Source connector. ```bash confluent local load source-kinesis --config source-kinesis-config.json ``` 4. Run the following command to check that the connector started successfully by viewing the Connect worker’s log: ```bash confluent local services connect log ``` 5. Start a Kafka Consumer in a separate terminal session to view the data exported by the connector into the Kafka topic ```text bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kinesis_topic --from-beginning ``` 6. Stop the Confluent services using the following command: ```bash confluent local stop ``` ### Load the Kudu Source Connector Load the predefined Kudu Source connector. 1. Optional: View the available predefined connectors with this command: ```bash confluent local list ``` Your output should resemble: ```bash Bundled Predefined Connectors (edit configuration under etc/): elasticsearch-sink file-source file-sink jdbc-source jdbc-sink kudu-source kudu-sink hdfs-sink s3-sink ``` 2. Create a `kudu-source.json` file for your Kudu Source connector. ```text { "name": "kudu-source", "config": { "connector.class": "io.confluent.connect.kudu.KuduSourceConnector", "tasks.max": "1", "impala.server": "127.0.0.1", "impala.port": "21050", "kudu.database": "test", "mode": "incrementing", "incrementing.column.name": "id", "topic.prefix": "test-kudu-", "table.whitelist": "accounts", "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "impala.ldap.password": "secret", "impala.ldap.user": "kudu", "name": "kudu-source" } } ``` 3. Load the `kudu-source` connector. The `test` file must be in the same directory where Connect is started. ```bash confluent local load kudu-source --config kudu-source.json ``` Your output should resemble: ```bash { "name": "kudu-source", "config": { "connector.class": "io.confluent.connect.kudu.KuduSourceConnector", "tasks.max": "1", "impala.server": "127.0.0.1", "impala.port": "21050", "kudu.database": "test", "mode": "incrementing", "incrementing.column.name": "id", "topic.prefix": "test-kudu-", "table.whitelist": "accounts", "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "impala.ldap.password": "", "impala.ldap.user": "", "name": "kudu-source" }, "tasks": [], "type": "source" } ``` To check that it has copied the data that was present when you started Kafka Connect, start a console consumer, reading from the beginning of the topic: ```bash ./bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic test-kudu-accounts --from-beginning {"id":1,"name":{"string":"alice"}} {"id":2,"name":{"string":"bob"}} ``` The output shows the two records as expected, one per line, in the JSON encoding of the Avro records. 
Each row is represented as an Avro record and each column is a field in the record. You can see both columns in the table, `id` and `name`. The IDs were auto-generated and the column is of type `INTEGER NOT NULL`, which can be encoded directly as an integer. The `name` column has type `STRING` and can be `NULL`. The JSON encoding of Avro encodes the strings in the format `{"type": value}`, so you can see that both rows have `string` values with the names specified when you inserted the data.

#### Procedure

1. In your Connect worker, run the following command:

```text
kinit -kt /path/to/the/keytab --renewable -f
```

The flags `--renewable` and `-f` are required when using kinit, since a long-running connector has to renew the ticket-granting ticket (TGT) and the tickets must be forwardable (`-f`).

2. If not running, start Confluent Platform.

```text
confluent local start
```

3. Create the following connector configuration JSON file and save it as `config5.json`.

#### NOTE

The Oracle CDC Source connector uses the `oracle.kerberos.cache.file` configuration property to specify the path to the cache file generated previously.

```json
{
  "name": "SimpleOracleCDC_5",
  "config":{
    "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector",
    "name": "SimpleOracleCDC_5",
    "tasks.max": 1,
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://localhost:8081",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "oracle.server": "",
    "oracle.port": 1521,
    "oracle.sid": "",
    "oracle.pdb.name": "",
    "oracle.username": "",
    "oracle.password": "",
    "start.from": "snapshot",
    "redo.log.topic.name": "redo-log-topic-5",
    "table.inclusion.regex": "",
    "table.topic.name.template": "",
    "connection.pool.max.size": 20,
    "confluent.topic.replication.factor": 1,
    "topic.creation.groups": "redo",
    "topic.creation.redo.include": "redo-log-topic-5",
    "topic.creation.redo.replication.factor": 3,
    "topic.creation.redo.partitions": 1,
    "topic.creation.redo.cleanup.policy": "delete",
    "topic.creation.redo.retention.ms": 1209600000,
    "topic.creation.default.replication.factor": 3,
    "topic.creation.default.partitions": 5,
    "topic.creation.default.cleanup.policy": "compact",
    "oracle.kerberos.cache.file": ""
  }
}
```

4. Create `redo-log-topic-5`. Make sure the topic name matches the value you put for `"redo.log.topic.name"`.

```text
bin/kafka-topics --create --topic redo-log-topic-5 \
--bootstrap-server broker:9092 --replication-factor 1 \
--partitions 1 --config cleanup.policy=delete \
--config retention.ms=120960000
```

5. Enter the following command to start the connector:

```text
curl -s -X POST -H 'Content-Type: application/json' --data @config5.json http://localhost:8083/connectors | jq
```

6. Enter the following command and verify that the connector has started and its tasks are running.

```text
curl -s -X GET -H 'Content-Type: application/json' http://localhost:8083/connectors/SimpleOracleCDC_5/status | jq
```

7. Perform `INSERT`, `UPDATE`, and `DELETE` row operations on each captured table and verify the following expected results:

- The connector starts and has one running task.
- Change event topics are created for each captured table.
- The connector does not produce records for tables that were not included in the inclusion regex or were explicitly excluded using the `table.exclusion.regex` property.
- If the `redo.log.corruption.topic` property was configured, the connector sends corrupted records to the specified corruption topic.
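If authentication fails when you start the connector, a quick sanity check is to confirm that the credential cache you point `oracle.kerberos.cache.file` at actually holds a valid, forwardable, and renewable ticket. The following is a minimal sketch using the standard MIT Kerberos `klist` tool; the cache path is a placeholder:

```bash
# -f prints ticket flags; look for F (forwardable) and R (renewable)
# on the TGT obtained with the kinit command from step 1.
klist -f -c /path/to/the/krb5_ccache
```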
- If the `redo.log.corruption.topic` property was configured, the connector sends corrupted records to the specified corruption topic.

#### Procedure

1. If not running, start Confluent Platform.

```text
confluent local start
```

2. Create the following connector configuration JSON file and save it as `config6.json`.

#### NOTE

Note the following properties (shown in the example):

* The Oracle CDC Source connector uses the `oracle.ssl.truststore.file` and `oracle.ssl.truststore.password` properties to specify the location of the truststore containing the trusted server certificate and the truststore password.
* The passthrough properties `oracle.connection.javax.net.ssl.keyStore` and `oracle.connection.javax.net.ssl.keyStorePassword` are also used to supply the keystore location and password.

```json
{
  "name": "SimpleOracleCDC_6",
  "config":{
    "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector",
    "name": "SimpleOracleCDC_6",
    "tasks.max":1,
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://localhost:8081",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081",
    "confluent.topic.bootstrap.servers":"localhost:9092",
    "oracle.server": "",
    "oracle.port": 1521,
    "oracle.sid":"",
    "oracle.pdb.name":"",
    "oracle.username": "",
    "oracle.password": "",
    "start.from":"snapshot",
    "redo.log.topic.name": "redo-log-topic-6",
    "table.inclusion.regex":"",
    "table.topic.name.template": "",
    "connection.pool.max.size": 20,
    "confluent.topic.replication.factor":1,
    "topic.creation.groups": "redo",
    "topic.creation.redo.include": "redo-log-topic-6",
    "topic.creation.redo.replication.factor": 3,
    "topic.creation.redo.partitions": 1,
    "topic.creation.redo.cleanup.policy": "delete",
    "topic.creation.redo.retention.ms": 1209600000,
    "topic.creation.default.replication.factor": 3,
    "topic.creation.default.partitions": 5,
    "topic.creation.default.cleanup.policy": "compact",
    "oracle.ssl.truststore.file": "",
    "oracle.ssl.truststore.password": "",
    "oracle.connection.javax.net.ssl.keyStore": "",
    "oracle.connection.javax.net.ssl.keyStorePassword": ""
  }
}
```

3. Create `redo-log-topic-6`. Make sure the topic name matches the value you put for `"redo.log.topic.name"`.

```text
bin/kafka-topics --create --topic redo-log-topic-6 \
--bootstrap-server broker:9092 --replication-factor 1 \
--partitions 1 --config cleanup.policy=delete \
--config retention.ms=120960000
```

4. Enter the following command to start the connector:

```text
curl -s -X POST -H 'Content-Type: application/json' --data @config6.json http://localhost:8083/connectors | jq
```

5. Enter the following command and verify that the connector started with two running tasks.

```text
curl -s -X GET -H 'Content-Type: application/json' http://localhost:8083/connectors/SimpleOracleCDC_6/status | jq
```

6. Perform `INSERT`, `UPDATE`, and `DELETE` row operations for each captured table and verify the following expected results:

- The connector starts and has one running task.
- Change event topics are created for each captured table.
- The connector does not produce records for tables that were not included in the inclusion regex or were explicitly excluded using the `table.exclusion.regex` property.
- If the `redo.log.corruption.topic` property was configured, the connector sends corrupted records to the specified corruption topic.

#### Procedure

1. If not running, start Confluent Platform.

```text
confluent local start
```

2.
Create the following connector configuration JSON file and save it as `config7.json`.

#### NOTE

* The `oracle.service.name` property specifies the service name to use when connecting to RAC.
* An `oracle.sid` is still required. It can be the SID of any of the database instances.

```json
{
  "name": "SimpleOracleCDC_7",
  "config":{
    "connector.class": "io.confluent.connect.oracle.cdc.OracleCdcSourceConnector",
    "name": "SimpleOracleCDC_7",
    "tasks.max":1,
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://localhost:8081",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081",
    "confluent.topic.bootstrap.servers":"localhost:9092",
    "oracle.server": "",
    "oracle.port": 1521,
    "oracle.sid":"",
    "oracle.service.name":"",
    "oracle.pdb.name":"",
    "oracle.username": "",
    "oracle.password": "",
    "start.from":"snapshot",
    "redo.log.topic.name": "redo-log-topic-7",
    "table.inclusion.regex":"",
    "table.topic.name.template": "",
    "connection.pool.max.size": 20,
    "confluent.topic.replication.factor":1,
    "topic.creation.groups": "redo",
    "topic.creation.redo.include": "redo-log-topic-7",
    "topic.creation.redo.replication.factor": 3,
    "topic.creation.redo.partitions": 1,
    "topic.creation.redo.cleanup.policy": "delete",
    "topic.creation.redo.retention.ms": 1209600000,
    "topic.creation.default.replication.factor": 3,
    "topic.creation.default.partitions": 5,
    "topic.creation.default.cleanup.policy": "compact"
  }
}
```

3. Create `redo-log-topic-7`. Make sure the topic name matches the value you put for `"redo.log.topic.name"`.

```text
bin/kafka-topics --create --topic redo-log-topic-7 \
--bootstrap-server broker:9092 --replication-factor 1 \
--partitions 1 --config cleanup.policy=delete \
--config retention.ms=120960000
```

4. Enter the following command to start the connector:

```text
curl -s -X POST -H 'Content-Type: application/json' --data @config7.json http://localhost:8083/connectors | jq
```

5. Enter the following command and verify that the connector started with two running tasks.

```text
curl -s -X GET -H 'Content-Type: application/json' http://localhost:8083/connectors/SimpleOracleCDC_7/status | jq
```

6. Perform `INSERT`, `UPDATE`, and `DELETE` row operations for each captured table and verify the following expected results:

- The connector starts and has one running task.
- Change event topics are created for each captured table.
- The connector does not produce records for tables that were not included in the inclusion regex or were explicitly excluded using the `table.exclusion.regex` property.
- If the `redo.log.corruption.topic` property was configured, the connector sends corrupted records to the specified corruption topic.

### NUMERIC data type with no precision or scale results in unreadable output

The following Oracle database table includes `ORDER_NUMBER` and `CUSTOMER_NUMBER` NUMERIC data types without precision or scale.
```text
CREATE TABLE MARIPOSA_ORDERS (
  "ORDER_NUMBER" NUMBER PRIMARY KEY,
  "ORDER_DATE" TIMESTAMP(6) NOT NULL,
  "SHIPPED_DATE" TIMESTAMP(6) NOT NULL,
  "STATUS" VARCHAR2(50),
  "CUSTOMER_NUMBER" NUMBER)
```

The Oracle CDC Source connector generates the following schema in Schema Registry, when using the Avro converter and the connector property `"numeric.mapping": "best_fit_or_decimal"`:

```json
{
  "fields": [
    {
      "name": "ORDER_NUMBER",
      "type": {
        "connect.name": "org.apache.kafka.connect.data.Decimal",
        "connect.parameters": { "scale": "127" },
        "connect.version": 1,
        "logicalType": "decimal",
        "precision": 64,
        "scale": 127,
        "type": "bytes"
      }
    },
    {
      "name": "ORDER_DATE",
      "type": {
        "connect.name": "org.apache.kafka.connect.data.Timestamp",
        "connect.version": 1,
        "logicalType": "timestamp-millis",
        "type": "long"
      }
    },
    {
      "name": "SHIPPED_DATE",
      "type": {
        "connect.name": "org.apache.kafka.connect.data.Timestamp",
        "connect.version": 1,
        "logicalType": "timestamp-millis",
        "type": "long"
      }
    },
    {
      "default": null,
      "name": "STATUS",
      "type": [ "null", "string" ]
    },
    {
      "default": null,
      "name": "CUSTOMER_NUMBER",
      "type": [
        "null",
        {
          "connect.name": "org.apache.kafka.connect.data.Decimal",
          "connect.parameters": { "scale": "127" },
          "connect.version": 1,
          "logicalType": "decimal",
          "precision": 64,
          "scale": 127,
          "type": "bytes"
        }
      ]
    },
    ... omitted
    {
      "default": null,
      "name": "username",
      "type": [ "null", "string" ]
    }
  ],
  "name": "ConnectDefault",
  "namespace": "io.confluent.connect.avro",
  "type": "record"
}
```

In this scenario, the resulting values for `ORDER_NUMBER` or `CUSTOMER_NUMBER` are unreadable, as shown below:

```text
A\u0000\u000b\u001b8¸®æ«Îò,Rt]!\u0013_\u0018aVKæ,«1\u0010êo\u0017\u000bKðÀ\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"
```

This is because the connector attempts to preserve accuracy (not lose any decimals) when precision and scale are not provided. As a workaround, you can set `"numeric.mapping": "best_fit_or_double"` or `"numeric.mapping": "best_fit_or_string"`, or use [ksqlDB to create a new stream](https://docs.ksqldb.io/en/latest/operate-and-deploy/schema-registry-integration/#create-a-new-stream) with explicit data types (based on a Schema Registry schema). For example:

```sql
CREATE STREAM ORDERS_RAW (
  ORDER_NUMBER DECIMAL(9,2),
  ORDER_DATE TIMESTAMP,
  SHIPPED_DATE TIMESTAMP,
  STATUS VARCHAR,
  CUSTOMER_NUMBER DECIMAL(9,2)
) WITH (
  KAFKA_TOPIC='ORCL.ADMIN.MARIPOSA_ORDERS',
  VALUE_FORMAT='AVRO'
);
```

When you check the stream using `SELECT * FROM ORDERS_RAW EMIT CHANGES`, you will see readable values for `ORDER_NUMBER` and `CUSTOMER_NUMBER`, as shown in the following example.

```json
{
  "ORDER_NUMBER": 5361,
  "ORDER_DATE": "2020-08-06T03:41:58.000",
  "SHIPPED_DATE": "2020-08-11T03:41:58.000",
  "STATUS": "Not shipped yet",
  "CUSTOMER_NUMBER": 9076
}
```

### Property-based example

Create a configuration file `salesforce-cdc-source.properties` with the following content. This file should be placed inside the Confluent Platform installation directory. This configuration is typically used with [standalone workers](/platform/current/connect/concepts.html#standalone-workers).
```none
name=SalesforceCdcSourceConnector
connector.class=io.confluent.salesforce.SalesforceCdcSourceConnector
tasks.max=1
kafka.topic=< Required Configuration >
salesforce.consumer.key=< Required Configuration >
salesforce.consumer.secret=< Required Configuration >
salesforce.username=< Required Configuration >
salesforce.password=< Required Configuration >
salesforce.password.token=< Required Configuration >
salesforce.cdc.name=< Required Configuration >
salesforce.initial.start=all
confluent.topic.bootstrap.servers=localhost:9092
confluent.topic.replication.factor=1
confluent.license=
```

### Property-based example

This configuration is typically used with [standalone workers](/platform/current/connect/concepts.html#standalone-workers). This configuration overrides the record `_EventType` to perform upsert operations using an external ID field named `CustomId__c`. The configuration ignores the field `CleanStatus` in the Kafka source record.

```text
name=SalesforceSObjectSinkConnector1
connector.class=io.confluent.salesforce.SalesforceSObjectSinkConnector
tasks.max=1
topics=LeadsTopic
salesforce.consumer.key=< Required Configuration >
salesforce.consumer.secret=< Required Configuration >
salesforce.object=< Required Configuration >
salesforce.password=< Required Configuration >
salesforce.password.token=< Required Configuration >
salesforce.push.topic.name=< Required Configuration >
salesforce.username=< Required Configuration >
salesforce.ignore.fields=CleanStatus
salesforce.ignore.reference.fields=true
salesforce.custom.id.field.name=CustomId__c
salesforce.use.custom.id.field=true
salesforce.sink.object.operation=upsert
override.event.type=true
confluent.topic.bootstrap.servers=localhost:9092
confluent.topic.replication.factor=1
confluent.license=
```

To include your broker, change the `confluent.topic.bootstrap.servers` property address(es), and for staging or production use, change the `confluent.topic.replication.factor` to 3. When working on a downloaded Confluent development cluster, or any single-broker cluster, use a `confluent.topic.replication.factor` of 1. For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter).

## Quick Start

This quick start uses the Solace Sink connector to consume records from Kafka and send them to a Solace PubSub+ Standard broker.

1. Start a Solace PubSub+ Standard docker container.

```bash
docker run -d --name "solace" --hostname "solace" \
  -p 8080:8080 -p 55555:55555 -p 5550:5550 \
  --shm-size=1000000000 \
  --tmpfs /dev/shm \
  --ulimit nofile=2448:38048 \
  -e username_admin_globalaccesslevel=admin \
  -e username_admin_password=admin \
  -e system_scaling_maxconnectioncount=100 \
  solace/solace-pubsub-standard:9.1.0.77
```

2. Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html).

```bash
# run from your CP installation directory
confluent connect plugin install confluentinc/kafka-connect-solace-sink:latest
```

3. [Install the Solace JMS Client Library](#installing-solace-client-jar).

4. Start the Confluent Platform.

```bash
confluent local start
```

5. [Produce](https://docs.confluent.io/current/cli/command-reference/confluent-produce.html) test data to the `sink-messages` topic in Kafka.

```bash
seq 10 | confluent local produce sink-messages
```

6.
Create a `solace-sink.json` file with the following contents:

```json
{
  "name": "SolaceSinkConnector",
  "config": {
    "connector.class": "io.confluent.connect.jms.SolaceSinkConnector",
    "tasks.max": "1",
    "topics": "sink-messages",
    "solace.host": "smf://localhost:55555",
    "solace.username": "admin",
    "solace.password": "admin",
    "solace.dynamic.durables": "true",
    "jms.destination.type": "queue",
    "jms.destination.name": "connector-quickstart",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.topic.replication.factor": "1"
  }
}
```

7. Load the Solace Sink connector.

```bash
confluent local load solace --config solace-sink.json
```

8. Confirm the connector is in a `RUNNING` state.

```bash
confluent local status solace
```

9. Navigate to the [Solace UI](http://localhost:8080) to confirm the messages were delivered to the `connector-quickstart` queue in the `default` Message VPN.

## Quick Start

This quick start uses the Splunk Source connector to receive application data and ingest it into Kafka.

1. Install the connector using the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html).

```text
# run from your CP installation directory
confluent connect plugin install confluentinc/kafka-connect-splunk-source:latest
```

2. Start the Confluent Platform.

```bash
confluent local start
```

3. Create a `splunk-source.properties` file with the following contents:

```text
name=splunk-source
kafka.topic=splunk-source
tasks.max=1
connector.class=io.confluent.connect.SplunkHttpSourceConnector
splunk.collector.index.default=default-index
splunk.port=8889
splunk.ssl.key.store.path=/path/to/your/keystore.jks
splunk.ssl.key.store.password=
confluent.topic.bootstrap.servers=localhost:9092
confluent.topic.replication.factor=1
```

4. Load the Splunk Source connector.

```bash
confluent local load splunk-source --config splunk-source.properties
```

#### IMPORTANT

Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) in production environments.

5. Confirm the connector is in a `RUNNING` state.

```bash
confluent local status splunk-source
```

6. Simulate an application sending data to the connector.

```bash
curl -k -X POST https://localhost:8889/services/collector/event -d '{"event":"from curl"}'
```

7. Verify the data was ingested into the Kafka topic.

```text
kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic splunk-source --from-beginning
```

8. Shut down Confluent Platform.

```bash
confluent local destroy
```

## Quick Start

This quick start uses the TIBCO Sink connector to consume records from Kafka and send them to TIBCO Enterprise Message Service™ - Community Edition.

1. Download TIBCO Enterprise Message Service™ - Community Edition ([Mac](https://www.tibco.com/resources/product-download/tibco-enterprise-message-service-community-edition-free-download-mac) or [Linux](https://www.tibco.com/resources/product-download/tibco-enterprise-message-service-community-edition-free-download-linux)) and run the appropriate installer. See the [TIBCO Enterprise Message Service™ Installation Guide](https://docs.tibco.com/pub/ems-zlinux/8.5.0/doc/pdf/TIB_ems_8.5_installation.pdf) for more details. Similar documentation is available for each version of TIBCO EMS.

2.
Install the connector through the [Confluent Hub Client](https://docs.confluent.io/current/connect/managing/confluent-hub/client.html). ```bash # run from your CP installation directory confluent connect plugin install confluentinc/kafka-connect-tibco-sink:latest ``` 3. [Install the TIBCO JMS Client Library](#installing-tibco-client-jar). 4. Start Confluent Platform. ```bash confluent local start ``` 5. [Produce](https://docs.confluent.io/current/cli/command-reference/confluent-produce.html) test data to the `sink-messages` topic in Kafka. ```bash seq 10 | confluent local produce sink-messages ``` 6. Create a `tibco-sink.json` file with the following contents: ```json { "name": "TibcoSinkConnector", "config": { "connector.class": "io.confluent.connect.jms.TibcoSinkConnector", "tasks.max": "1", "topics": "sink-messages", "tibco.url": "tcp://localhost:7222", "tibco.username": "admin", "tibco.password": "", "jms.destination.type": "queue", "jms.destination.name": "connector-quickstart", "key.converter": "org.apache.kafka.connect.storage.StringConverter", "value.converter": "org.apache.kafka.connect.storage.StringConverter", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` 7. Load the TIBCO Sink connector. ```bash confluent local load tibco --config tibco-sink.json ``` 8. Confirm that the connector is in a `RUNNING` state. ```bash confluent local status tibco ``` 9. Confirm the messages were delivered to the `connector-quickstart` queue in TIBCO. ```bash # open TIBCO admin tool (password is empty) tibco/ems/8.4/bin/tibemsadmin -server "tcp://localhost:7222" -user admin > show queue connector-quickstart ``` ### Deploy CFK with cluster object deletion protection Confluent for Kubernetes (CFK) provides validating admission webhooks for deletion events of the Confluent Platform clusters. CFK webhooks are disabled by default in this release of CFK. CFK provides the following webhooks: * **Webhook to prevent component deletion when its persistent volume (PV) reclaim policy is set to Delete** This webhook (`cfk-resources.webhooks.platform.confluent.io`) blocks deletion requests on CRs with PVs in `ReclaimPolicy: Delete`. Without this prevention, a CR deletion will result in the deletion of those PVs and data loss. This webhook only applies to the components that have persistent volumes, namely, ZooKeeper (Confluent Platform 7.9 or earlier only), Kafka, ksqlDB, and Control Center (Legacy). In addition to blocking deletion as described above, this webhook blocks updates to prevent manual modifications during ZooKeeper to KRaft migration when the CR has the `platform.confluent.io/kraft-migration-cr-lock: "true"` annotation set. This webhook does not block normal CR updates (outside of KRaft migration). * **Webhook to prevent CFK StatefulSet deletion** The proper way to delete Confluent Platform resources is to delete the component custom resource (CR) as CFK watches those deletion events and properly cleans everything up. Deletion of StatefulSets can result in unintended PV deletion and data loss. This webhook (`core-resources.webhooks.platform.confluent.io`) blocks delete requests on CFK StatefulSets. * **Webhook to prevent unsafe Kafka pod deletion** This webhook (`kafka-pods.webhooks.platform.confluent.io`) blocks Kafka pod deletion when the removal of a broker will result in fewer in-sync replicas than configured in the `min.insync.replicas` Kafka property. Dropping below that value can result in data loss. 
Pod deletion can happen during Kubernetes maintenance without warning, such as during node replacement, and this webhook is an additional safeguard for your Kafka data. Review the following when using this webhook: * This webhook is only supported on clusters with fewer than 140,000 partitions. * This webhook does not take the Kafka setting, [minimum in-sync replicas](https://docs.confluent.io/platform/current/installation/configuration/broker-configs.html#brokerconfigs_min.insync.replicas) (`min.insync.replicas`), into consideration. The minimum in-sync replicas setting on all topics is assumed to be `2` for Kafka with 3 or more replicas. Do not create topics with minimum in-sync replicas set to 1. * To avoid having an internal ksqlDB topic with min in-sync replicas set to 1, set the ksqlDB internal topic replicas setting to `3` using `configOverrides` in the ksqlDB CR: ```yaml spec: configOverrides: server: - ksql.internal.topic.replicas=3 ``` * **Webhook to prevent unsafe pod eviction** This webhook (`evictions.webhooks.platform.confluent.io`) follows the same logic as the pod deletion webhook described above and prevents the creation of the pod eviction object which results in the draining of the pod nodes. ## Create a cluster link Create a cluster link by using a new ClusterLink custom resource (CR) and apply the CR with the `kubectl apply -f ` command. To create a destination-initiated cluster link : Create a cluster link on the destination cluster and configure authentication and encryption for the source cluster. To create a source-initiated cluster link : Create two cluster links with the same cluster link names. `mirrorTopics`, `mirrorTopicOptions`, `aclFilter`, `consumerGroupFilters` and other cluster link configs must be defined in the ClusterLink CR on **the destination cluster**, for both destination-initiated and source-initiated cluster links. ```yaml kind: ClusterLink metadata: name: clusterlink --- [1] namespace: --- [2] spec: name: --- [3] sourceInitiatedLink: --- [4] linkMode: --- [5] destinationKafkaCluster: --- [6] sourceKafkaCluster: --- [7] consumerGroupFilters: --- [8] aclFilters: --- [9] configs: --- [10] mirrorTopics: --- [11] mirrorTopicOptions: --- [12] ``` * [1] Required. The name of the ClusterLink CR. * [2] Optional. The namespace of the ClusterLink CR. If omitted, the same namespace as this CR is assumed. * [3] Optional. The name of the cluster link. If not defined `metadata.name` ([1]) is used. * [4] Optional. Configure if this cluster link is a source-initiated cluster link. * [5] Required under `sourceInitiatedLink`. Specify whether this source-initiated cluster link CR is on the source cluster or on the destination cluster. Valid values are `Source` and `Destination`. * [6] Required. The information about the destination cluster. See [Configure the destination-initiated cluster link](#co-clusterlink-destination-initiated-connection) and [Configure the source-initiated cluster link on the destination cluster](#co-clusterlink-source-initiated-connection-destination-mode). * [7] Required. The information about the source cluster. See [Configure the destination-initiated cluster link](#co-clusterlink-destination-initiated-connection) and [Configure the source-initiated cluster link on the source cluster](#co-clusterlink-source-initiated-connection-source-mode). * [8] Optional. An array of consumer groups to be migrated from the source cluster to the destination cluster. See [Define consumer group filters](#co-clusterlink-consumer-group-filters). 
* [9] Optional. An array of Access Control Lists (ACLs) to be migrated from the source cluster to the destination cluster. See [Define ACL filters](#co-clusterlink-acl-filters).
* [10] Optional. A map of additional configurations for creating a cluster link. For example:

```yaml
spec:
  configs:
    connections.max.idle.ms: "620000"
    cluster.link.retry.timeout.ms: "200000"
```

This setting can be used in all types and modes of ClusterLink CRs. For the list of optional configurations, see [Cluster Linking config options](https://docs.confluent.io/platform/current/multi-dc-deployments/cluster-linking/configs.html#configuration-options).

* [11] Optional. An array of mirror topics. See [Create a mirror topic](#co-create-mirror-topic).
* [12] Optional. Configuration options for mirror topics. See [Configure mirror topic options](#co-clusterlink-mirror-topic-options).

# Manage Password Encoder Secrets for Confluent Platform Using Confluent for Kubernetes

To encrypt sensitive configuration information in Confluent for Kubernetes (CFK), such as passwords for SASL/PLAIN or TLS, you define a password encoder in your custom resource (CR). The use cases for the feature include the following:

* For destination-initiated (default) Kafka Cluster Linking, the destination Kafka cluster needs to set a password encoder secret and use it to encrypt the sensitive authentication and TLS information of the source cluster. For source-initiated Kafka Cluster Linking, the source Kafka cluster needs to set a password encoder secret and use it to encrypt the sensitive authentication and TLS information of the destination cluster.
* For Schema Linking, a password encoder secret needs to be configured in the source Schema Registry cluster.

For details about the password encoder secret, see [Kafka Broker Configuration](https://docs.confluent.io/platform/current/installation/configuration/broker-configs.html#brokerconfigs_password.encoder.secret).

To specify a password encoder secret:

1. Create the `password-encoder.txt` file with the following content:

```text
password=
oldPassword=
```

`oldPassword` is only required for password rotations.

2. Store the secret for the password encoder, using either a Kubernetes secret or the directory path in the container feature.

* To use a Kubernetes secret, create a Kubernetes secret using the file created in the previous step. The expected key (the file name) is `password-encoder.txt`. For example:

```bash
kubectl create secret generic myencodersecret \
  --from-file=password-encoder.txt=$MY_PATH/password-encoder.txt
```

* To use the directory path in the container feature, copy the `password-encoder.txt` file to the container path.

3. In the Kafka or Schema Registry CR, specify the secret created in the previous step:

```yaml
spec:
  passwordEncoder:
    secretRef:                 --- [1]
    directoryPathInContainer:  --- [2]
```

* If `spec.passwordEncoder` is defined, either [1] or [2] is required.
* [1] The secret for the password encoder.
* [2] The path in the container where the `password-encoder.txt` file exists.

See [Provide secrets for Confluent Platform component CR](co-credentials.md#co-vault-category-1) for providing the secret and required annotations when using Vault.

4. Apply the CR changes using the `kubectl apply` command. The cluster will automatically restart. The end-to-end flow is summarized in the sketch that follows.
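The following is a minimal sketch of that flow, assuming the secret name `myencodersecret`, the namespace `confluent`, and a CR file named `kafka.yaml`; all three are illustrative placeholders rather than required values.

```bash
# 1. Create the password encoder file (oldPassword is only needed for a rotation).
cat > password-encoder.txt <<'EOF'
password=<new-encoder-secret>
oldPassword=<previous-encoder-secret>
EOF

# 2. Store it as a Kubernetes secret; the key (file name) must be password-encoder.txt.
kubectl create secret generic myencodersecret \
  --from-file=password-encoder.txt=./password-encoder.txt \
  --namespace confluent

# 3. Reference the secret under spec.passwordEncoder.secretRef in the Kafka or
#    Schema Registry CR (kafka.yaml here), then apply the change.
kubectl apply -f kafka.yaml --namespace confluent
```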
### Issue: Use different authentication for internal and external listeners

When you set up SASL/PLAIN and SASL/PLAIN with LDAP listeners in the Kafka CR, you will get the following error:

```text
WARN [Producer clientId=producer-client] Connection to node -1 (/:9092) terminated during authentication. This may happen due to any of the following reasons: (1) Authentication failed due to invalid credentials with brokers older than 1.0.0, (2) Firewall blocking Kafka TLS traffic (eg it may only allow HTTPS traffic), (3) Transient network issue. (org.apache.kafka.clients.NetworkClient)
```

**Workaround:** To implement both a SASL/PLAIN listener and a SASL/PLAIN with LDAP listener in the Kafka cluster, the SASL/PLAIN listener must be configured with `authentication.jaasConfigPassThrough`. The following example configuration sets up an internal listener with SASL/PLAIN and an external listener with SASL/PLAIN with LDAP:

**Step 1:** Create a file, `creds-kafka-sasl-users.conf`, with the following content.

```bash
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="kafka" \
  password="kafka-secret" \
  user_kafka="kafka-secret";
```

**Step 2:** Create a secret credential.

```bash
kubectl create secret generic credential \
  --from-file=plain-jaas.conf=./creds-kafka-sasl-users.conf \
  --namespace confluent
```

**Step 3:** Specify the JAAS configuration passthrough in the Kafka CR and apply the change with the `kubectl apply -f` command.

```yaml
kind: Kafka
spec:
  listeners:
    internal:
      authentication:
        type: plain
        jaasConfigPassThrough:
          secretRef: credential
      tls:
        enabled: true
    external:
      authentication:
        type: ldap
      tls:
        enabled: true
```

### GCS

To enable Tiered Storage on Google Cloud Platform (GCP) with Google Cloud Storage (GCS):

1. To enable Tiered Storage, add the following properties in your `broker.properties` file:

```properties
confluent.tier.feature=true
confluent.tier.enable=true
confluent.tier.backend=GCS
confluent.tier.gcs.bucket=
confluent.tier.gcs.region=
# confluent.tier.metadata.replication.factor=1
```

Adding the above properties enables the Tiered Storage components on GCS with default values for the remaining configuration parameters.

- `confluent.tier.feature` enables Tiered Storage for a broker. Setting this to `true` allows a broker to utilize Tiered Storage.
- `confluent.tier.enable` sets the default value for created topics. Setting this to `true` causes all non-compacted topics to be tiered. When set to `true`, this causes all existing, non-compacted topics to have this configuration set to `true` as well. Only topics explicitly set to `false` do not use tiered storage. It is not required to set `confluent.tier.enable=true` to enable Tiered Storage.
- `confluent.tier.backend` refers to the cloud storage service a broker connects to. For Google Cloud Storage, set this to `GCS` as shown above.
- `BUCKET_NAME` and `REGION` are the GCS bucket name and its region, respectively. A broker interacts with this bucket for writing and reading tiered data. For example, a bucket named `tiered-storage-test-gcs` located in the `us-central1` region would have these properties:

```properties
confluent.tier.gcs.bucket=tiered-storage-test-gcs
confluent.tier.gcs.region=us-central1
```

2. Provide [GCS credentials](https://cloud.google.com/docs/authentication/getting-started) to connect to the GCS bucket. You can set these through `broker.properties` or through environment variables. Either method is sufficient.
The brokers prioritize using the credentials supplied through `broker.properties`. If the brokers do not find credentials in `broker.properties`, they use environment variables instead.

- **Broker Properties**: Add the following property to your `broker.properties` file:

```properties
confluent.tier.gcs.cred.file.path=
```

This field is hidden from the server log files.

- **Environment Variables**: Specify GCS credentials with this local environment variable:

```bash
export GOOGLE_APPLICATION_CREDENTIALS=
```

If `broker.properties` does not contain the property with the path to the credentials file, the broker will use the above environment variable to connect to the GCS bucket. See the [GCS documentation](https://cloud.google.com/docs/authentication) for more information.

3. The GCS bucket should allow the broker to perform the following actions. These operations are required by the broker to properly enable and use Tiered Storage.

```properties
storage.buckets.get
storage.objects.get
storage.objects.list
storage.objects.create
storage.objects.delete
storage.objects.update
```

Troubleshooting Certificates
: If the brokers fail to start due to Tiered Storage errors such as inability to access buckets and security certificate issues, make sure that you have the needed Google CA certificate(s). To troubleshoot:

1. Go to the [Google Trust Services repository](https://pki.goog/repository/), scroll down to the section **Download CA certificates**, and click **Expand all**.

2. Choose a certificate suitable for your cluster (for example, **GlobalSign R4**) that is currently valid (not yet expired), click the **Action** drop-down next to it, and download the Certificate (PEM) file to all the brokers in the cluster.

3. Import the certificate by running the following command:

```bash
keytool -import -trustcacerts -keystore /var/ssl/private/kafka_broker.truststore.jks -alias root -file
```

#### Create a Dead Letter Queue topic

To create a DLQ, add the following configuration properties to your sink connector configuration:

```bash
errors.tolerance = all
errors.deadletterqueue.topic.name =
```

The following example shows a GCS Sink connector configuration with DLQ enabled:

```bash
{
  "name": "gcs-sink-01",
  "config": {
    "connector.class": "io.confluent.connect.gcs.GcsSinkConnector",
    "tasks.max": "1",
    "topics": "gcs_topic",
    "gcs.bucket.name": "",
    "gcs.part.size": "5242880",
    "flush.size": "3",
    "storage.class": "io.confluent.connect.gcs.storage.GcsStorage",
    "format.class": "io.confluent.connect.gcs.format.avro.AvroFormat",
    "partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081",
    "schema.compatibility": "NONE",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.topic.replication.factor": "1",
    "errors.tolerance": "all",
    "errors.deadletterqueue.topic.name": "dlq-gcs-sink-01"
  }
}
```

Even if the DLQ topic contains the records that failed, it does not show why they failed. You can add the following configuration property to include failed record header information.

```bash
errors.deadletterqueue.context.headers.enable=true
```

Record headers are added to the DLQ when the `errors.deadletterqueue.context.headers.enable` parameter is set to `true` (the default is `false`). You can then use the [kafkacat](../tools/kafkacat-usage.md#kafkacat-usage) utility to view the record headers and determine why a record failed, as shown in the sketch that follows.
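For example, the following is a minimal sketch of inspecting those headers with kafkacat; the broker address and DLQ topic name come from the example configuration above, and the format flags assume a reasonably recent kafkacat (kcat) release.

```bash
# Read the DLQ from the beginning and print each record's headers next to its value.
kafkacat -b localhost:9092 -t dlq-gcs-sink-01 -C -o beginning \
  -f 'Headers: %h\nValue: %s\n\n'
```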
Errors are also sent to [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). To avoid conflicts with the original record header, the DLQ context header keys start with `_connect.errors`. Here is the same example configuration with headers enabled: ```bash { "name": "gcs-sink-01", "config": { "connector.class": "io.confluent.connect.gcs.GcsSinkConnector", "tasks.max": "1", "topics": "gcs_topic", "gcs.bucket.name": "", "gcs.part.size": "5242880", "flush.size": "3", "storage.class": "io.confluent.connect.gcs.storage.GcsStorage", "format.class": "io.confluent.connect.gcs.format.avro.AvroFormat", "partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner", "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081", "schema.compatibility": "NONE", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "errors.tolerance": "all", "errors.deadletterqueue.topic.name": "dlq-gcs-sink-01", "errors.deadletterqueue.context.headers.enable":true } } ``` For more information about DLQs, see [Kafka Connect Deep Dive – Error Handling and Dead Letter Queues](https://www.confluent.io/blog/kafka-connect-deep-dive-error-handling-dead-letter-queues/). ## Requirements To use CSFLE in Confluent Platform with self-managed connectors, you must meet the following requirements: - An installation of Confluent Enterprise 8.0 and later with the CSFLE Add-On enabled. - Ensure Schema Registry is configured with the following properties before it starts: ```shell resource.extension.class=io.confluent.kafka.schemaregistry.rulehandler.RuleSetResourceExtension,io.confluent.dekregistry.DekRegistryResourceExtension confluent.license= confluent.license.addon.csfle= ``` #### NOTE The value for `confluent.license.addon.csfle` is the same as your main `confluent.license` key. - An external KMS to manage your Key Encryption Keys (KEKs). For more information, see [Manage KEKs](../security/protect-data/csfle/manage-keys.md#manage-keks-csfle). - The [KMS provider](/platform/current/security/protect-data/csfle/quick-start.html#step-1-configure-the-kms-provider) must be configured for the connector. - A Kafka topic to use as a data source or destination. #### IMPORTANT The Confluent CLI [confluent local](https://docs.confluent.io/confluent-cli/current/command-reference/local/index.html) commands are intended for a single-node development environment and are not suitable for a production environment. The data that are produced are transient and are intended to be temporary. For production-ready workflows, see [Install and Upgrade Confluent Platform](../installation/index.md#installation-overview). Every service will start in order, printing a message with its status: ```bash Starting KRaft Controller KRaft Controller is [UP] Starting Kafka Kafka is [UP] Starting Schema Registry Schema Registry is [UP] Starting Kafka REST Kafka REST is [UP] Starting Connect Connect is [UP] Starting ksqlDB Server ksqlDB Server is [UP] ``` #### NOTE For instructions on getting your actual cluster IDs, refer to [Cluster Identifiers in Confluent Platform](../../security/authorization/rbac/rbac-get-cluster-ids.md#rbac-get-cluster-ids). 1. Enter the following Confluent CLI command to give an example service principal named `$CONNECT_USER` the role `SecurityAdmin` on the Connect cluster. 
The example cluster IDs `$CONNECT_CLUSTER` for Connect and `$KAFKA_CLUSTER` for Kafka are used in the command and all subsequent command examples. ```none confluent iam rbac role-binding create \ --principal User:$CONNECT_USER \ --role SecurityAdmin \ --kafka-cluster $KAFKA_CLUSTER \ --connect-cluster-id $CONNECT_CLUSTER ``` 2. Enter the following command to give `$CONNECT_USER` the role `ResourceOwner` on the group that Connect workers use to coordinate with other workers. ```none confluent iam rbac role-binding create \ --principal User:$CONNECT_USER \ --role ResourceOwner \ --resource Group:$CONNECT_CLUSTER \ --kafka-cluster $KAFKA_CLUSTER ``` 3. Enter the following commands to give `$CONNECT_USER` the role `ResourceOwner` on the internal Kafka topics used by Connect to store configuration, status, and offset information. Internal configuration topic `$CONFIGS_TOPIC`, internal offsets topic `$OFFSETS_TOPIC`, and status topic `$STATUS_TOPIC` are used in the examples. #### NOTE The configuration topics `config.storage.topic`, `offset.storage.topic`, and `status.storage.topic` are where the internal configuration, offset configuration, and status configuration data are stored. These are set for Confluent Platform to `connect-configs`, `connect-offsets`, and `connect-status` by default. ```none confluent iam rbac role-binding create \ --principal User:$CONNECT_USER \ --role ResourceOwner \ --resource Topic:$CONFIGS_TOPIC \ --kafka-cluster $KAFKA_CLUSTER ``` ```none confluent iam rbac role-binding create \ --principal User:$CONNECT_USER \ --role ResourceOwner \ --resource Topic:$OFFSETS_TOPIC \ --kafka-cluster $KAFKA_CLUSTER ``` ```none confluent iam rbac role-binding create \ --principal User:$CONNECT_USER \ --role ResourceOwner \ --resource Topic:$STATUS_TOPIC \ --kafka-cluster $KAFKA_CLUSTER ``` #### IMPORTANT By default the Kafka worker uses the same settings and the same principal for reading and writing to the `_confluent-command` license topic that it uses to read and write to internal topics. For more information about Connect licensing, see [Licensing Connectors](/kafka-connectors/self-managed/license.html). If you have configured a Connect [Secret Registry](connect-rbac-secret-registry.md#connect-rbac-secret-registry), you must complete two additional steps. 1. Enter the following command to give `$CONNECT_USER` the role `ResourceOwner` on the group that secret registry nodes use to coordinate with other nodes. Secret Registry group ID `$SECRET_REGISTRY_GROUP` is used in the example. Note that the actual value of `$SECRET_REGISTRY_GROUP` needs to match the value of `config.providers.secret.param.secret.registry.group.id` in the Connect worker properties. This value defaults to `secret-registry` if not specified in the Connect worker properties. ```none confluent iam rbac role-binding create \ --principal User:$CONNECT_USER \ --role ResourceOwner \ --resource Group:$SECRET_REGISTRY_GROUP \ --kafka-cluster $KAFKA_CLUSTER ``` 2. Enter the following command to give `$CONNECT_USER` the role `ResourceOwner` on the Kafka topic used to store secrets. Kafka secrets topic `$SECRETS_TOPIC` is used in the example. Note that the actual value of `$SECRETS_TOPIC` needs to match the value of `config.providers.secret.param.kafkastore.topic` in the Connect worker properties. This value defaults to `_confluent-secrets` if not specified in the Connect worker properties. #### WARNING The default value for the secrets topic changed from `_secrets` to `_confluent-secrets` in version 5.4. 
If your Secret Registry cluster is not configured with a `kafkastore.topic` property, explicitly set it to `_secrets` before upgrading to 5.4, to avoid losing existing secrets.

```none
confluent iam rbac role-binding create \
  --principal User:$CONNECT_USER \
  --role ResourceOwner \
  --resource Topic:$SECRETS_TOPIC \
  --kafka-cluster $KAFKA_CLUSTER
```

## Connector role bindings

Use the following steps to configure role bindings for the connector: `User:$CONNECTOR_USER`.

1. Grant principal `User:$CONNECTOR_USER` the `ResourceOwner` role to `Topic:$DATA_TOPIC`.

```none
confluent iam rbac role-binding create \
  --principal User:$CONNECTOR_USER \
  --role ResourceOwner \
  --resource Topic:$DATA_TOPIC \
  --kafka-cluster $KAFKA_CLUSTER_ID
```

The following step is only required if using **Schema Registry**.

2. Grant principal `User:$CONNECTOR_USER` the `ResourceOwner` role to `Subject:$(DATA_TOPIC)-value`.

```none
confluent iam rbac role-binding create \
  --principal User:$CONNECTOR_USER \
  --role ResourceOwner \
  --resource Subject:$(DATA_TOPIC)-value \
  --kafka-cluster $KAFKA_CLUSTER_ID \
  --schema-registry-cluster $SCHEMA_REGISTRY_CLUSTER_ID
```

The following step is only required for **Sink** connectors.

3. Grant principal `User:$CONNECTOR_USER` the `DeveloperRead` role to the consumer group `Group:$connect-`.

```none
confluent iam rbac role-binding create \
  --principal User:$CONNECTOR_USER \
  --role DeveloperRead \
  --resource Group:$connect- \
  --prefix \
  --kafka-cluster $KAFKA_CLUSTER_ID
```

4. List the role bindings for the principal `User:$CONNECTOR_USER` to the Connect cluster.

```none
confluent iam rbac role-binding list \
  --principal User:$CONNECTOR_USER \
  --kafka-cluster $KAFKA_CLUSTER_ID \
  --connect-cluster-id $CONNECT_CLUSTER_ID
```

#### IMPORTANT

- The Kafka Connect framework does not allow you to unset or set `null` for producer or consumer configuration properties. Instead, try to set the default callback handler at the connector level using the following configuration property:

```properties
producer.override.sasl.login.callback.handler.class=org.apache.kafka.common.security.authenticator.AbstractLogin$DefaultLoginCallbackHandler
```

- For source connectors whose destination clusters use the SCRAM SASL mechanism, the default callback handler should not be set at the connector level. Instead, set the producer configurations on the Kafka Connect framework. In such cases, change the distributed worker to point to the appropriate producer settings, for example:

```bash
producer.bootstrap.servers=x:9096,y:9096,z:9096
producer.retry.backoff.ms=500
producer.security.protocol=SASL_SSL
producer.sasl.mechanism=SCRAM-SHA-512
producer.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="######" password="#####";
producer.ssl.truststore.location=/var/ssl/private/kafka_connect.truststore.jks
producer.ssl.truststore.password=${securepass:/var/ssl/private/kafka-connect-security.properties:connect-distributed.properties/producer.ssl.truststore.password}
```

To enable per-connector configuration properties and override the default worker properties, add the following `connector.client.config.override.policy` configuration parameter to the worker properties file.

`connector.client.config.override.policy`
: The class name or alias implementation of `ConnectorClientConfigOverridePolicy`. This defines configurations that can be overridden by the connector. The default implementation is `All`. Other possible policies are `None` and `Principal`.
* Type: string
* Default: All
* Valid Values: [All, None, Principal]
* Importance: medium

When `connector.client.config.override.policy=All`, each connector that belongs to the worker is allowed to override the worker configuration. This is implemented by adding one of the following override prefixes to the source and sink connector configurations:

* `producer.override.`
* `consumer.override.`

```properties
# Authentication settings for Connect workers
ssl.keystore.location=/var/private/ssl/kafka.worker.keystore.jks
ssl.keystore.password=worker1234
ssl.key.password=worker1234
```

Connect workers manage the producers used by source connectors and the consumers used by sink connectors. So, for the connectors to leverage security, you also have to override the default producer/consumer configuration that the worker uses.

### Reporter and Kerberos security

The following configuration example shows a sink connector with all the necessary configuration properties for Reporter and Kerberos security. This example shows the [Prometheus Metrics Sink Connector for Confluent Platform](https://docs.confluent.io/kafka-connectors/prometheus-metrics/current/index.html), but can be modified for any applicable sink connector.

```json
{
  "name" : "prometheus-connector",
  "config" : {
    "topics":"prediction-metrics",
    "connector.class" : "io.confluent.connect.prometheus.PrometheusMetricsSinkConnector",
    "tasks.max" : "1",
    "confluent.topic.bootstrap.servers":"localhost:9092",
    "confluent.topic.ssl.truststore.location":"/etc/pki/hadoop/kafkalab.jks",
    "confluent.topic.ssl.truststore.password":"xxxx",
    "confluent.topic.ssl.keystore.location":"/etc/pki/hadoop/kafkalab.jks",
    "confluent.topic.ssl.keystore.password":"xxxx",
    "confluent.topic.ssl.key.password":"xxxx",
    "confluent.topic.security.protocol":"SASL_SSL",
    "confluent.topic.replication.factor": "3",
    "confluent.topic.sasl.kerberos.service.name":"kafka",
    "confluent.topic.sasl.jaas.config":"com.sun.security.auth.module.Krb5LoginModule required \nuseKeyTab=true \nstoreKey=true \nkeyTab=\"/etc/security/keytabs/svc.kfkconnect.lab.keytab\" \nprincipal=\"svc.kfkconnect.lab@DS.DTVENG.NET\";",
    "prometheus.scrape.url": "http://localhost:8889/metrics",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "behavior.on.error": "LOG",
    "reporter.result.topic.replication.factor": "3",
    "reporter.error.topic.replication.factor": "3",
    "reporter.bootstrap.servers":"localhost:9092",
    "reporter.producer.ssl.truststore.location":"/etc/pki/hadoop/kafkalab.jks",
    "reporter.producer.ssl.truststore.password":"xxxx",
    "reporter.producer.ssl.keystore.location":"/etc/pki/hadoop/kafkalab.jks",
    "reporter.producer.ssl.keystore.password":"xxxx",
    "reporter.producer.ssl.key.password":"xxxx",
    "reporter.producer.security.protocol":"SASL_SSL",
    "reporter.producer.sasl.kerberos.service.name":"kafka",
    "reporter.producer.sasl.jaas.config":"com.sun.security.auth.module.Krb5LoginModule required \nuseKeyTab=true \nstoreKey=true \nkeyTab=\"/etc/security/keytabs/svc.kfkconnect.lab.keytab\" \nprincipal=\"svc.kfkconnect.lab@DS.DTVENG.NET\";",
    "reporter.admin.ssl.truststore.location":"/etc/pki/hadoop/kafkalab.jks",
    "reporter.admin.ssl.truststore.password":"xxxx",
    "reporter.admin.ssl.keystore.location":"/etc/pki/hadoop/kafkalab.jks",
    "reporter.admin.ssl.keystore.password":"xxxx",
    "reporter.admin.ssl.key.password":"xxxx",
    "reporter.admin.security.protocol":"SASL_SSL",
    "reporter.admin.sasl.kerberos.service.name":"kafka",
"reporter.admin.sasl.jaas.config":"com.sun.security.auth.module.Krb5LoginModule required \nuseKeyTab=true \nstoreKey=true \nkeyTab=\"/etc/security/keytabs/svc.kfkconnect.lab.keytab\" \nprincipal=\"svc.kfkconnect.lab@DS.DTVENG.NET\";", "confluent.license":"eyJ0eXAiOiJK ...omitted" } ``` ## Step 1: Download and start Confluent Platform In this step, you start by cloning a GitHub repository. This repository contains a Docker compose file and some required configuration files. The `docker-compose.yml` file sets ports and Docker environment variables such as the replication factor and listener properties for Confluent Platform and its components. To learn more about the settings in this file, see [Docker Image Configuration Reference for Confluent Platform](../installation/docker/config-reference.md#config-reference). 1. Clone the [Confluent Platform all-in-one example repository](https://github.com/confluentinc/cp-all-in-one/tree/latest/cp-all-in-one), for example: ```bash git clone https://github.com/confluentinc/cp-all-in-one.git ``` 2. Change to the cloned repository’s root directory: ```bash cd cp-all-in-one ``` 3. The default branch may not be the latest. Check out the branch for the version you want to run, for example, 8.1.0-post: ```bash git checkout 8.1.0-post ``` 4. The `docker-compose.yml` file is located in a nested directory. Navigate into the following directory: ```bash cd cp-all-in-one ``` 5. Start the Confluent Platform stack with the `-d` option to run in detached mode: ```bash docker compose up -d ``` #### NOTE If you using an Docker Compose V1, you need to use a dash in the `docker compose` commands. For example: ```bash docker-compose up -d ``` To learn more, see [Migrate to Compose V2](https://docs.docker.com/compose/releases/migrate/). Each Confluent Platform component starts in a separate container. Your output should resemble the following. Your output may vary slightly from these examples depending on your operating system. ```bash ✔ Network cp-all-in-one_default Created 0.0s ✔ Container flink-jobmanager Started 0.5s ✔ Container broker Started 0.5s ✔ Container prometheus Started 0.5s ✔ Container flink-taskmanager Started 0.5s ✔ Container flink-sql-client Started 0.5s ✔ Container alertmanager Started 0.5s ✔ Container schema-registry Started 0.5s ✔ Container connect Started 0.6s ✔ Container rest-proxy Started 0.6s ✔ Container ksqldb-server Started 0.6s ✔ Container control-center Started 0.7s ``` 6. 
Verify that the services are up and running: ```bash docker compose ps ``` Your output should resemble: ```none NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS alertmanager confluentinc/cp-enterprise-alertmanager:2.2.1 "alertmanager-start" alertmanager 8 minutes ago Up 8 minutes 0.0.0.0:9093->9093/tcp, [::]:9093->9093/tcp broker confluentinc/cp-server:8.1.0 "/etc/confluent/dock…" broker 8 minutes ago Up 8 minutes 0.0.0.0:9092->9092/tcp, [::]:9092->9092/tcp, 0.0.0.0:9101->9101/tcp, [::]:9101->9101/tcp connect cnfldemos/cp-server-connect-datagen:0.6.4-7.6.0 "/etc/confluent/dock…" connect 8 minutes ago Up 8 minutes 0.0.0.0:8083->8083/tcp, [::]:8083->8083/tcp control-center confluentinc/cp-enterprise-control-center-next-gen:2.2.1 "/etc/confluent/dock…" control-center 8 minutes ago Up 8 minutes 0.0.0.0:9021->9021/tcp, [::]:9021->9021/tcp flink-jobmanager cnfldemos/flink-kafka:1.19.1-scala_2.12-java17 "/docker-entrypoint.…" flink-jobmanager 8 minutes ago Up 8 minutes 0.0.0.0:9081->9081/tcp, [::]:9081->9081/tcp flink-sql-client cnfldemos/flink-sql-client-kafka:1.19.1-scala_2.12-java17 "/docker-entrypoint.…" flink-sql-client 8 minutes ago Up 8 minutes 6123/tcp, 8081/tcp flink-taskmanager cnfldemos/flink-kafka:1.19.1-scala_2.12-java17 "/docker-entrypoint.…" flink-taskmanager 8 minutes ago Up 8 minutes 6123/tcp, 8081/tcp ksqldb-server confluentinc/cp-ksqldb-server:8.1.0 "/etc/confluent/dock…" ksqldb-server 8 minutes ago Up 8 minutes 0.0.0.0:8088->8088/tcp, [::]:8088->8088/tcp prometheus confluentinc/cp-enterprise-prometheus:2.2.1 "prometheus-start" prometheus 8 minutes ago Up 8 minutes 0.0.0.0:9090->9090/tcp, [::]:9090->9090/tcp rest-proxy confluentinc/cp-kafka-rest:8.1.0 "/etc/confluent/dock…" rest-proxy 8 minutes ago Up 8 minutes 0.0.0.0:8082->8082/tcp, [::]:8082->8082/tcp schema-registry confluentinc/cp-schema-registry:8.1.0 "/etc/confluent/dock…" schema-registry 8 minutes ago Up 8 minutes 0.0.0.0:8081->8081/tcp, [::]:8081->8081/tcp ``` After a few minutes, if the state of any component isn’t **Up**, run the `docker compose up -d` command again, or try `docker compose restart `, for example: ```bash docker compose restart control-center ``` ### Kafka For Kafka in KRaft mode, you must configure a node to be a broker or a controller. In addition, you must create a unique cluster ID and format the log directories with that ID. Typically in a production environment, you should have a minimum of three brokers and three controllers. * Navigate to the KRaft configuration files located in the `/etc/kafka/` directory. In this directory, you will find three sample property files for different node roles: - `broker.properties`: Use this file to configure a broker node. - `controller.properties`: Use this file to configure a controller node. - `server.properties`: Use this file to configure a node that runs in combined mode as both a broker and a controller. This mode is not supported for production environments. Choose the appropriate properties file for the node’s role in your KRaft cluster and then customize the settings in that file. * Configure the `process.roles`, `node.id` and `controller.quorum.voters` for each node. - For `process.roles`, set whether the node will be a `broker` or a `controller`. `combined` mode, meaning `process.roles` is set to `broker,controller`, is currently not supported and should only be used for experimentation. - Set a system-wide unique ID for the `node.id` for each broker/controller. 
- `controller.quorum.voters` should be a comma-separated list of controllers in the format `nodeID@hostname:port`

```bash
############################# Server Basics #############################

# The role of this server. Setting this puts us in KRaft mode
process.roles=broker

# The node id associated with this instance's roles
node.id=2

# The connect string for the controller quorum
controller.quorum.voters=1@controller1:9093,3@controller3:9093,5@controller5:9093
```

* Configure how brokers and clients communicate with the broker using `listeners`, and where controllers listen with `controller.listener.names`.

- `listeners`: Comma-separated list of URIs and listener names to listen on in the format `listener_name://host_name:port`
- `controller.listener.names`: Comma-separated list of `listener_name` entries for listeners used by the controller.

For more information, see [KRaft Configuration for Confluent Platform](../../kafka-metadata/config-kraft.md#configure-kraft).

* Configure security for your environment.

- For general security guidance, see [KRaft Security in Confluent Platform](../../security/component/kraft-security.md#kraft-security).
- For role-based access control (RBAC), see [Configure Metadata Service (MDS) in Confluent Platform](../../kafka/configure-mds/index.md#rbac-mds-config).
- For configuring SASL/SCRAM for broker-to-broker communication, see [KRaft-based Confluent Platform clusters](../../security/authentication/sasl/scram/overview.md#sasl-scram-kraft-based-clusters).

```properties
# List of Kafka brokers to connect to, e.g. PLAINTEXT://hostname:9092,SSL://hostname2:9092
kafkastore.bootstrap.servers=PLAINTEXT://hostname:9092,SSL://hostname2:9092
```

This configuration is for a three-node, multi-node cluster. For more information, see [Deploy Schema Registry in Production on Confluent Platform](../../schema-registry/installation/deployment.md#schema-registry-prod).

### Configure the LDAP identity provider

This configuration shows the LDAP context to identify LDAP users and groups to the MDS. The baseline LDAP configuration procedure for MDS is shown, followed by detailed descriptions of the essential configuration options.

1. Ensure you have this information available before you begin.

- The hostname (LDAP server URL, for example, `LDAPSERVER.EXAMPLE.COM`), port (for example, `389`), and any other security mechanisms (such as TLS)
- The full DN (distinguished name) of LDAP users
- If you have a complex LDAP directory tree, consider developing search filters for your configuration. These filters help MDS to trim LDAP search results.

2. Edit your Kafka properties file (`/etc/kafka/server.properties`).

3. Add the following baseline configuration for your identity provider (LDAP).

```properties
############################# Identity Provider Settings (LDAP) #############################

# Search groups for group-based authorization.
ldap.group.name.attribute=
ldap.group.object.class=group
ldap.group.member.attribute=member
ldap.group.member.attribute.pattern=CN=(.*),DC=rbac,DC=confluent,DC=io
ldap.group.search.base=CN=Users,DC=rbac,DC=confluent,DC=io

# Limit the scope of searches to subtrees off of base
ldap.user.search.scope=2

# Enable filters to limit search to only those groups needed
ldap.group.search.filter=(|(CN=)(CN=))

# Kafka authenticates to the directory service with the bind user.
ldap.java.naming.provider.url=ldap://:389
ldap.java.naming.security.authentication=simple
ldap.java.naming.security.credentials=
ldap.java.naming.security.principal=

# Locate users. Make sure that these attributes and object classes match what is in your directory service.
ldap.user.name.attribute=
ldap.user.object.class=user
ldap.user.search.base=
```

4. Adjust the configuration details for your environment, particularly the content in brackets (`<>`). Pay special attention to the following as you work:

* Nested LDAP groups are not supported.
* If you enable LDAP authentication for Kafka clients by adding [the LDAP callback handler](../../security/authentication/ldap/client-authentication-ldap.md#client-auth-with-ldap) (not shown in this configuration):
  - Specify `ldap.user.password.attribute` only if your LDAP server does not support simple bind.
  - If you define this property (`io.confluent.security.auth.provider.ldap.LdapAuthenticateCallbackHandler`), LDAP will perform the user search and return the password back to Kafka, and Kafka will perform the password check.
  - The LDAP server will return the user’s hashed password, so Kafka cannot authenticate the user unless the user’s properties file also uses the hashed password.

5. Save and close the property file.

6. After configuring LDAP, but before configuring MDS, connect to and query your LDAP server to verify your LDAP connection information. It is recommended that you use an LDAP tool to do this (for example, JXplorer).

7. When *all* sections of the MDS configuration are complete and your LDAP connection is verified, [Start Confluent Platform](../../installation/overview.md#installation).

The following sections provide details about the baseline LDAP configuration options for user and group-based authorization. For more details about LDAP configuration, see [Configure LDAP Group-Based Authorization for MDS](ldap-auth-config.md#ldap-auth-config) and [Configure LDAP Authentication](ldap-auth-mds.md#ldap-auth-mds).

### MDS REST client configurations

If a component client (such as Schema Registry, ksqlDB, Confluent Control Center, or Connect) configured to communicate with MDS includes an incorrect username or password, it can result in an endless loop of authentication attempts, which can flood your REST client exception log and degrade performance. For example:

```none
[2021-01-25 05:11:58,330] ERROR [pool-17-thread-1] Error while refreshing active metadata server urls, retrying (io.confluent.security.auth.client.rest.RestClient)
io.confluent.security.auth.client.rest.exceptions.RestClientException: Unauthorized; error code: 401
        at io.confluent.security.auth.client.rest.RestClient$HTTPRequestSender.lambda$submit$0(RestClient.java:353)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
```

To control the number of authentication retry attempts, include the following options in your REST client MDS configuration:

- `confluent.metadata.server.urls.max.retries`
- `confluent.metadata.server.urls.fail.on.401`

For details about these configuration options, refer to [REST client configurations](mds-configuration.md#rest-client-mds-config).

### Handling large message sizes

We strongly recommend that you adhere to the default maximum size of 1 MB for messages. When it is absolutely necessary to increase the maximum message size, the following are a few of the many implications you should consider.
Also consider alternative options such as using compression and/or splitting up messages.

Heap fragmentation
: Consistently large messages likely cause heap fragmentation on the broker side, requiring significant JVM tuning to maintain consistent performance.

Dirty page cache
: Accessing messages that are no longer available in the page cache is slow. With larger messages, fewer messages can fit in the page cache, causing degraded performance.

Kafka client buffer sizes
: Default buffer sizes on the client side are tuned for small messages (<1 MB). You will have to tune client-side buffers on both the producer and consumer to properly handle the messages. See [max.message.bytes](/platform/current/installation/configuration/topic-configs.html#max-message-bytes).

To configure Kafka to handle larger messages, set the following configuration parameters at the level you need: Producer, Consumer, and Topic. If all topics need this configuration, set it in the Broker configuration, but this is not recommended for the reasons listed above.

| Scope    | Config Parameter                               | Notes                                                                    |
|----------|------------------------------------------------|--------------------------------------------------------------------------|
| Topic    | `max.message.bytes`                            | Recommended to set the maximum message size at the topic level.          |
| Broker   | `message.max.bytes`                            | Setting the maximum message size at the broker level is not recommended. |
| Producer | `max.request.size`                             | Required for the producer-level change of the maximum message size.      |
| Producer | `batch.size`, `buffer.memory`                  | Use these parameters for performance tuning.                             |
| Consumer | `fetch.max.bytes`, `max.partition.fetch.bytes` | Use these to set the maximum message size at the consumer level.         |

For example, if you want to be able to handle 2 MB messages, you need to configure as below.

Topic configuration:

```none
max.message.bytes=2097152
```

Producer configuration:

```none
max.request.size=2097152
```

### License Client Configuration

A Kafka client is used to check the license topic for compliance. Review the following information about how to configure this license client when using principal propagation.

Configure license client authentication
: When using principal propagation, client license authentication is inherited from the inter-broker listeners.

Configure license client authorization
: When using principal propagation and RBAC or ACLs, you must configure client authorization for the license topic.

#### NOTE

The `_confluent-command` internal topic is available as the preferred alternative to the `_confluent-license` topic for components such as Schema Registry, REST Proxy, and Confluent Server (which were previously using `_confluent-license`). Both topics will be supported going forward. Here are some guidelines:

- New deployments (Confluent Platform 6.2.1 and later) will default to using `_confluent-command` as shown below.
- Existing clusters will continue using the `_confluent-license` topic unless manually changed.
- Newly created clusters on Confluent Platform 6.2.1 and later will default to creating the `_confluent-command` topic, and only existing clusters that already have a `_confluent-license` topic will continue to use it.

- **RBAC authorization** Run this command to add `ResourceOwner` for the component user for the Confluent license topic resource (default name is `_confluent-command`).
```none confluent iam rbac role-binding create \ --role ResourceOwner \ --principal User: \ --resource Topic:_confluent-command \ --kafka-cluster ``` - **ACL authorization** Run this command to configure Kafka authorization, where bootstrap server, client configuration, service account ID is specified. This grants create, read, and write on the `_confluent-command` topic. ```none kafka-acls --bootstrap-server --command-config \ --add --allow-principal User: --operation Create --operation Read --operation Write \ --topic _confluent-command ``` ## High Availability for pull queries ksqlDB supports [pull queries](../developer-guide/ksqldb-reference/select-pull-query.md#ksqldb-reference-select-pull-query), which you use to query materialized state that is stored while executing a [persistent query](../concepts/queries.md#ksqldb-concepts-queries-persistent). This works without issue when all nodes in your ksqlDB cluster are operating correctly, but what happens when a node storing that state goes down? First, you must start multiple nodes and make sure inter-node communication is configured so that query forwarding works correctly: ```properties listeners=http://0.0.0.0:8088 ksql.advertised.listener=http://host1.example.com:8088 ``` The `ksql.advertised.listener` configuration specifies the URL that is propagated to other nodes for inter-node requests, so it must be reachable from other hosts/pods in the cluster. Inter-node requests are critical in a multi-node cluster. For more information, see [configuring listeners of a ksqlDB cluster](installation/server-config.md#ksqldb-install-configure-server-configuring-listeners). While waiting for a failed node to restart is one possibility, this approach may incur more downtime than you want, and it may not be possible if there is a more serious failure. The other possibility is to have replicas of the data, ready to go when they’re needed. Fortunately, Kafka Streams provides a mechanism to do this: ```properties ksql.streams.num.standby.replicas=1 ksql.query.pull.enable.standby.reads=true ``` This first configuration tells Kafka Streams to use a separate task that operates independently of the active (writer) state store to build up a replica of the state. The second config indicates that reading is allowed from the replicas (or *standbys*) if reading fails from the active store. This approach is sufficient to enable high availability for pull queries in ksqlDB, but it requires that every request must try the active first. A better approach is to use a *heartbeating* mechanism to detect failed nodes preemptively, before a pull query arrives, so the request can forward straight to a replica. Set the following configs to detect failed nodes preemptively. ```properties ksql.heartbeat.enable=true ksql.lag.reporting.enable=true ``` The first configuration enables heartbeating, which should improve the speed of request handling significantly during failures, as described above. The second config allows for lag data of each of the standbys to be collected and sent to the other nodes to make routing decisions. In this case, the lag is defined by how many messages behind the active a given standby is. If ensuring freshness is a priority, you can provide a threshold in a pull query request to avoid the largest outliers: ```sql SET 'ksql.query.pull.max.allowed.offset.lag'='100'; SELECT * FROM QUERYABLE_TABLE WHERE ID = 456; ``` This configuration causes the request to consider only standbys that are within 100 messages of the active host. 
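Taken together, the settings discussed in this section would appear in the ksqlDB server properties along the following lines. This is a consolidated sketch of the options shown above; the host name and values are illustrative:

```properties
# Inter-node communication: advertise an address that other nodes can reach
listeners=http://0.0.0.0:8088
ksql.advertised.listener=http://host1.example.com:8088

# Build standby replicas of state stores and allow pull queries to read from them
ksql.streams.num.standby.replicas=1
ksql.query.pull.enable.standby.reads=true

# Detect failed nodes preemptively and report standby lag for routing decisions
ksql.heartbeat.enable=true
ksql.lag.reporting.enable=true
```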
With these configurations, you can introduce as much redundancy as you require and ensure that your pull queries succeed with controlled lag and low latency.

### Mount Volumes

Various features (plugins, UDFs, embedded connectors) may require that you mount volumes into the Docker container. To do this, follow [the official docker documentation](https://docs.docker.com/storage/volumes/). As an example using `docker-compose`, you can mount a UDF directory and use it like this:

```yaml
ksqldb-server:
  image: confluentinc/cp-ksqldb-server:8.1.0
  hostname: ksqldb-server
  container_name: ksqldb-server
  depends_on:
    - broker
    - schema-registry
  ports:
    - "8088:8088"
  volumes:
    - "./extensions/:/opt/ksqldb-udfs"
  environment:
    KSQL_LISTENERS: "http://0.0.0.0:8088"
    KSQL_BOOTSTRAP_SERVERS: "broker:9092"
    KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
    KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
    KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
    # Configuration for UDFs
    KSQL_KSQL_EXTENSION_DIR: "/opt/ksqldb-udfs"
```

#### ksqlDB Quickstart stack

Download the `docker-compose.yml` file from the **Include Kafka** tab of the [ksqlDB Quick Start](../../quickstart.md#ksqldb-quick-start). This `docker-compose.yml` file defines a stack with these features:

- Starts one ksqlDB Server instance.
- Does not start Schema Registry, so Avro and Protobuf schemas aren't available.
- Starts the ksqlDB CLI container automatically.

Use the following command to start the ksqlDB CLI in the running `ksqldb-cli` container.

```bash
docker exec -it ksqldb-cli ksql http://ksqldb-server:8088
```

#### Interactive ksqlDB clusters pre Kafka 2.0

[Interactive ksqlDB clusters](../how-it-works.md#ksqldb-architecture-interactive-deployment) (the default configuration) require that the authenticated ksqlDB user has open access to create, read, write, and delete topics, and to use any consumer group. Specifically, interactive ksqlDB clusters require these ACLs:

- The `DESCRIBE_CONFIGS` operation on the `CLUSTER` resource type.
- The `CREATE` operation on the `CLUSTER` resource type.
- The `DESCRIBE`, `READ`, `WRITE` and `DELETE` operations on all `TOPIC` resource types.
- The `DESCRIBE` and `READ` operations on all `GROUP` resource types.

It's still possible to restrict the authenticated ksqlDB user from accessing specific resources using `DENY` ACLs. For example, you can add a `DENY` ACL to stop SQL queries from accessing a topic that contains sensitive data.

For example, given the following setup:

- A 3-node ksqlDB cluster with ksqlDB servers running on IPs 198.51.100.0, 198.51.100.1, 198.51.100.2
- Authenticating with the Kafka cluster as a 'KSQL1' user.

You would then use the `kafka-acls` tool to grant the `KSQL1` principal the ACLs listed above so that ksqlDB can operate against the Kafka cluster.

# Manage Metadata Schemas in ksqlDB for Confluent Platform

Use the `ksql-migrations` tool to manage metadata schemas for your ksqlDB clusters by applying statements from *migration files* to your ksqlDB clusters. This enables you to keep your SQL statements for creating streams, tables, and queries in version control and manage the versions of your ksqlDB clusters based on the migration files that have been applied.

```none
usage: ksql-migrations [ {-c | --config-file} ] [ ]

Commands are:
    apply      Migrates the metadata schema to a new schema version.
    create     Creates a blank migration file with the specified description, which can then be populated with ksqlDB statements and applied as the next schema version.
destroy-metadata Destroys all ksqlDB server resources related to migrations, including the migrations metadata stream and table and their underlying Kafka topics. WARNING: this is not reversible! help Display help information info Displays information about the current and available migrations. initialize-metadata Initializes the migrations schema metadata (ksqlDB stream and table) on the ksqlDB server. new-project Creates a new migrations project directory structure and config file. validate Validates applied migrations against local files. See 'ksql-migrations help ' for more information on a specific command. ``` The `ksql-migrations` tool supports migrations files containing the following types of ksqlDB statements: - `CREATE STREAM` - `CREATE TABLE` - `CREATE STREAM ... AS SELECT` - `CREATE TABLE ... AS SELECT` - `CREATE OR REPLACE` - `INSERT INTO ... AS SELECT` - `PAUSE ` - `RESUME ` - `TERMINATE ` - `DROP STREAM` - `DROP TABLE` - `ALTER STREAM` - `ALTER TABLE` - `INSERT INTO ... VALUES` - `CREATE CONNECTOR` - `DROP CONNECTOR` - `CREATE TYPE` - `DROP TYPE` - `SET ` - `UNSET ` - `DEFINE ` - available if both `ksql-migrations` and the server are version 0.18 or newer. - `UNDEFINE ` - available if both `ksql-migrations` and the server are version 0.18 or newer. - `ASSERT SCHEMA` - available if both `ksql-migrations` and the server are version 0.27 or newer. - `ASSERT TOPIC` - available if both `ksql-migrations` and the server are version 0.27 or newer. Any properties or variables set using the `SET`, `UNSET`, `DEFINE` and `UNDEFINE` are applied in the current migration file only. They do not carry over to the next migration file, even if multiple migration files are applied as part of the same `ksql-migrations apply` command #### NOTE In the following examples, the `AVRO` schema string in Schema Registry is a single-line raw string without newline characters (`\n`). The strings are shown as human-readable text for convenience. For example, the following a physical schema is in `AVRO` format and is registered with Schema Registry with ID 1: ```json { "schema": { "type": "record", "name": "PageViewValueSchema", "namespace": "io.confluent.ksql.avro_schemas", "fields": [ { "name": "page_name", "type": "string", "default": "abc" }, { "name": "ts", "type": "int", "default": 123 } ] } } ``` The following `CREATE` statement defines a stream on the `pageviews` topic and specifies the physical schema that has an ID of `1`. ```sql CREATE STREAM pageviews (pageId INT KEY) WITH ( KAFKA_TOPIC='pageviews-avro-topic', KEY_FORMAT='KAFKA', VALUE_FORMAT='AVRO',VALUE_SCHEMA_ID=1,PARTITIONS=1 ); ``` The following output from the `describe pageviews` command shows the inferred logical schema for the `pageviews` stream: ```sql DESCRIBE pageviews; Name : PAGEVIEWS Field | Type PAGEID | INTEGER (key) page_name | VARCHAR(STRING) ts | INTEGER ``` If `WRAP_SINGLE_VALUE` is `false` in the statement, and if `KEY_SCHEMA_ID` is set, `ROWKEY` is used as the key’s column name. If `VALUE_SCHEMA_ID` is set, `ROWVAL` is used as the value’s column name. The physical schema is used as the column data type. For example, the following physical schema is `AVRO` and is defined in Schema Registry with ID `2`: ```json {"schema": "int"} ``` The following `CREATE` statement defines a table on the `pageview-count` topic and specifies the physical schema that has ID `2`. 
```sql hl_lines="7"
CREATE TABLE pageview_count (
    pageId INT PRIMARY KEY
  ) WITH (
    KAFKA_TOPIC='pageview-count',
    KEY_FORMAT='KAFKA',
    VALUE_FORMAT='AVRO',
    VALUE_SCHEMA_ID=2,
    WRAP_SINGLE_VALUE=false,
    PARTITIONS=1
  );
```

The inferred logical schema for the `pageview_count` table is:

```none
Name : PAGEVIEW_COUNT

Field  | Type
PAGEID | INTEGER (primary key)
ROWVAL | INTEGER
```

For more information about `WRAP_SINGLE_VALUE`, see [Single field unwrapping](../reference/serialization.md#ksqldb-serialization-formats-single-field-unwrapping).

### Data Serialization

When a schema ID is provided, and schema inference is successful, ksqlDB can create the data source. When writing to the data source, the physical schema inferred by the schema ID is used to serialize data, instead of the logical schema that's used in other cases. Because ksqlDB's logical schema accepts `null` values but the physical schema may not, serialization can fail even if the inserted value is valid for the logical schema.

The following example shows a physical schema that's defined in Schema Registry with ID `1`. No default values are specified for the `page_name` and `ts` fields.

```json hl_lines="8-9 12-13"
{
  "schema": {
    "type": "record",
    "name": "PageViewValueSchema",
    "namespace": "io.confluent.ksql.avro_schemas",
    "fields": [
      {
        "name": "page_name",
        "type": "string"
      },
      {
        "name": "ts",
        "type": "int"
      }
    ]
  }
}
```

The following example creates a stream with schema ID `1`:

```sql
CREATE STREAM pageviews (
    pageId INT KEY
  ) WITH (
    KAFKA_TOPIC='pageviews-avro-topic',
    KEY_FORMAT='KAFKA',
    VALUE_FORMAT='AVRO',
    VALUE_SCHEMA_ID=1,
    PARTITIONS=1
  );
```

ksqlDB infers the following schema for `pageviews`:

```none
Name : PAGEVIEWS

Field     | Type
PAGEID    | INTEGER (key)
page_name | VARCHAR(STRING)
ts        | INTEGER
```

If you insert values into `pageviews` with `null` values, ksqlDB returns an error:

```sql
INSERT INTO pageviews VALUES (1, null, null);
```

```none
Failed to insert values into 'PAGEVIEWS'. Could not serialize value: [ null | null ].
Error serializing message to topic: pageviews-avro-topic1.
Invalid value: null used for required field: "page_name", schema type: STRING
```

This error occurs because `page_name` and `ts` are required fields without default values in the specified physical schema.

## Step 4: Create a stream

You're ready to create a [stream](concepts/streams.md#ksqldb-concepts-streams). A stream associates a schema with an underlying Kafka topic. You use the [CREATE STREAM](developer-guide/ksqldb-reference/create-stream.md#ksqldb-reference-create-stream) statement to register a stream on a topic. If the topic doesn't exist yet, ksqlDB creates it on the Kafka broker.

In the ksqlDB CLI, copy the following SQL and press Enter to run the statement.

```sql
CREATE STREAM riderLocations (profileId VARCHAR, latitude DOUBLE, longitude DOUBLE)
  WITH (kafka_topic='locations', value_format='json', partitions=1);
```

Your output should resemble:

```none
Message
Stream created
```

Here's what each parameter in the CREATE STREAM statement does:

- `kafka_topic`: Name of the Kafka topic underlying the stream. In this case, it's created automatically, because it doesn't exist yet, but you can create streams over topics that exist already.
- `value_format`: Encoding of the messages stored in the Kafka topic.
For JSON encoding, each row is stored as a JSON object whose keys and values are column names and values, for example: ```json {"profileId": "c2309eec", "latitude": 37.7877, "longitude": -122.4205} ``` - `partitions`: Number of partitions to create for the `locations` topic. This parameter is not needed for topics that exist already. ## Streams A stream is a partitioned, immutable, append-only collection that represents a series of historical facts. For example, the rows of a stream could model a sequence of financial transactions, like “Alice sent $100 to Bob”, followed by “Charlie sent $50 to Bob”. Once a row is inserted into a stream, it can never change. New rows can be appended at the end of the stream, but existing rows can never be updated or deleted. Each row is stored in a particular partition. Every row, implicitly or explicitly, has a key that represents its identity. All rows with the same key reside in the same partition. To create a stream, use the `CREATE STREAM` command. The following example statement specifies a name for the new stream, the names of the columns, and the data type of each column. ```sql CREATE STREAM s1 ( k VARCHAR KEY, v1 INT, v2 VARCHAR ) WITH ( kafka_topic = 's1', partitions = 3, value_format = 'json' ); ``` This creates a new stream named `s1` with three columns: `k`, `v1`, and `v2`. The column `k` is designated as the key of this stream, which controls the partition that each row is stored in. When the data is stored, the value portion of each row’s underlying Kafka record is serialized in the JSON format. Under the hood, each stream corresponds to a [Kafka topic](../../concepts/apache-kafka-primer.md#ksqldb-apache-kafka-primer-topics) with a registered schema. If the backing topic for a stream doesn’t exist when you declare it, ksqlDB creates it on your behalf, as shown in the previous example statement. You can also declare a stream on top of an existing topic. When you do that, ksqlDB simply registers its associated schema. If topic `s2` already exists, the following statement register a new stream over it: ```sql CREATE STREAM s2 ( k1 VARCHAR KEY, v1 VARCHAR ) WITH ( kafka_topic = 's2', value_format = 'json' ); ``` # Create Clickstream Data Analysis Pipeline Using ksqlDB in Confluent Platform This example shows how you can use ksqlDB to process a stream of click data, aggregate and filter it, and join to information about the users. Visualisation of the results is provided by Grafana, on top of data streamed to Elasticsearch. These steps will guide you through how to setup your environment and run the clickstream analysis tutorial from a Docker container. ![image](ksqldb/images/clickstream_demo_flow.png) Prerequisites: : - Docker - Docker version 1.11 or later is [installed and running](https://docs.docker.com/engine/installation/). - Docker Compose is [installed](https://docs.docker.com/compose/install/). Docker Compose is installed by default with Docker for Mac. - Docker memory is allocated minimally at 6 GB. When using Docker Desktop for Mac, the default Docker memory allocation is 2 GB. You can change the default allocation to 6 GB in Docker. Navigate to **Preferences** > **Resources** > **Advanced**. - Internet connectivity - [Operating System](../../installation/versions-interoperability.md#operating-systems) currently supported by Confluent Platform - Networking and Kafka on Docker - Configure your hosts and ports to allow both internal and external components to the Docker network to communicate. 
- (Optional) [curl](https://curl.se/).
  - In the steps below, you will download a Docker Compose file. You can download this file any way you like, but the instructions below provide the explicit curl command you can use to download the file.
- [jq](https://stedolan.github.io/jq/) version 1.6 or later
- If you are using Linux as your host, for the Elasticsearch container to start successfully you must first run:

  ```bash
  sudo sysctl -w vm.max_map_count=262144
  ```

## Create the Clickstream Data

Once you've confirmed all the Docker containers are running, create the source connectors that generate mock data. This demo leverages the embedded Connect worker in ksqlDB.

1. Launch the ksqlDB CLI:

   ```bash
   docker-compose exec ksqldb-cli ksql http://ksqldb-server:8088
   ```

2. Ensure the ksqlDB server is ready to receive requests by running the following until it succeeds:

   ```sql
   show topics;
   ```

   The output should look similar to:

   ```none
   Kafka Topic | Partitions | Partition Replicas
   ```

3. Run the script [create-connectors.sql](https://github.com/confluentinc/examples/tree/latest/clickstream/ksql/ksql-clickstream-demo/demo/create-connectors.sql) that executes the ksqlDB statements to create three source connectors for generating mock data.

   ```sql
   RUN SCRIPT '/scripts/create-connectors.sql';
   ```

   The output should look similar to:

   ```none
   CREATE SOURCE CONNECTOR datagen_clickstream_codes WITH (
     'connector.class' = 'io.confluent.kafka.connect.datagen.DatagenConnector',
     'kafka.topic' = 'clickstream_codes',
     'quickstart' = 'clickstream_codes',
     'maxInterval' = '20',
     'iterations' = '100',
     'format' = 'json',
     'key.converter' = 'org.apache.kafka.connect.converters.IntegerConverter');

   Message
   Created connector DATAGEN_CLICKSTREAM_CODES
   [...]
   ```

4. The `clickstream` generator is now running, simulating the stream of clicks. Sample the messages in the `clickstream` topic:

   ```sql
   print clickstream limit 3;
   ```

   Your output should resemble:

   ```bash
   Key format: HOPPING(JSON) or TUMBLING(JSON) or HOPPING(KAFKA_STRING) or TUMBLING(KAFKA_STRING) or KAFKA_STRING
   Value format: JSON or KAFKA_STRING
   rowtime: 2020/06/11 10:38:42.449 Z, key: 222.90.225.227, value: {"ip":"222.90.225.227","userid":12,"remote_user":"-","time":"1","_time":1,"request":"GET /images/logo-small.png HTTP/1.1","status":"302","bytes":"1289","referrer":"-","agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"}
   rowtime: 2020/06/11 10:38:42.528 Z, key: 111.245.174.248, value: {"ip":"111.245.174.248","userid":30,"remote_user":"-","time":"11","_time":11,"request":"GET /site/login.html HTTP/1.1","status":"302","bytes":"14096","referrer":"-","agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"}
   rowtime: 2020/06/11 10:38:42.705 Z, key: 122.152.45.245, value: {"ip":"122.152.45.245","userid":11,"remote_user":"-","time":"21","_time":21,"request":"GET /images/logo-small.png HTTP/1.1","status":"407","bytes":"4196","referrer":"-","agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"}
   Topic printing ceased
   ```

5. The second data generator produces the HTTP status codes.
Sample the messages in the `clickstream_codes` topic: ```sql print clickstream_codes limit 3; ``` Your output should resemble: ```bash Key format: KAFKA_INT Value format: JSON or KAFKA_STRING rowtime: 2020/06/11 10:38:40.222 Z, key: 200, value: {"code":200,"definition":"Successful"} rowtime: 2020/06/11 10:38:40.688 Z, key: 404, value: {"code":404,"definition":"Page not found"} rowtime: 2020/06/11 10:38:41.006 Z, key: 200, value: {"code":200,"definition":"Successful"} Topic printing ceased ``` 6. The third data generator is for the user information. Sample the messages in the `clickstream_users` topic: ```sql print clickstream_users limit 3; ``` Your output should resemble: ```bash Key format: KAFKA_INT Value format: JSON or KAFKA_STRING rowtime: 2020/06/11 10:38:40.815 Z, key: 1, value: {"user_id":1,"username":"Roberto_123","registered_at":1410180399070,"first_name":"Greta","last_name":"Garrity","city":"San Francisco","level":"Platinum"} rowtime: 2020/06/11 10:38:41.001 Z, key: 2, value: {"user_id":2,"username":"akatz1022","registered_at":1410356353826,"first_name":"Ferd","last_name":"Pask","city":"London","level":"Gold"} rowtime: 2020/06/11 10:38:41.214 Z, key: 3, value: {"user_id":3,"username":"akatz1022","registered_at":1483293331831,"first_name":"Oriana","last_name":"Romagosa","city":"London","level":"Platinum"} Topic printing ceased ``` 7. Go to Confluent Control Center UI at [http://localhost:9021](http://localhost:9021) and view the three kafka-connect-datagen source connectors created with the ksqlDB CLI. ![Datagen Connectors](ksqldb/images/c3_datagen_connectors.png) ### Create the ksqlDB source streams For ksqlDB to be able to use the topics that Debezium created, you must declare streams over it. Because you configured Kafka Connect with Schema Registry, you don’t need to declare the schema of the data for the streams, because it’s inferred from the schema that Debezium writes with. Run the following statement to create a stream over the `customers` table: ```sql CREATE STREAM customers WITH ( kafka_topic = 'customers.public.customers', value_format = 'avro' ); ``` Do the same for `orders`. For this stream, specify that the timestamp of the event is derived from the data itself. Specifically, it’s extracted and parsed from the `ts` field. ```sql CREATE STREAM orders WITH ( kafka_topic = 'my-replica-set.logistics.orders', value_format = 'avro', timestamp = 'ts', timestamp_format = 'yyyy-MM-dd''T''HH:mm:ss' ); ``` Finally, repeat the same for `shipments`: ```sql CREATE STREAM shipments WITH ( kafka_topic = 'my-replica-set.logistics.shipments', value_format = 'avro', timestamp = 'ts', timestamp_format = 'yyyy-MM-dd''T''HH:mm:ss' ); ``` ### Run the microservice Compile the program with: ```bash mvn compile ``` And run it: ```bash mvn exec:java -Dexec.mainClass="io.ksqldb.tutorial.EmailSender" ``` If everything is configured correctly, emails will be sent whenever an anomaly is detected. There are a few things to note with this simple implementation. First, if you start more instances of this microservice, the partitions of the `possible_anomalies` topic will be load balanced across them. This takes advantage of the standard [Kafka consumer groups](/platform/current/clients/consumer.html#consumer-groups) behavior. Second, this microservice is configured to checkpoint its progress every `100` milliseconds through the `ENABLE_AUTO_COMMIT_CONFIG` configuration. That means any successfully processed messages will not be reprocessed if the microservice is taken down and turned on again. 
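As a rough illustration only (not the tutorial's actual source), the consumer behavior described above maps to standard Kafka consumer settings along these lines; the group ID is a made-up placeholder for this sketch:

```properties
# Hypothetical consumer group for this sketch; all instances of the microservice
# that share this group ID split the partitions of possible_anomalies between them.
group.id=email-sender

# Commit (checkpoint) consumed offsets automatically, roughly every 100 ms,
# so already-processed messages are not re-read after a restart.
enable.auto.commit=true
auto.commit.interval.ms=100
```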
Finally, note that ksqlDB emits a new event every time a tumbling window changes. ksqlDB uses a model called "refinements" to continually emit new changes to stateful aggregations. For example, if an anomaly was detected because three credit card transactions were found in a given interval, an event would be emitted from the table. If a fourth is detected in the same interval, another event is emitted. Because SendGrid does not (at the time of writing) support idempotent email submission, you would need to have a small piece of logic in your program to prevent sending an email multiple times for the same period. This is omitted for brevity.

If you wish, you can continue the example by inserting more events into the `transactions` topic.

### Create the ksqlDB calls stream

For ksqlDB to be able to use the topic that Debezium created, you must declare a stream over it. Because you configured Kafka Connect with Schema Registry, you don't need to declare the schema of the data for the stream; it's inferred from the schema that Debezium writes with.

Run the following at the ksqlDB CLI:

```sql
CREATE STREAM calls WITH (
    kafka_topic = 'call-center-db.call-center.calls',
    value_format = 'avro'
);
```

### Listing cluster links

**Example Command**

```bash
kafka-cluster-links --list --bootstrap-server localhost:9093
```

**Example Output**

```bash
Link name: 'example-link', link ID: '123-some-link-id', remote cluster ID: '123-some-cluster-id', local cluster ID: '456-some-other-cluster-id', remote cluster available: 'true'
```

You can list existing cluster links. The command returns the link name, link ID (an internally allocated unique ID), the cluster ID of the linked cluster, and whether the linked cluster is available or not.

`--link`
: If provided, only lists the specified cluster link.
  * Type: string

`--command-config`
: Property file containing configurations to be passed to the [AdminClient](../../installation/configuration/admin-configs.md#cp-config-admin). For example, with security credentials for authorization and authentication.
  * Type: string

`--include-topics`
: If provided, includes a list of all mirror topics on this cluster link.
  * Type: string

You must have `DESCRIBE CLUSTER` authorization to list cluster links.

### Viewing a cluster link task status

You can view the status of the following configurable tasks:

- [Consumer offset sync](mirror-topics-cp.md#mirror-topics-consumer-offsets)
- [ACL sync (migrate ACLs)](security.md#cluster-link-acls-migrate)
- [Topic configurations sync](mirror-topics-cp.md#sync-topic-configs)
- [Auto-Create mirror topics](mirror-topics-cp.md#auto-create-mirror-topics-concepts)

To view the status of any given task on Confluent Platform, use the following command:

```bash
confluent kafka link task list
```

Or:

```bash
./bin/kafka-cluster-links.sh ... --list-tasks --link
```

### Deleting a cluster link

**Example Command**

```bash
kafka-cluster-links --bootstrap-server localhost:9093 \
  --delete \
  --link example-link
```

**Example Output**

```bash
Cluster link 'example-link' deletion successfully completed.
```

To delete an existing link, use `kafka-cluster-links` along with [bootstrap-server](#bootstrap-cluster-links) and these flags.

`--link`
: (Required) The name of the cluster link to delete.
  * Type: string

`--command-config`
: Property file containing configurations to be passed to the [AdminClient](../../installation/configuration/admin-configs.md#cp-config-admin).
For example, with security credentials for authorization and authentication. * Type: string `--validate-only` : If provided, validates the cluster link deletion but doesn’t apply the delete. `--force` : Force deletion of a link even if there are mirror topics are currently linked with it. You must have `ALTER CLUSTER` authorization to delete a cluster link, as described in [Authorization (ACLs)](security.md#cluster-link-acls). #### Create the Principal and ACLs to allow the cluster link to read from cluster A The cluster link needs a principal that is authorized to read data from cluster A. You created the “link” principal in the cluster setup step, above, and now you will assign it the required privileges. 1. Give the link’s principal the **Describe:Cluster ACL**. ```bash $CONFLUENT_HOME/bin/kafka-acls --command-config my-examples/command.config --bootstrap-server localhost:9092 \ --add --allow-principal User:link --operation Describe --cluster ``` This ACL is specifically required for bidirectional mode. 2. At a minimum, give the link’s principal **Read:Topics** and **DescribeConfigs:Topics** on the topics that the cluster link is allowed to read from. This example allows the cluster link to read data from all topics. Alternatively, only specific topic names or prefixes can be given. These can be different from the topic ACLs given on the remote cluster. 3. (Recommended) Assign additional ACLs for syncing consumer offsets, which is a critical feature of a bidirectional cluster link. To learn about consumer offset sync configuration options, see `consumer.offset.sync.enable` and `consumer.offset.sync.ms` in [Configuration Options](#cp-cluster-link-config-options). - Grant the link’s principal **Describe** permissions on all topics. ```bash $CONFLUENT_HOME/bin/kafka-acls --command-config my-examples/command.config --bootstrap-server localhost:9092 --add --allow-principal User:link --operation Describe --topic "*" ``` Your output should resemble: ```bash Adding ACLs for resource `ResourcePattern(resourceType=TOPIC, name=*, patternType=LITERAL)`: (principal=User:link, host=*, operation=DESCRIBE, permissionType=ALLOW) ``` - Grant the link’s principal **Describe** permissions on all consumer groups. ```bash $CONFLUENT_HOME/bin/kafka-acls --command-config my-examples/command.config --bootstrap-server localhost:9092 --add --allow-principal User:link --operation Describe --operation Read --group "*" ``` Your output should resemble: ```bash Adding ACLs for resource `ResourcePattern(resourceType=GROUP, name=*, patternType=LITERAL)`: (principal=User:link, host=*, operation=READ, permissionType=ALLOW) (principal=User:link, host=*, operation=DESCRIBE, permissionType=ALLOW) Current ACLs for resource `ResourcePattern(resourceType=GROUP, name=*, patternType=LITERAL)`: (principal=User:link, host=*, operation=DESCRIBE, permissionType=ALLOW) (principal=User:link, host=*, operation=READ, permissionType=ALLOW) ``` - Grant the link’s principal **DescribeConfigs** permissions on the cluster. ```bash $CONFLUENT_HOME/bin/kafka-acls --command-config my-examples/command.config --bootstrap-server localhost:9092 --add --allow-principal User:link --operation DescribeConfigs --cluster ``` Your output should resemble: ```bash Adding ACLs for resource `ResourcePattern(resourceType=CLUSTER, name=kafka-cluster, patternType=LITERAL)`: (principal=User:link, host=*, operation=DESCRIBE_CONFIGS, permissionType=ALLOW) ``` 4. 
(Optional) Assign additional ACLs for [syncing (migrating) ACLs](security.md#cluster-link-acls-migrate) or using [prefixing](/platform/current/multi-dc-deployments/cluster-linking/mirror-topics-cp.html#prefix-mirror-topics-and-consumer-group-names) plus [auto-create mirror topics](/platform/current/multi-dc-deployments/cluster-linking/mirror-topics-cp.html#auto-create-mirror-topics). These can be different from the ACLs given on the remote cluster.

## Replicator with RBAC

When using RBAC, Replicator clients should use token authentication as described in [Configure Clients for SASL/OAUTHBEARER authentication in Confluent Platform](../../security/authentication/sasl/oauthbearer/configure-clients.md#security-sasl-rbac-oauthbearer-clientconfig). These configurations should be prefixed with the usual Replicator prefixes of `src.kafka.` and `dest.kafka.`. An example configuration for source and destination clusters that are RBAC-enabled is below:

```bash
src.kafka.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  username="sourceUser" \
  password="xxx" \
  metadataServerUrls="http://sourceHost:8090";
src.kafka.security.protocol=SASL_PLAINTEXT
src.kafka.sasl.mechanism=OAUTHBEARER
src.kafka.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler

dest.kafka.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  username="destUser" \
  password="xxx" \
  metadataServerUrls="http://destHost:8090";
dest.kafka.security.protocol=SASL_PLAINTEXT
dest.kafka.sasl.mechanism=OAUTHBEARER
dest.kafka.sasl.login.callback.handler.class=io.confluent.kafka.clients.plugins.auth.token.TokenUserLoginCallbackHandler
```

For Replicator executable, these configurations should not be prefixed and should be placed in the files referred to by `--consumer.config` and `--producer.config`.

### Verify topic replication across the clusters

When Replicator finishes initialization, it checks the origin cluster for topics that need to be replicated. In this case, it finds `test-topic` and creates the corresponding topic in the destination cluster. You can verify this with the following command.

```none
./bin/kafka-topics --describe --topic test-topic.replica --bootstrap-server localhost:9092
```

Note that you are checking the existence of `test-topic.replica` because `test-topic` was renamed when it was replicated to the destination cluster, according to your configuration.

Your output should look similar to this:

```none
./bin/kafka-topics --describe --topic test-topic.replica --bootstrap-server localhost:9092
Topic: test-topic.replica  PartitionCount: 1  ReplicationFactor: 1  Configs: message.timestamp.type=CreateTime,segment.bytes=1073741824
    Topic: test-topic.replica  Partition: 0  Leader: 0  Replicas: 0  Isr: 0  Offline: 0
```

You can also list and describe the topics on the destination cluster. Replicated topics, like `test-topic.replica`, will be listed.

```none
./bin/kafka-topics --list --bootstrap-server localhost:9092
```

At any time after you've created the topic in the origin cluster, you can begin sending data to it using a Kafka producer to write to `test-topic` in the origin cluster. You can then confirm that the data has been replicated by consuming from `test-topic.replica` in the destination cluster.
For example, to send a sequence of numbers using Kafka's console producer, run the following command in a new terminal window:

```none
seq 10000 | ./bin/kafka-console-producer --topic test-topic --broker-list localhost:9082
```

You can confirm delivery in the destination cluster using the console consumer in its own terminal window:

```none
./bin/kafka-console-consumer --from-beginning --topic test-topic.replica --bootstrap-server localhost:9092
```

If the numbers 1 to 10,000 appear in the consumer output, you have successfully created multi-cluster replication. Press `Ctrl+C` to end the consumer readout and return to the command prompt.

## Run Example

1. Clone the [confluentinc/examples](https://github.com/confluentinc/examples) GitHub repository and change directory to the Schema Translation example.

   ```bash
   git clone https://github.com/confluentinc/examples
   cd examples/replicator-schema-translation
   ```

2. Start the entire example by running a single command that creates source and destination clusters automatically and adds a schema to the source Schema Registry. This takes less than 5 minutes to complete.

   ```bash
   docker-compose up -d
   ```

3. Wait at least 2 minutes and then verify the example has completely started by checking the subjects in the source and destination Schema Registries.

   ```bash
   # Source Schema Registry should show one subject, i.e., the output should be ["testTopic-value"]
   docker-compose exec connect curl http://srcSchemaregistry:8085/subjects

   # Destination Schema Registry should show no subjects, i.e., the output should be []
   docker-compose exec connect curl http://destSchemaregistry:8086/subjects
   ```

4. To prepare for schema translation, put the source Schema Registry in "READONLY" mode and the destination registry in "IMPORT" mode. Note that this works only when the destination Schema Registry has no registered subjects (as is true in this example), otherwise the import would fail with a message similar to "Cannot import since found existing subjects".

   ```bash
   docker-compose exec connect /etc/kafka/scripts/set_sr_modes_pre_translation.sh
   ```

   Your output should resemble:

   ```bash
   Setting srcSchemaregistry to READONLY mode: {"mode":"READONLY"}
   Setting destSchemaregistry to IMPORT mode: {"mode":"IMPORT"}
   ```

5. Submit Replicator to perform the translation.

   ```bash
   docker-compose exec connect /etc/kafka/scripts/submit_replicator.sh
   ```

   Your output should show the posted Replicator configuration. The key configuration that enables the schema translation is `schema.subject.translator.class=io.confluent.connect.replicator.schemas.DefaultSubjectTranslator`.

   ```json
   {
     "name": "testReplicator",
     "config": {
       "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
       "topic.whitelist": "_schemas",
       "topic.rename.format": "${topic}.replica",
       "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
       "value.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
       "src.kafka.bootstrap.servers": "srcKafka1:10091",
       "dest.kafka.bootstrap.servers": "destKafka1:11091",
       "tasks.max": "1",
       "confluent.topic.replication.factor": "1",
       "schema.subject.translator.class": "io.confluent.connect.replicator.schemas.DefaultSubjectTranslator",
       "schema.registry.topic": "_schemas",
       "schema.registry.url": "http://destSchemaregistry:8086",
       "name": "testReplicator"
     },
     "tasks": [],
     "type": "source"
   }
   ```

6. Verify the schema translation by revisiting the subjects in the source and destination Schema Registries.
```bash # Source Schema Registry should show one subject, i.e., the output should be ["testTopic-value"] docker-compose exec connect curl http://srcSchemaregistry:8085/subjects # Destination Schema Registry should show one subject, i.e., the output should be ["testTopic.replica-value"] docker-compose exec connect curl http://destSchemaregistry:8086/subjects ``` 7. To complete the example, reset both Schema Registries to `READWRITE` mode, this completes the migration process: ```bash docker-compose exec connect /etc/kafka/scripts/set_sr_modes_post_translation.sh ``` ## Workflows and examples You can integrate Maven Plugin goals with [GitHub Actions](https://docs.github.com/en/actions) into a continuous integration/continuous deployment (CI/CD) pipleline to manage schemas on Schema Registry. A general example for developing and validating an Apache Kafka® client application with a Python producer and consumer is provided in the [kafka-github-actions demo repo](https://github.com/ybyzek/kafka-github-actions). Here is an alternative sample [pom.xml](https://maven.apache.org/guides/introduction/introduction-to-the-pom.html) with project configurations for more detailed validate and register steps. ```bash 4.0.0 io.confluent GitHub-Actions-Demo 1.0 confluent https://packages.confluent.io/maven/ <$CONFLUENT_SCHEMA_REGISTRY_URL> <$CONFLUENT_BASIC_AUTH_USER_INFO> 8.1.0 io.confluent kafka-schema-registry-maven-plugin ${confluent.version} ${schemaRegistryUrl} ${schemaRegistryBasicAuthUserInfo} validate validate validate src/main/resources/order.avsc src/main/resources/flight.proto PROTOBUF set-compatibility validate set-compatibility FORWARD_TRANSITIVE FORWARD_TRANSITIVE test-local validate test-local-compatibility src/main/resources/order.avsc src/main/resources/flight.proto ProtoBuf src/main/resources/flightSchemas src/main/resources/orderSchemas FORWARD_TRANSITIVE FORWARD_TRANSITIVE test-compatibility validate test-compatibility src/main/resources/order.avsc src/main/resources/flight.proto PROTOBUF register register src/main/resources/order.avsc src/main/resources/flight.proto PROTOBUF ``` The following workflows can be coded as GitHub actions to accomplish CICD for schema management. 1. When a pull request is created to merge a new schema to master, validate the schema, check local schema compatibility, set compatibility of subject, and test schema compatibility with subject. 
```properties run: mvn validate ``` The validate step would include: ```bash mvn schema-registry:validate@validate mvn schema-registry:test-local-compatibility@test-local mvn schema-registry:set-compatibility@set-compatibility mvn schema-registry:test-compatibility@test-compatibility ``` Integrated with GitHub Actions, the `pull-request.yaml` for this step might look like this: ```bash name: Testing branch for compatibility before merging on: pull_request: branches: [ master ] paths: [src/main/resources/*] jobs: validate: runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - uses: actions/setup-java@v2 with: java-version: '11' distribution: 'temurin' cache: maven - name: Validate if schema is valid run: mvn schema-registry:validate@validate test-local-compatibility: needs: validate runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - uses: actions/setup-java@v2 with: java-version: '11' distribution: 'temurin' cache: maven - name: Test schema with locally present schema run: mvn schema-registry:test-local-compatibility@test-local set-compatibility: needs: test-local-compatibility runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - uses: actions/setup-java@v2 with: java-version: '11' distribution: 'temurin' cache: maven - name: Set compatibility of subject run: mvn schema-registry:set-compatibility@set-compatibility test-compatibility: needs: set-compatibility runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - uses: actions/setup-java@v2 with: java-version: '11' distribution: 'temurin' cache: maven - name: Test schema with subject run: mvn schema-registry:test-compatibility@test-compatibility ``` If compatibility checking passes a new pull request is created for approval. 2. Register schema when a pull request is approved and merged to master. Run the action to register the new schema on the Schema Registry: ```bash run: mvn schema-registry:register@register ``` The `push.yaml` for this step would look like this: ```bash name: Registering Schema on merge of pull request on: push: branches: [ master ] paths: [src/main/resources/*] jobs: register-schema: runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - uses: actions/setup-java@v2 with: java-version: '11' distribution: 'temurin' cache: maven - name: Register Schema run: mvn io.confluent:kafka-schema-registry-maven-plugin:register@register ``` ## Avro deserializer You can plug in `KafkaAvroDeserializer` to `KafkaConsumer` to receive messages of any Avro type from Kafka. In the following example, messages are received with a key of type `string` and a value of type Avro record from Kafka. When getting the message key or value, a `SerializationException` may occur if the data is not well formed. The examples below use the default hostname and port for the Kafka bootstrap server (`localhost:9092`) and Schema Registry (`localhost:8081`). 
```none import org.apache.kafka.clients.consumer.Consumer; import org.apache.kafka.clients.consumer.ConsumerRecord; import org.apache.kafka.clients.consumer.ConsumerRecords; import org.apache.kafka.clients.consumer.KafkaConsumer; import org.apache.kafka.clients.consumer.ConsumerConfig; import org.apache.avro.generic.GenericRecord; import java.io.FileInputStream; import java.io.IOException; import java.io.InputStream; import java.nio.file.Files; import java.nio.file.Paths; import java.util.Arrays; import java.util.Properties; import java.util.Random; Properties props = new Properties(); props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); props.put(ConsumerConfig.GROUP_ID_CONFIG, "group1"); props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer"); props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "io.confluent.kafka.serializers.KafkaAvroDeserializer"); props.put("schema.registry.url", "http://localhost:8081"); props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); String topic = "topic1"; final Consumer consumer = new KafkaConsumer(props); consumer.subscribe(Arrays.asList(topic)); try { while (true) { ConsumerRecords records = consumer.poll(100); for (ConsumerRecord record : records) { System.out.printf("offset = %d, key = %s, value = %s \n", record.offset(), record.key(), record.value()); } } } finally { consumer.close(); } ``` With Avro, it is not necessary to use a property to specify a specific type, since the type can be derived directly from the Avro schema, using the namespace and name of the Avro type. This allows the Avro deserializer to be used out of the box with topics that have records of heterogeneous Avro types. This would be the case when using the `RecordNameStrategy` (or `TopicRecordNameStrategy`) to store multiple types in the same topic, as described in Martin Kleppmann’s blog post [Should You Put Several Event Types in the Same Kafka Topic?](https://www.confluent.io/blog/put-several-event-types-kafka-topic/). (An alternative is to use schema references, as described in [Multiple event types in the same topic](#multiple-event-types-same-topic-avro) and [Putting Several Event Types in the Same Topic – Revisited](https://www.confluent.io/blog/multiple-event-types-in-the-same-kafka-topic/)) This differs from the [Protobuf](serdes-protobuf.md#sr-deserializer-protobuf) and [JSON Schema](serdes-json.md#sr-deserializer-json) deserializers, where in order to return a specific rather than a generic type, you must use a specific property. To return a specific type in Avro, you must add the following configuration: ```bash props.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, true); ``` Here is a summary of specific and generic return types for each schema format. | | Avro | Protobuf | JSON Schema | |---------------|--------------------------------------------------------------------|--------------------------------------------------------------|----------------------------------------------------------------| | Specific type | Generated class that implements org.apache.avro.SpecificRecord | Generated class that extends com.google.protobuf.Message | Java class (that is compatible with Jackson serialization) | | Generic type | org.apache.avro.GenericRecord | com.google.protobuf.DynamicMessage | com.fasterxml.jackson.databind.JsonNode | ### Confluent Platform 1. Start Confluent Platform using the following command: > ```bash > confluent local services start > ``` > > 2. 
Verify registered schema types.

   > Schema Registry supports arbitrary schema types. You should verify which schema types are currently registered with Schema Registry.
   > To do so, type the following command (assuming you use the default URL and port for Schema Registry, `localhost:8081`):
   > ```bash
   > curl http://localhost:8081/schemas/types
   > ```
   > The response will be one or more of the following. If additional schema format plugins are installed, these will also be available.
   > ```bash
   > ["JSON", "PROTOBUF", "AVRO"]
   > ```
   > Alternatively, use the curl `--silent` flag, and pipe the command through [jq](https://stedolan.github.io/jq/) (`curl --silent http://localhost:8081/schemas/types | jq`) to get nicely formatted output:
   > ```bash
   > [
   >   "JSON",
   >   "PROTOBUF",
   >   "AVRO"
   > ]
   > ```

3. Use the producer to send JSON Schema records in JSON as the message value.

   > The new topic, `transactions-json`, will be created as a part of this producer command if it does not already exist.
   > This command starts a producer, and creates a schema for the `transactions-json` topic. The schema has two fields, `id` and `amount`.
   > ```none
   > kafka-json-schema-console-producer --bootstrap-server localhost:9092 \
   >   --property schema.registry.url=http://localhost:8081 --topic transactions-json \
   >   --property value.schema='{"type":"object", "properties":{"id":{"type":"string"},"amount":{"type":"number"} }}'
   > ```

4. Type the following command in the shell, and hit return.

   > ```none
   > { "id":"1000", "amount":500 }
   > ```

5. Open a new terminal window, and use the consumer to read from topic `transactions-json` and get the value of the message in JSON.

   > ```none
   > kafka-json-schema-console-consumer --bootstrap-server localhost:9092 --from-beginning --topic transactions-json --property schema.registry.url=http://localhost:8081
   > ```
   > You should see the following in the console.
   > ```none
   > {"id":"1000","amount":500}
   > ```
   > Leave this consumer running.

6. Use the producer to send another record as the message value, which includes a new property not explicitly declared in the schema.

   > JSON Schema has an open content model, which allows any number of additional properties to appear in a JSON document without being specified in the JSON schema.
   > This is achieved with `additionalProperties` set to `true`, which is the default. If you do not explicitly disable `additionalProperties` (by setting it to `false`), undeclared properties are allowed in records. These next few steps demonstrate this unique aspect of JSON Schema.
   > Return to the producer session that is already running and send the following message, which includes a new property `"customer_id"` that is not declared in the schema with which we started this producer. (Hit return to send the message.)
   > ```none
   > {"id":"1000","amount":500,"customer_id":"1221"}
   > ```

7. Return to your running consumer to read from topic `transactions-json` and get the new message.

   > You should see the new output added to the original.
   > ```none
   > {"id":"1000","amount":500}
   > {"id":"1000","amount":500,"customer_id":"1221"}
   > ```
   > The message with the new property (`customer_id`) is successfully produced and read. If you try this with the other schema formats (Avro, Protobuf), it will fail at the producer command because those specifications require that all properties be explicitly declared in the schemas.
   > Keep this consumer running.

8. Start a producer and pass a JSON Schema with `additionalProperties` explicitly set to `false`.
    > Return to the producer command window, and stop the producer with Ctrl+C.
    > Type the following in the shell, and press return. This is the same producer and topic (`transactions-json`) used in the previous steps.
    > The schema is almost the same as the previous one, but in this example `additionalProperties` is explicitly set to `false` as a part of the schema.
    >
    > ```none
    > kafka-json-schema-console-producer --bootstrap-server localhost:9092 --property schema.registry.url=http://localhost:8081 --topic transactions-json \
    >   --property value.schema='{"type":"object", "properties":{"id":{"type":"string"},"amount":{"type":"number"} }, "additionalProperties": false}'
    > ```

9. In another shell, use `curl` to get the top-level compatibility configuration.

    > ```none
    > curl --silent -X GET http://localhost:8081/config
    > ```
    >
    > Example result (this is the default):
    >
    > ```none
    > {"compatibilityLevel":"BACKWARD"}
    > ```

10. Update the compatibility requirements globally.

    > ```none
    > curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    >   --data '{"compatibility": "NONE"}' \
    >   http://localhost:8081/config
    > ```
    >
    > The output will be:
    >
    > ```none
    > {"compatibilityLevel":"NONE"}
    > ```

11. Start a new producer and pass a JSON Schema with `additionalProperties` explicitly set to `false`.

    > (You can shut down the previous producer, and start this one in the same window.)
    >
    > ```none
    > kafka-json-schema-console-producer --bootstrap-server localhost:9092 \
    >   --property schema.registry.url=http://localhost:8081 --topic transactions-json \
    >   --property value.schema='{"type":"object", "properties":{"id":{"type":"string"}, "amount":{"type":"number"} }, "additionalProperties": false}'
    > ```

12. Attempt to use the producer to send another record as the message value, which includes a new property not explicitly declared in the schema.

    > ```none
    > { "id":"1001","amount":500,"customer_id":"this-will-break"}
    > ```
    >
    > This will break. You will get the following error:
    >
    > ```none
    > org.apache.kafka.common.errors.SerializationException: Error serializing JSON message
    > ...
    > Caused by: org.apache.kafka.common.errors.SerializationException: JSON {"id":"1001","amount":500,"customer_id":"this-will-break"} does not match schema
    > {"type":"object","properties":{"id":{"type":"string"},"amount":{"type":"number"}},"additionalProperties":false} at
    > io.confluent.kafka.serializers.json.AbstractKafkaJsonSchemaSerializer.serializeImpl(AbstractKafkaJsonSchemaSerializer.java:132)
    > ... 5 more
    > Caused by: org.everit.json.schema.ValidationException: #: extraneous key [customer_id] is not permitted
    > ...
    > ```
    >
    > The consumer will continue running, but no new messages will be displayed.
    > This is the same behavior you would see by default if using Avro or Protobuf in this scenario.

13. Rerun the producer in default mode as before and send a follow-on message with an undeclared property.

    > In the producer command window, stop the producer with Ctrl+C.
    > Run the original producer command. There is no need to explicitly declare `additionalProperties` as `true` (although you could), as this is the default.
    >
    > ```none
    > kafka-json-schema-console-producer --bootstrap-server localhost:9092 \
    >   --property schema.registry.url=http://localhost:8081 --topic transactions-json \
    >   --property value.schema='{"type":"object", "properties":{"id":{"type":"string"},"amount":{"type":"number"} }}'
    > ```

14. Use the producer to send another record as the message value, which again includes a new property not explicitly declared in the schema.
    > ```none
    > { "id":"1001","amount":500,"customer_id":"1222"}
    > ```

15. Return to the consumer session to read the new message.

    > The consumer should still be running and reading from topic `transactions-json`. You will see the following new message in the console.
    >
    > ```none
    > {"id":"1001","amount":500,"customer_id":"1222"}
    > ```
    >
    > More specifically, if you followed all steps in order and started the consumer with the `--from-beginning` flag
    > as mentioned earlier, the consumer shows a history of all messages sent:
    >
    > ```none
    > {"id":"1000","amount":500}
    > {"id":"1000","amount":500,"customer_id":"1221"}
    > {"id":"1001","amount":500,"customer_id":"1222"}
    > ```

16. In another shell, use this [curl](https://curl.haxx.se/docs/manual.html) command (piped through `jq` for readability) to query the schemas that were registered with Schema Registry as versions 1 and 2.

    > To query version 1 of the schema, type:
    >
    > ```none
    > curl --silent -X GET http://localhost:8081/subjects/transactions-json-value/versions/1/schema | jq .
    > ```
    >
    > Here is the expected output for version 1:
    >
    > ```none
    > {
    >   "type": "object",
    >   "properties": {
    >     "id": {
    >       "type": "string"
    >     },
    >     "amount": {
    >       "type": "number"
    >     }
    >   }
    > }
    > ```
    >
    > To query version 2 of the schema, type:
    >
    > ```none
    > curl --silent -X GET http://localhost:8081/subjects/transactions-json-value/versions/2/schema | jq .
    > ```
    >
    > Here is the expected output for version 2:
    >
    > ```none
    > {
    >   "type": "object",
    >   "properties": {
    >     "id": {
    >       "type": "string"
    >     },
    >     "amount": {
    >       "type": "number"
    >     }
    >   },
    >   "additionalProperties": false
    > }
    > ```

17. View the latest version of the schema in more detail by running this command.

    > ```none
    > curl --silent -X GET http://localhost:8081/subjects/transactions-json-value/versions/latest | jq .
    > ```
    >
    > Here is the expected output of the above command:
    >
    > ```none
    > {
    >   "subject": "transactions-json-value",
    >   "version": 2,
    >   "id": 2,
    >   "schemaType": "JSON",
    >   "schema": "{\"type\":\"object\",\"properties\":{\"id\":{\"type\":\"string\"},\"amount\":{\"type\":\"number\"}},\"additionalProperties\":false}"
    > }
    > ```

18. Use Confluent Control Center to examine schemas and messages.

    > Messages that were successfully produced also show on Control Center ([http://localhost:9021/](http://localhost:9021/))
    > in **Topics > Messages**. You may have to select a partition or jump to a timestamp to see messages sent earlier.
    > (For timestamp, type in a number, which will default to partition `1/Partition: 0`, and press return. To get the message view shown here,
    > select the **cards** icon on the upper right.)
    >
    > ![image](images/serdes-json-c3-messages.png)
    >
    > Schemas you create are available on the **Schemas** tab for the selected topic.
    >
    > ![image](images/serdes-json-c3-schema.png)

19. Run shutdown and cleanup tasks.

    - You can stop the consumer and producer with Ctrl+C in their respective command windows.
    - To stop Confluent Platform, type `confluent local services stop`.
    - If you would like to clear out existing data (topics, schemas, and messages) before starting again with another test, type `confluent local destroy`.

### System topics and security configurations

The following configurations for system topics are available:

- `exporter.config.topic` - Stores configurations for the exporters. The default name for this topic is `_exporter_configs`, and its default/required configuration is: `numPartitions=1`, `replicationFactor=3`, and `cleanup.policy=compact`.
- `exporter.state.topic` - Stores the status of the exporters.
The default name for this topic is `_exporter_states`, and its default/required configuration is: `numPartitions=1`, `replicationFactor=3`, and `cleanup.policy=compact`.

If you are using role-based access control (RBAC), `exporter.config.topic` and `exporter.state.topic` require `ResourceOwner` on these topics, as does the `_schemas` internal topic. See also [Use Role-Based Access Control (RBAC) in Confluent Cloud](https://docs.confluent.io/cloud/current/access-management/access-control/cloud-rbac.html#) and [Configuring Role-Based Access Control for Schema Registry on Confluent Platform](https://docs.confluent.io/platform/current/schema-registry/security/rbac-schema-registry.html).

If you are configuring Schema Registry on Confluent Platform using the [Schema Registry Security Plugin](https://docs.confluent.io/platform/current/confluent-security-plugins/schema-registry/install.html), you must activate both the exporter and the [Schema Registry security plugin](https://docs.confluent.io/platform/current/confluent-security-plugins/schema-registry/install.html#activate-the-plugins) by specifying both extension classes in the `$CONFLUENT_HOME/etc/schema-registry/schema-registry.properties` file:

```bash
resource.extension.class=io.confluent.kafka.schemaregistry.security.SchemaRegistrySecurityResourceExtension,io.confluent.schema.exporter.SchemaExporterResourceExtension
```

The configuration for the exporter resource extension class in the `schema-registry.properties` file is described in [Set up source and destination environments](https://docs.confluent.io/platform/current/schema-registry/schema-linking-cp.html#set-up-source-and-destination-environments) in Schema Linking on Confluent Platform.

## Prerequisites and Setting Schema Registry URLs on the Brokers

Basic requirements to run these examples are generally the same as those described for the [Schema Registry Tutorial](schema_registry_onprem_tutorial.md#sr-tutorial-prereqs), with the exception of Maven, which is not needed here. Also, Confluent Platform version 5.4.0 or later is required.

As an additional prerequisite to enable Schema ID Validation on the brokers, you must specify `confluent.schema.registry.url` in the Kafka `server.properties` file (`$CONFLUENT_HOME/etc/kafka/server.properties`) before you start Confluent Platform. This tells the broker how to connect to Schema Registry. For example:

```none
confluent.schema.registry.url=http://schema-registry:8081
```

This configuration accepts a comma-separated list of URLs for Schema Registry instances. This setting is required to make Schema ID Validation available both from the [Confluent CLI](/ccloud-cli/current/command-reference/index.html) and on the [Control Center for Confluent Platform](https://docs.confluent.io/control-center/current/overview.html).

### Basic Authentication

For this setup, the brokers are configured to authenticate to Schema Registry using [basic authentication](../security/authentication/http-basic-auth/overview.md#http-basic-auth). Define the following settings on each broker (`$CONFLUENT_HOME/etc/kafka/server.properties`).

```bash
confluent.schema.registry.url=http://<schema-registry-host>:<port>
confluent.basic.auth.credentials.source=<credentials-source>
confluent.basic.auth.user.info=<username>:<password> # required only if credentials source is set to USER_INFO
```

- The property `confluent.basic.auth.credentials.source` defines the type of credentials to use (user name and password). These are literals, not variables.
- If you set `confluent.basic.auth.credentials.source` to `USER_INFO`, you must also specify `confluent.basic.auth.user.info`.

## Configure Kafka clients

You can configure the JAAS configuration property for each client in the `producer.properties` or `consumer.properties` file. The login module describes how clients, such as producers and consumers, connect to the Confluent Server broker. The following is an example configuration for a Kafka client to use token authentication:

```bash
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="tokenID123" \
  password="lAYYSFmLs4bTjf+lTZ1LCHR/ZZFNA==" \
  tokenauth="true";
```

Clients use the `username` and `password` options to configure the token ID and token HMAC, and the `tokenauth` option to indicate to the server that token authentication is being used. In this example, clients connect to the Confluent Server broker using the token ID `tokenID123`. Different clients within a JVM may connect using different tokens by specifying different token details in `sasl.jaas.config`.

#### NOTE
For details on all required and optional broker configuration properties, see [Kafka Broker and Controller Configuration Reference for Confluent Platform](../../../installation/configuration/broker-configs.md#cp-config-brokers).

1. Configure the truststore, keystore, and password in the `server.properties` file of every broker. Because this stores passwords directly in the broker configuration file, it is important to restrict access to these files using file system permissions.

   ```bash
   ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
   ssl.truststore.password=test1234
   ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
   ssl.keystore.password=test1234
   ssl.key.password=test1234
   ```

   Note that `ssl.truststore.password` is technically optional, but strongly recommended. If a password is not set, access to the truststore is still available, but integrity checking is disabled.

2. If you want to enable TLS for interbroker communication, add the following to the broker properties file (this setting defaults to `PLAINTEXT`):

   ```bash
   security.inter.broker.protocol=SSL
   ```

3. Configure the ports for the Apache Kafka® brokers to listen for client and interbroker TLS (`SSL`) connections. You should configure `listeners`, and optionally, `advertised.listeners` if the value is different from `listeners`.

   ```bash
   listeners=SSL://kafka1:9093
   advertised.listeners=SSL://localhost:9093
   ```

4. Configure both TLS (`SSL`) ports and `PLAINTEXT` ports if:

   * TLS is not enabled for interbroker communication
   * Some clients connecting to the Confluent Platform cluster do not use TLS

   ```bash
   listeners=PLAINTEXT://kafka1:9092,SSL://kafka1:9093
   advertised.listeners=PLAINTEXT://localhost:9092,SSL://localhost:9093
   ```

   Note that `advertised.host.name` and `advertised.port` configure a single `PLAINTEXT` port and are incompatible with secure protocols. Use `advertised.listeners` instead.

5. To enable the broker to authenticate clients (two-way authentication), you must configure all the brokers for client authentication. Set this to `required` rather than `requested`, because with `requested` misconfigured clients can still connect successfully, which provides a false sense of security.

   ```bash
   ssl.client.auth=required
   ```

#### NOTE
If you specify `ssl.client.auth=required`, client authentication fails if valid client certificates are not provided.
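Putting the preceding steps together, a broker that uses TLS only (no `PLAINTEXT` port) with required client authentication might use a `server.properties` fragment like the following sketch. It simply consolidates the example values shown in the steps above; the host names, paths, and passwords are examples to replace with your own:

```bash
# TLS (SSL) listener for client and interbroker traffic; no PLAINTEXT port
listeners=SSL://kafka1:9093
advertised.listeners=SSL://localhost:9093
security.inter.broker.protocol=SSL

# Trust store and key store (example paths and passwords from the steps above)
ssl.truststore.location=/var/private/ssl/kafka.server.truststore.jks
ssl.truststore.password=test1234
ssl.keystore.location=/var/private/ssl/kafka.server.keystore.jks
ssl.keystore.password=test1234
ssl.key.password=test1234

# Require client certificates (two-way authentication)
ssl.client.auth=required
```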
If you have SASL listeners defined, you can enable mTLS on them in parallel by setting `ssl.client.auth` with the following listener prefix:

```bash
listener.name.<listener-name>.ssl.client.auth
```

For details, see [KIP-684](https://cwiki.apache.org/confluence/display/KAFKA/KIP-684+-+Support+mutual+TLS+authentication+on+SASL_SSL+listeners#KIP684SupportmutualTLSauthenticationonSASL_SSLlisteners-UseadifferentconfigurationoptionforenablingmTLSwithSASL_SSL).

#### SEE ALSO
To see an example Confluent Replicator configuration, refer to the [TLS source authentication demo script](https://github.com/confluentinc/examples/tree/latest//replicator-security/scripts/submit_replicator_source_ssl_auth.sh). For demos of common security configurations, refer to [Replicator security demos](https://github.com/confluentinc/examples/tree/latest//replicator-security).

To configure Confluent Replicator for a destination cluster with TLS authentication, modify the Replicator JSON configuration to include the following:

```bash
{
  "name":"replicator",
  "config":{
    ....
    "dest.kafka.ssl.truststore.location":"/etc/kafka/secrets/kafka.connect.truststore.jks",
    "dest.kafka.ssl.truststore.password":"confluent",
    "dest.kafka.ssl.keystore.location":"/etc/kafka/secrets/kafka.connect.keystore.jks",
    "dest.kafka.ssl.keystore.password":"confluent",
    "dest.kafka.ssl.key.password":"confluent",
    "dest.kafka.security.protocol":"SSL"
    ....
  }
}
```

Additionally, the following properties are required in the Connect worker:

```bash
security.protocol=SSL
ssl.truststore.location=/etc/kafka/secrets/kafka.connect.truststore.jks
ssl.truststore.password=confluent
ssl.keystore.location=/etc/kafka/secrets/kafka.connect.keystore.jks
ssl.keystore.password=confluent
ssl.key.password=confluent

producer.security.protocol=SSL
producer.ssl.truststore.location=/etc/kafka/secrets/kafka.connect.truststore.jks
producer.ssl.truststore.password=confluent
producer.ssl.keystore.location=/etc/kafka/secrets/kafka.connect.keystore.jks
producer.ssl.keystore.password=confluent
producer.ssl.key.password=confluent
```

For more details, see [general security configuration for Connect workers](../../../connect/security.md#connect-security).

### Configure Connect worker-level configurations for connectors

Add the following configurations to enable OAuth authentication for Kafka Connect workers, allowing them to securely produce and consume messages using the SASL_SSL protocol. By specifying the OAUTHBEARER mechanism, these settings ensure that both producers and consumers authenticate using OAuth tokens, leveraging the OAuthBearerLoginCallbackHandler for token management. The use of SASL_SSL ensures that data in transit is encrypted, enhancing the security of your Kafka Connect deployment. Replace the placeholder values with your actual configuration values.
```properties
producer.security.protocol=SASL_SSL
producer.sasl.mechanism=OAUTHBEARER
producer.sasl.login.callback.handler.class=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginCallbackHandler
producer.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  clientId="<client-id>" \
  clientSecret="<client-secret>" \
  scope="<scope>";

consumer.security.protocol=SASL_SSL
consumer.sasl.mechanism=OAUTHBEARER
consumer.sasl.login.callback.handler.class=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginCallbackHandler
consumer.sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  clientId="<client-id>" \
  clientSecret="<client-secret>" \
  scope="<scope>";
```

## How to Migrate

This section explains the process to follow to migrate your Confluent Platform clusters from mTLS-based to OAuth authentication.

1. List the existing ACLs, using the following Confluent CLI command:

   ```text
   confluent kafka acl list
   ```

   This command lists the ACLs currently in place. Save this list for comparison after you complete the migration.

2. Make sure you are running Confluent Platform version 7.7 or later.

3. Configure the `AuthenticationHandler` according to [Use the AuthenticationHandler Class for Multi-Protocol Authentication in Confluent Platform](../multi-protocol/authenticationhandler.md#authenticationhandler).

4. Enable OAuth/OIDC for Confluent Server and other Confluent Platform services.

   * [Configure Confluent Schema Registry for OAuth Authentication in Confluent Platform](configure-sr.md#configure-sr-for-oauth)
   * [Configure Kafka Connect for OAuth Authentication in Confluent Platform](configure-connect.md#configure-connect-for-oauth)
   * [Configure Confluent Server Brokers for OAuth Authentication in Confluent Platform](configure-cs.md#configure-cs-for-oauth)

5. Change the authentication for one or more clients from mTLS to OAuth.

6. Bring your cluster up.

7. Restart your client applications.

8. List the ACLs again to verify the principals remain compatible.

   ```text
   confluent kafka acl list
   ```

   Compare the list to the one you created in step 1. If the principals remain the same, no changes to the ACLs are necessary. The ACLs continue to work as before and are evaluated based on the OAuth principal instead of the mTLS principal.

Alternatively, you can use [Ansible Playbooks](https://docs.confluent.io/ansible/current/overview.html) or [Confluent for Kubernetes](https://docs.confluent.io/operator/current/) to upgrade to 7.7 and migrate from mTLS to OAuth.

### Enable RBAC and Metadata Service (MDS)

Brokers are now ready to be RBAC-enabled. Perform these configuration updates for each broker and incrementally update all brokers using a rolling restart.

Configure each broker to use Confluent Server Authorizer. You must retain the ACL provider along with RBAC to ensure that existing ACLs are still applied. Configure at least one principal in `super.users` for brokers in the metadata cluster to enable role bindings to be created for other clusters. In this example, the user `admin` is granted access to create role bindings for any cluster.

```RST
authorizer.class.name=io.confluent.kafka.security.authorizer.ConfluentServerAuthorizer
confluent.authorizer.access.rule.providers=ZK_ACL,CONFLUENT
super.users=User:admin
```

Follow the instructions in [Configure Metadata Service (MDS) in Confluent Platform](../../../kafka/configure-mds/index.md#rbac-mds-config) to create a key pair for MDS. Configure MDS on the broker. You must update paths to the key files to match your setup.
```RST
confluent.metadata.server.listeners=http://0.0.0.0:8090
confluent.metadata.server.advertised.listeners=http://localhost:8090
confluent.metadata.server.authentication.method=BEARER
confluent.metadata.server.token.key.path=<path-to-token-key-pair.pem>
```

If you are using other Confluent Platform components, create a new listener to enable token-based authentication using MDS.

```RST
listeners=EXTERNAL://:9092,INTERNAL://:9093,TOKEN://:9094
advertised.listeners=EXTERNAL://localhost:9092,INTERNAL://localhost:9093,TOKEN://localhost:9094
listener.security.protocol.map=EXTERNAL:SASL_PLAINTEXT,INTERNAL:SASL_PLAINTEXT,TOKEN:SASL_PLAINTEXT
listener.name.token.sasl.enabled.mechanisms=OAUTHBEARER
listener.name.token.oauthbearer.sasl.jaas.config= \
  org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  publicKeyPath="/path/to/publickey.pem";
listener.name.token.oauthbearer.sasl.login.callback.handler.class=io.confluent.kafka.server.plugins.auth.token.TokenBearerServerLoginCallbackHandler
listener.name.token.oauthbearer.sasl.server.callback.handler.class=io.confluent.kafka.server.plugins.auth.token.TokenBearerValidatorCallbackHandler
```

If you are using [LDAP group-based authorization](../ldap/configure.md#kafka-ldap-config) in any of your clusters, you must [configure LDAP](../../csa-introduction.md#confluent-server-authorizer) for brokers running MDS. You must use the prefix `ldap.` in all the LDAP configs. For example:

```RST
ldap.java.naming.provider.url=ldap://LDAPSERVER.EXAMPLE.COM:3268/DC=EXAMPLE,DC=COM
```

If you are enabling RBAC in other Confluent Platform components, you should configure brokers running MDS with LDAP configs that match your LDAP server to enable centralized authentication using LDAP. Refer to [Configure LDAP Authentication](../../../kafka/configure-mds/ldap-auth-mds.md#ldap-auth-mds) for details.

If your metadata cluster has fewer than three brokers, adjust the replication factor for metadata topics. For example:

```RST
confluent.metadata.topic.replication.factor=2
confluent.license.replication.factor=2
```

### Roles for accessing topics, streams, and tables

Use the following Confluent CLI commands to give an interactive user the necessary roles for creating streams and tables.

SHOW or PRINT a topic :

- `ResourceOwner` role on the Kafka topic
- `DeveloperRead` role on the Schema Registry subject, if the topic has an Avro, Protobuf, or JSON_SR schema

This role enables an interactive user to display the specified topic by using the SHOW and PRINT statements. Also, users can CREATE streams and tables from these topics. The ksqlDB service principal doesn’t need a role on the topic for these statements.

#### NOTE
The subject’s name is the topic’s name appended with `-value`.

```bash
# Grant read-only access for a user to read a topic.
confluent iam rbac role-binding create \
  --principal User:$USER_NAME \
  --role ResourceOwner \
  --resource Topic:$TOPIC_NAME \
  $KAFKA_ID
```

```bash
# Grant read-only access for a user to read a subject.
confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role DeveloperRead \ --resource Subject:${TOPIC_NAME}-value $KAFKA_ID \ --schema-registry-cluster $SR_ID ``` SELECT from a stream or table : - `ResourceOwner` role on the source topic - `DeveloperRead` role on the `_confluent-ksql-${KSQLDB_ID}` consumer group - `ResourceOwner` role on the `_confluent-ksql-${KSQLDB_ID}transient` transient topics, for tables - `DeveloperRead` role on the Schema Registry subject, if the topic has an Avro, Protobuf, or JSON_SR schema - `ResourceOwner` role on the `_confluent-ksql-transient*` subjects, for tables that use Avro (not required for streams) These roles enable a user to read from a stream or a table by using the SELECT statement. If a SELECT statement contains a JOIN that uses an unauthorized topic, the SELECT fails with an authorization error. ```bash # Grant read-only access for a user to read a topic. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Topic:$TOPIC_NAME \ $KAFKA_ID ``` ```bash # For tables: grant access to the transient query topics. # This is a limitation of ksqlDB tables. Giving this permission to # the prefixed topics lets the user view tables from other queries. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Topic:_confluent-ksql-${KSQLDB_ID}transient \ --prefix \ $KAFKA_CLUSTER_ID # For tables that use Avro: grant access to the transient subjects. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Subject:_confluent-ksql-${KSQLDB_ID}transient \ --prefix \ $KAFKA_CLUSTER_ID \ --schema-registry-cluster $SR_ID ``` ```bash # Grant read-only access for a user to read from a consumer group. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role DeveloperRead \ --resource Group:_confluent-ksql-${KSQLDB_ID} \ --prefix \ $KAFKA_ID ``` ```bash # Grant read-only access for a user to read a subject. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role DeveloperRead \ --resource Subject:${TOPIC_NAME}-value $KAFKA_ID \ --schema-registry-cluster $SR_ID ``` Write to a topic with INSERT : - `DeveloperWrite` role on the Kafka topic - `ResourceOwner` or `DeveloperWrite` role on the Schema Registry subject, if the topic has an Avro, Protobuf, or JSON_SR schema These roles enable a user to write data by using INSERT statements. The INSERT INTO statement contains a SELECT clause that requires the user to have read permissions on the topic in the query. ```bash # Grant write access for a user to a topic. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role DeveloperWrite \ --resource Topic:$TOPIC_NAME \ $KAFKA_ID ``` ```bash # Grant full access for a user to create a subject and write to it. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Subject:${TOPIC_NAME}-value $KAFKA_ID \ --schema-registry-cluster $SR_ID ``` CREATE STREAM : - `ResourceOwner` role on the source topic - `DeveloperRead` role on the `_confluent-ksql-${KSQLDB_ID}` consumer groups - `ResourceOwner` or `DeveloperWrite` role on the Schema Registry subject, if the source topic has an Avro, Protobuf, or JSON_SR schema These roles enable an interactive user to register a stream or table on the specified topic by using the CREATE STREAM statement. 
If the topic has an Avro, Protobuf, or JSON_SR schema, the interactive user and the ksqlDB service principal must have full access for the subject in Schema Registry. ```bash # Grant read-only access for a user to a topic. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Topic:$TOPIC_NAME \ $KAFKA_ID # Grant read-only access for the ksql service principal to a topic. confluent iam rbac role-binding create \ --principal User:ksql \ --role ResourceOwner \ --resource Topic:$TOPIC_NAME \ $KAFKA_ID ``` ```bash # Grant read-only access for a user to the ksqlDB consumer groups. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role DeveloperRead \ --resource Group:_confluent-ksql-${KSQLDB_ID} \ --prefix \ $KAFKA_ID # Grant read-only access for the ksqlDB service principal to the ksqlDB consumer groups. confluent iam rbac role-binding create \ --principal User:ksql \ --role DeveloperRead \ --resource Group:_confluent-ksql-${KSQLDB_ID} \ --prefix \ $KAFKA_ID ``` ```bash # Grant full access for a user to create a subject and write to it. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Subject:${TOPIC_NAME}-value \ $KAFKA_ID \ --schema-registry-cluster $SR_ID # Grant full access for the ksqlDB service principal to create a subject and write to it. confluent iam rbac role-binding create \ --principal User:ksql \ --role ResourceOwner \ --resource Subject:${TOPIC_NAME}-value \ $KAFKA_ID \ --schema-registry-cluster $SR_ID ``` CREATE TABLE : - `ResourceOwner` role on the source topic - `ResourceOwner` role on the `_confluent-ksql-${KSQLDB_ID}transient` transient topics - `ResourceOwner` or `DeveloperWrite` role on the Schema Registry subject, if the source topic has an Avro, Protobuf, or JSON_SR schema - `ResourceOwner` role on `_confluent-ksql-${KSQLDB_ID}*` subjects (for tables that use Avro, Protobuf, or JSON_SR) These roles enable an interactive user to register a table on the specified topic by using the CREATE TABLE statement. If the topic has an Avro, Protobuf, or JSON_SR schema, the interactive user and the ksqlDB service principal must have full access for the subject in Schema Registry. #### NOTE The `ResourceOwner` role on the transient topics is a limitation of KSQL tables. Giving this permission to the prefixed topics lets the user view tables from other queries. ```bash # Grant read-only access for a user to a topic. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Topic:$TOPIC_NAME \ $KAFKA_ID # Grant read-only access for ksql service principal to a topic. confluent iam rbac role-binding create \ --principal User:ksql \ --role ResourceOwner \ --resource Topic:$TOPIC_NAME \ $KAFKA_ID ``` ```bash # Grant full access for a user to the transient query topics. # This is a limitation of ksqlDB tables. Giving this permission to # the prefixed topics lets the user view tables from other queries. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Topic:_confluent-ksql-${KSQLDB_ID}transient \ --prefix \ $KAFKA_ID # Grant full access for the ksql service principal to ksqlDB transient topics. confluent iam rbac role-binding create \ --principal User:ksql \ --role ResourceOwner \ --resource Topic:_confluent-ksql-${KSQLDB_ID}transient \ --prefix \ $KAFKA_ID ``` ```bash # Grant full access for a user to create a subject and write to it. 
confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Subject:${TOPIC_NAME}-value \ $KAFKA_ID \ --schema-registry-cluster $SR_ID # Grant full access for the ksql service principal to create a subject and write to it. confluent iam rbac role-binding create \ --principal User:ksql \ --role ResourceOwner \ --resource Subject:${TOPIC_NAME}-value \ $KAFKA_ID \ --schema-registry-cluster $SR_ID # For tables that use Avro, Protobuf, or JSON_SR: # Grant full access for the ksql service principal to all internal ksql subjects. confluent iam rbac role-binding create \ --principal User:ksql \ --role ResourceOwner \ --resource Subject:_confluent-ksql-${KSQLDB_ID} \ --prefix \ $KAFKA_ID \ --schema-registry-cluster $SR_ID ``` Create a stream or table with a persistent query : - `ResourceOwner` role on the source topic - `ResourceOwner` role on the ksqlDB sink topic - `ResourceOwner` or `DeveloperWrite` role on the Schema Registry sink subject, if the source topic has an Avro, Protobuf, or JSON_SR schema - `DeveloperRead` role on `_confluent-ksql-${KSQLDB_ID}*` subjects (for tables that use Avro, Protobuf, or JSON_SR) These roles enable a user to create streams and tables with persistent queries. Because ksqlDB creates a new *sink topic*, the user must have sufficient permissions to create, read, and write to the sink topic. The `ResourceOwner` role is necessary on the sink topic, because the interactive user and the ksqlDB service principal need permissions to create the sink topic if it doesn’t exist already. #### NOTE The sink topic has the same name as the stream or table and is all uppercase. If the topic has an Avro schema, the interactive user and the ksqlDB service principal must have `ResourceOwner` or `DeveloperWrite` permission on the sink topic’s subject in Schema Registry. For tables that are created with a persistent query and use Avro, Protobuf, or JSON_SR the ksqlDB service principal must have `DeveloperRead` permission on all internal subjects. #### NOTE The subject’s name is the sink topic’s name appended with `-value`. ```bash # Grant read-only access for a user to a topic. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Topic:$SOURCE_TOPIC_NAME \ $KAFKA_ID # Grant read-only access for the ksql service principal to a topic. confluent iam rbac role-binding create \ --principal User:ksql \ --role ResourceOwner \ --resource Topic:$SOURCE_TOPIC_NAME \ $KAFKA_ID ``` ```bash # Grant read-only access for a user to a topic. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Topic:$SINK_TOPIC_NAME \ $KAFKA_ID # Grant read-only access for the ksql service principal to a topic. confluent iam rbac role-binding create \ --principal User:ksql \ --role ResourceOwner \ --resource Topic:$SINK_TOPIC_NAME \ $KAFKA_ID ``` ```bash # Grant full access for a user to create a subject and write to it. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Subject:${SINK_TOPIC_NAME}-value \ $KAFKA_ID \ --schema-registry-cluster $SR_ID # Grant full access for the ksql service principal to create a subject and write to it. 
confluent iam rbac role-binding create \ --principal User:ksql \ --role ResourceOwner \ --resource Subject:${SINK_TOPIC_NAME}-value \ $KAFKA_ID \ --schema-registry-cluster $SR_ID # For tables that use Avro, Protobuf, or JSON_SR created with a persistent query: # Grant read access for the ksql service principal to all internal ksql subjects. confluent iam rbac role-binding create \ --principal User:ksql \ --role DeveloperRead \ --resource Subject:_confluent-ksql-${KSQLDB_ID} \ --prefix \ $KAFKA_ID \ --schema-registry-cluster $SR_ID ``` Full control over a topic and schema : - `ResourceOwner` role on the Kafka topic - `ResourceOwner` role on the Schema Registry subject, if the topic has an Avro, Protobuf, or JSON_SR schema Use the following commands to grant a user full control over a topic and its schema, including permissions to delete the topic and schema. ```bash # Grant full access for a user to manage a topic. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Topic:$TOPIC_NAME \ $KAFKA_ID ``` ```bash # Grant full access for a user to manage a subject. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role ResourceOwner \ --resource Subject:${TOPIC_NAME}-value $KAFKA_ID \ --schema-registry-cluster $SR_ID ``` Delete a topic : - `DeveloperManage` role on the Kafka topic - `DeveloperManage` role on the Schema Registry subject, if the topic has an Avro, Protobuf, or JSON_SR schema These roles enable a user to delete a topic by using the DROP STREAM/TABLE [DELETE TOPIC] statements. Use the following commands to grant a user delete access to a topic and corresponding schema. ```bash # Grant delete access for a user to a topic. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role DeveloperManage \ --resource Topic:$TOPIC_NAME \ $KAFKA_ID ``` ```bash # Grant delete access for a user to a subject. confluent iam rbac role-binding create \ --principal User:$USER_NAME \ --role DeveloperManage \ --resource Subject:${TOPIC_NAME}-value $KAFKA_ID \ --schema-registry-cluster $SR_ID ``` ### POST /security/1.0/principals/{principal}/roles/{roleName} **Binds the principal to a cluster-scoped role for a specific cluster or in the given scope.** Callable by Admins. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. * **roleName** (*string*) – The name of the cluster-scoped role to bind the user to. **Example request:** ```http POST /security/1.0/principals/{principal}/roles/{roleName} HTTP/1.1 Host: example.com Content-Type: application/json { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – Role Granted * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### DELETE /security/1.0/principals/{principal}/roles/{roleName} **Remove the role (cluster or resource scoped) from the principal at the given scope/cluster.** No-op if the user doesn’t have the role. Callable by Admins. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. 
* **roleName** (*string*) – The name of the role. **Example request:** ```http DELETE /security/1.0/principals/{principal}/roles/{roleName} HTTP/1.1 Host: example.com Content-Type: application/json { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – Role removal processed. * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/principals/{principal}/roles/{roleName}/bindings **Incrementally grant the resources to the principal at the given scope/cluster using the given role.** Callable by Admins+ResourceOwners. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. * **roleName** (*string*) – The name of the role. **Example request:** ```http POST /security/1.0/principals/{principal}/roles/{roleName}/bindings HTTP/1.1 Host: example.com Content-Type: application/json { "scope": { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } }, "resourcePatterns": [ { "resourceType": "string", "name": "string", "patternType": "string" } ] } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – Granted * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### DELETE /security/1.0/principals/{principal}/roles/{roleName}/bindings **Incrementally remove the resources from the principal at the given scope/cluster using the given role.** Callable by Admins+ResourceOwners. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. * **roleName** (*string*) – The name of the role. **Example request:** ```http DELETE /security/1.0/principals/{principal}/roles/{roleName}/bindings HTTP/1.1 Host: example.com Content-Type: application/json { "scope": { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } }, "resourcePatterns": [ { "resourceType": "string", "name": "string", "patternType": "string" } ] } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – Resources Removed * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### PUT /security/1.0/principals/{principal}/roles/{roleName}/bindings **Overwrite existing resource grants.** Callable by Admins+ResourceOwners. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. * **roleName** (*string*) – The name of the role. 
**Example request:** ```http PUT /security/1.0/principals/{principal}/roles/{roleName}/bindings HTTP/1.1 Host: example.com Content-Type: application/json { "scope": { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } }, "resourcePatterns": [ { "resourceType": "string", "name": "string", "patternType": "string" } ] } ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – Resources Set * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/lookup/principals/{principal}/roleNames **Returns the effective list of role names for a principal.** For groups, these are the roles that are bound. For users, this is the combination of roles granted to the specific user and roles granted to the user’s groups. Callable by Admins+User. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. **Example request:** ```http POST /security/1.0/lookup/principals/{principal}/roleNames HTTP/1.1 Host: example.com Content-Type: application/json { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – List of role names. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json [ "Cluster Admin", "Security Admin" ] ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/lookup/role/{roleName} **Look up the KafkaPrincipals who have the given role for the given scope.** Callable by Admins. * **Parameters:** * **roleName** (*string*) – Role name to look up. **Example request:** ```http POST /security/1.0/lookup/role/{roleName} HTTP/1.1 Host: example.com Content-Type: application/json { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – List of fully-qualified KafkaPrincipals. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json [ "User:alice", "Group:FinanceAdmin" ] ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/lookup/role/{roleName}/resource/{resourceType}/name/{resourceName} **Look up the KafkaPrincipals who have the given role on the specified resource for the given scope.** Callable by Admins. * **Parameters:** * **roleName** (*string*) – Role name to look up. * **resourceType** (*string*) – Type of resource to look up. 
* **resourceName** (*string*) – Name of resource to look up. **Example request:** ```http POST /security/1.0/lookup/role/{roleName}/resource/{resourceType}/name/{resourceName} HTTP/1.1 Host: example.com Content-Type: application/json { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – List of fully-qualified KafkaPrincipals. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json [ "User:alice", "Group:FinanceAdmin" ] ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### GET /security/1.0/registry/clusters **Returns a list of all clusters in the registry, optionally filtered by cluster type.** If the calling principal doesn’t have permissions to see the full cluster info, some information (“hosts”, “protocol”, etc) is redacted. Callable by Admins+User. * **Query Parameters:** * **clusterType** (*string*) – Optionally filter down by cluster type. **Example request:** ```http GET /security/1.0/registry/clusters HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – List of Clusters. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json [ { "clusterName": "string", "scope": { "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } }, "hosts": [ { "host": "string", "port": 1 } ], "protocol": "SASL_PLAINTEXT" } ] ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/registry/clusters **Define/overwrite named clusters.** May result in a 409 Conflict if the name and scope combination of any cluster conflicts with existing clusters in the registry. Callable by Admins. **Example request:** ```http POST /security/1.0/registry/clusters HTTP/1.1 Host: example.com Content-Type: application/json [ { "clusterName": "string", "scope": { "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } }, "hosts": [ { "host": "string", "port": 1 } ], "protocol": "SASL_PLAINTEXT" } ] ``` * **Status Codes:** * [204 No Content](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5) – Clusters added. * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### GET /security/1.0/registry/clusters/{clusterName} **Returns the information for a single named cluster, assuming the cluster exists and is visible to the calling principal.** Callable by Admins+User. * **Parameters:** * **clusterName** (*string*) – The name of cluster (ASCII printable characters without spaces). 
**Example request:** ```http GET /security/1.0/registry/clusters/{clusterName} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The named cluster, if it exists and the caller has permission to see it. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "clusterName": "string", "scope": { "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } }, "hosts": [ { "host": "string", "port": 1 } ], "protocol": "SASL_PLAINTEXT" } ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### GET /security/1.0/lookup/managed/clusters/principal/{principal} **Identifies the scopes for the rolebindings that a user can see.** May include rolebindings from scopes and clusters that never existed or previously existed (in other words, rolebindings that have been decommissioned, but are still defined in the system). Callable by Admins+User. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. * **Query Parameters:** * **clusterType** (*string*) – Filter down by cluster type. **Example request:** ```http GET /security/1.0/lookup/managed/clusters/principal/{principal} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – List of Scopes **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json [ { "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ] ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### GET /security/1.0/lookup/rolebindings/principal/{principal} **List all rolebindings for the specified principal for all scopes and clusters that have any rolebindings.** Be aware that this simply looks at the rolebinding data, and does not mean that the clusters actually exist. Callable by Admins+User. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. * **Query Parameters:** * **clusterType** (*string*) – Filter down by a cluster type. **Example request:** ```http GET /security/1.0/lookup/rolebindings/principal/{principal} HTTP/1.1 Host: example.com ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – List of RoleBindings for the user per scope.
**Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json [ { "scope": { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } }, "rolebindings": {} } ] ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/lookup/rolebindings/principal/{principal} **List all rolebindings for the specified principal and scope.** Callable by Admins+User. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. **Example request:** ```http POST /security/1.0/lookup/rolebindings/principal/{principal} HTTP/1.1 Host: example.com Content-Type: application/json { "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – Item per Scope **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "scope": { "clusterName": "string", "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } }, "rolebindings": {} } ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/lookup/managed/clusters/principal/{principal} **Identify the rolebinding abilities (view vs manage) the user has on the specified scope.** Used by the Confluent Control Center UI to control access to rolebinding add/remove buttons. Callable by Admins+ResourceOwners. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. **Example request:** ```http POST /security/1.0/lookup/managed/clusters/principal/{principal} HTTP/1.1 Host: example.com Content-Type: application/json { "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The rolebinding abilities the user has for a specified scope. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "cluster": [ "string" ], "resources": {} } ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/lookup/managed/rolebindings/principal/{principal} **Identify the rolebindings this user can see and manage.** Callable by Admins+ResourceOwners. * **Parameters:** * **principal** (*string*) – Fully-qualified KafkaPrincipal string for a user or group. * **Query Parameters:** * **resourceType** (*string*) – Filter down by resource type.
**Example request:** ```http POST /security/1.0/lookup/managed/rolebindings/principal/{principal} HTTP/1.1 Host: example.com Content-Type: application/json { "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – Rolebindings that the user can manage. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json { "scope": { "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } }, "cluster_role_bindings": {}, "resource_role_bindings": {} } ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ``` ### POST /security/1.0/rbac/principals **List of MDS cached users and groups.** For use by a rolebinding admin on the provided scope. Callable by Admins+ResourceOwners, but not broker super.users. * **Query Parameters:** * **type** (*string*) – The type of principals requested. **Example request:** ```http POST /security/1.0/rbac/principals HTTP/1.1 Host: example.com Content-Type: application/json { "clusters": { "kafka-cluster": "string", "connect-cluster": "string", "ksql-cluster": "string", "schema-registry-cluster": "string", "cmf": "string", "flink-environment": "string" } } ``` * **Status Codes:** * [200 OK](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1) – The list of principals for the requested type, or all principals. **Example response:** ```http HTTP/1.1 200 OK Content-Type: application/json [ "Group:admin", "Group:developers", "Group:users", "User:alice", "User:bob", "User:charlie", "User:david" ] ``` * *default* – Error Response **Example response:** ```http HTTP/1.1 default - Content-Type: application/json { "status_code": 1, "error_code": 1, "type": "string", "message": "string", "errors": [ { "error_type": "string", "message": "string" } ] } ```

## Grant topic permissions

To interact with topics using the [Kafka CLI tools](../../../tools/cli-reference.md#cp-all-cli), you must provide a JAAS configuration that enables the Kafka CLI tools to authenticate with a broker. You can provide the JAAS configuration using a file (`--command-config`) or using the command line options `--producer-property` or `--consumer-property` for the producer or consumer. This configuration is required for creating topics, producing, consuming, and more. For example:

```none
kafka-console-producer --producer-property sasl.mechanism=OAUTHBEARER
```

The value you specify in `sasl.mechanism` depends on your broker’s security configuration for the port. In this case, OAUTHBEARER is used because it is the default configuration in the automated RBAC demo. However, you can use any authentication mechanism exposed by your broker.

```bash
# Grant read-only access for a user to a topic.
confluent iam rbac role-binding create \
   --principal User:<user-name> \
   --role DeveloperRead \
   --resource Topic:<topic-name> \
```

When creating role bindings for Schema Registry, ksqlDB, and Connect, you must provide two identifiers: the Kafka cluster identifier and an additional component cluster identifier.
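For example, a role binding on a Schema Registry subject must pass both the Kafka cluster identifier and the Schema Registry cluster identifier. The following is a minimal sketch that reuses the variable conventions (`$USER_NAME`, `$TOPIC_NAME`, `$KAFKA_ID`, `$SR_ID`) from the role-binding examples earlier in this section; it assigns the `DeveloperWrite` role on the subject for a topic:

```bash
# Grant write access for a user to a subject in a Schema Registry cluster.
# Both the Kafka cluster identifier and the Schema Registry cluster
# identifier are provided, following the pattern shown earlier.
confluent iam rbac role-binding create \
   --principal User:$USER_NAME \
   --role DeveloperWrite \
   --resource Subject:${TOPIC_NAME}-value \
   $KAFKA_ID \
   --schema-registry-cluster $SR_ID
```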
## Prerequisites - [Migrate legacy audit log configurations](audit-logs-cli-config.md#migrate-kafka-cluster-audit-log-configs) from all of your Kafka clusters into a combined JSON policy. #### IMPORTANT You must satisfy this prerequisite before registering your Kafka clusters. - [Register](../../cluster-registry.md#cluster-registry-registering) all of your Kafka clusters, including the MDS cluster, in the [Cluster Registry in Confluent Platform](../../cluster-registry.md#cluster-registry). #### NOTE MDS cluster registration does not occur by default. You must explicitly register the MDS cluster in the cluster registry before registering other clusters. - Configure all of your registered clusters to use the same MDS for RBAC. - The MDS cluster uses the admin client to communicate with registered clusters (managed clusters). Ensure that the MDS admin client can connect to all of your registered clusters by having them expose an authentication token listener (for example, `listener.name.external.sasl.enabled.mechanisms=OAUTHBEARER`), and registering that listener’s port in the cluster registry. When using SASL_SSL, only use TLS keys that are verifiable by certificates in your client trust stores. - Configure a cluster to receive the audit logs. Set up an audit log writer user (with a name like `auditlogwriter`) on that cluster with the ability to write to the destination topics. For example, grant the DeveloperWrite role on the topic prefix `confluent-audit-log-events`. For details, refer to [Configure the audit log writer to the destination cluster](#mds-cluster-exporter-config). - Grant the [AuditAdmin](../../authorization/rbac/rbac-predefined-roles.md#rbac-predefined-roles) role on all your Kafka clusters to users or groups who will be managing the audit log configuration. #### NOTE The recommended way to grant permissions is for the audit log administrator to run any of the Confluent CLI [confluent audit-log](https://docs.confluent.io/confluent-cli/current/command-reference/audit-log/index.html) commands. If the administrator does not yet have the required permissions, the error message returns a list of recommended role bindings to grant to the user: ```none confluent login --url "http://mds.example.com:8090" # authenticate as user "alice" confluent audit-log config describe Error: 403 Forbidden User:alice not permitted to DescribeConfigs on one or more clusters. Fix it: confluent iam rbac role-binding create --role AuditAdmin --principal User:alice --kafka-cluster DBS26_qTQ-mT23p5opUK_g confluent iam rbac role-binding create --role AuditAdmin --principal User:alice --kafka-cluster prz9a_-xqqlRgmekDoLw4U ``` ### Docker configuration When you enable security for the Confluent Platform, you must pass secrets (for example, credentials, certificates, keytabs, Kerberos config) to the container. The images handle this by using the credentials available in the secrets directory. The containers specify a Docker volume for secrets, which the admin must map to a directory on the host that contains the required secrets.
For example, if the `securities.properties` file is located on the host in `/scripts/security`, and you want it mounted at `/secrets` in the Docker container, then you would specify: ```yaml volumes: - ./scripts/security:/secrets ``` To configure secrets protection in Docker images, you must manually add the following configuration to the `docker-compose.yml` file: ```yaml CONFLUENT_SECURITY_MASTER_KEY: <COMPONENT>_CONFIG_PROVIDERS: "securepass" <COMPONENT>_CONFIG_PROVIDERS_SECUREPASS_CLASS: "io.confluent.kafka.security.config.provider.SecurePassConfigProvider" ``` `<COMPONENT>` can be any of the following: - `KAFKA` - `KSQL` - `CONNECT` - `SCHEMA_REGISTRY` - `CONTROL_CENTER` For details about Docker configuration options, refer to [Docker Image Configuration Reference for Confluent Platform](../../../installation/docker/config-reference.md#config-reference). For a Confluent Server broker, your configuration should look like the following: ```yaml CONFLUENT_SECURITY_MASTER_KEY: KAFKA_CONFIG_PROVIDERS: "securepass" KAFKA_CONFIG_PROVIDERS_SECUREPASS_CLASS: "io.confluent.kafka.security.config.provider.SecurePassConfigProvider" ``` ## Opening multiple ports Alternatively, you might choose to open multiple ports so that different protocols can be used for broker-broker and broker-client communication. If you want to use TLS encryption throughout (for example, for broker-broker and broker-client communication), but also want to add SASL authentication to the broker-client connection: 1. Open two additional ports during the first restart: ```bash listeners=PLAINTEXT://broker1:9091,SSL://broker1:9092,SASL_SSL://broker1:9093 ``` 2. Restart the Kafka clients again, changing their configuration to point at the newly-opened, SASL and TLS secured port: ```bash bootstrap.servers=[broker1:9093,...] security.protocol=SASL_SSL ...etc ``` For more details, refer to [SASL](authentication/overview.md#kafka-sasl-auth). 3. The second server restart would switch the cluster to use encrypted broker-broker communication using the TLS port you previously opened on port 9092: ```bash listeners=PLAINTEXT://broker1:9091,SSL://broker1:9092,SASL_SSL://broker1:9093 security.inter.broker.protocol=SSL ``` 4. The final restart secures the cluster by closing the `PLAINTEXT` port: ```bash listeners=SSL://broker1:9092,SASL_SSL://broker1:9093 security.inter.broker.protocol=SSL ``` ### Step 3 - Start the consumer with decryption To start the consumer with decryption, run the `kafka-avro-console-consumer` command for the KMS provider that you want to use, where `<bootstrap-url>` is the bootstrap URL for your Confluent Platform cluster. ```shell ./bin/kafka-avro-console-consumer --bootstrap-server <bootstrap-url> \ --topic test \ --property schema.registry.url=<schema-registry-url> \ --consumer.config config.properties ``` After you run the producer and consumer, you can verify that the data is encrypted and decrypted by using the `kafka-configs --describe` command for the topic. ```shell kafka-configs --bootstrap-server <bootstrap-url> \ --entity-type topics \ --entity-name test \ --describe ``` Example test records should look like this: ```text {"f1": "foo"} {"f1": "foo", "f2": {"string": "bar"}} ``` ## Configure Confluent Server brokers Administrators can configure a mix of secure and unsecured clients. This tutorial ensures that all broker/client and interbroker network communication is encrypted in the following manner: * All broker/client communication uses the `SASL_SSL` security protocol, which ensures that the communication is encrypted and authenticated using SASL/PLAIN.
* All interbroker communication uses the `SSL` security protocol, which ensures that the communication is encrypted and authenticated using TLS. * The unsecured `PLAINTEXT` port is not enabled. The steps are as follows: 1. Enable the desired security protocols and ports in each Confluent Server broker’s `server.properties`. Notice that both `SSL` and `SASL_SSL` are enabled. ```bash listeners=SSL://:9093,SASL_SSL://:9094 # KRaft-specific configurations for the broker role # process.roles should be 'broker' for a dedicated broker node, 'controller' for a dedicated controller node, # or 'broker,controller' for a combined node. process.roles=broker,controller node.id={unique_node_id} # Unique ID for this broker/controller node # The list of controller nodes in the KRaft quorum. # Format: {node_id}@{host}:{port},{node_id}@{host}:{port},... # For example: 1@controller1:9093,2@controller2:9093,3@controller3:9093 controller.quorum.voters={node_id_1}@{host_1}:{port_1},{node_id_2}@{host_2}:{port_2},{node_id_3}@{host_3}:{port_3} ``` 2. To enable the Confluent Server brokers to authenticate each other using mutual TLS (mTLS) authentication, you need to configure all the Confluent Server brokers for client authentication (in this case, the requesting broker is the “client”). We recommend setting `ssl.client.auth=required`. We discourage configuring it as `requested` because misconfigured brokers will still connect successfully and it provides a false sense of security. ```bash security.inter.broker.protocol=SSL ssl.client.auth=required ``` 3. Define the TLS/SSL truststore, keystore, and password in the `server.properties` file of every Confluent Server broker. Because this stores passwords directly in the Confluent Server broker configuration file, it is important to restrict access to these files using file system permissions. ```bash ssl.truststore.location=/var/ssl/private/kafka.server.truststore.jks ssl.truststore.password=test1234 ssl.keystore.location=/var/ssl/private/kafka.server.keystore.jks ssl.keystore.password=test1234 ssl.key.password=test1234 ``` 4. Enable SASL/PLAIN mechanism in the `server.properties` file of every broker. ```bash sasl.enabled.mechanisms=PLAIN ``` 5. Create the broker’s JAAS configuration file in each Confluent Server broker’s `config` directory, let’s call it `kafka_server_jaas.conf` for this example. * Configure a `KafkaServer` section used when the broker validates client connections, including those from other brokers. The broker properties `username` and `password` are used to initiate connections to other brokers, and in this example, `kafkabroker` is the user for interbroker communication. The `user_{userName}` property set defines the passwords for all other clients that connect to the broker. In this example, there are three users: `kafkabroker`, `kafka-broker-metric-reporter`, and `client`. #### NOTE Note the two semicolons in each section. ```bash KafkaServer { org.apache.kafka.common.security.plain.PlainLoginModule required username="kafkabroker" password="kafkabroker-secret" user_kafkabroker="kafkabroker-secret" user_kafka-broker-metric-reporter="kafkabroker-metric-reporter-secret" user_client="client-secret"; }; ``` 6. If you are using Confluent Control Center to monitor your deployment, and if the monitoring cluster backing Confluent Control Center is also configured with the same security protocols, you must configure the Confluent Metrics Reporter for security as well. Add these configurations to the `server.properties` file of each Confluent Server broker.
```bash metric.reporters=io.confluent.metrics.reporter.ConfluentMetricsReporter confluent.metrics.reporter.security.protocol=SASL_SSL confluent.metrics.reporter.ssl.truststore.location=/var/ssl/private/kafka.server.truststore.jks confluent.metrics.reporter.ssl.truststore.password=test1234 confluent.metrics.reporter.sasl.mechanism=PLAIN confluent.metrics.reporter.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="kafka-broker-metric-reporter" \ password="kafka-broker-metric-reporter-secret"; ``` 7. To enable ACLs, we need to configure an authorizer. Kafka provides a simple authorizer implementation, and to use it, you can add the following to `server.properties`: ```shell authorizer.class.name=kafka.security.authorizer.AclAuthorizer ``` 8. The default behavior is such that if a resource has no associated ACLs, then no one is allowed to access the resource, except super users. Setting Confluent Server broker principals as super users is a convenient way to give them the required access to perform interbroker operations. Because this tutorial configures the interbroker security protocol as SSL, set the super user name to be the `distinguished name` configured in the broker’s certificate. (See other [authorization configuration options](authorization/acls/overview.md#kafka-auth-superuser)). ```bash super.users=User:<broker1-dn>;User:<broker2-dn>;User:<broker3-dn>;User:<controller-dn>;User:kafka-broker-metric-reporter ``` Combining the configuration steps described above gives the complete contents of each Confluent Server broker’s `server.properties` file. ### Configure Console Producer and Consumer The command line tools for console producer and consumer are convenient ways to send and receive a small amount of data to the cluster. They are clients and thus need security configurations as well. 1. Create a `client_security.properties` file with the security configuration parameters described above, with no additional configuration prefix. ```bash security.protocol=SASL_SSL ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks ssl.truststore.password=test1234 sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="client" \ password="client-secret"; ``` 2. Pass in the properties file when using the command line tools. ```bash kafka-console-producer --bootstrap-server kafka1:9094 --topic test-topic --producer.config client_security.properties kafka-console-consumer --bootstrap-server kafka1:9094 --topic test-topic --consumer.config client_security.properties ``` ## Replicator Confluent Replicator is a type of Confluent Platform source connector that replicates data from a source to a destination Confluent Platform cluster. An embedded consumer inside Replicator consumes data from the source cluster, and an embedded producer inside the Kafka Connect worker produces data to the destination cluster.
Take the basic client security configuration: ```bash security.protocol=SASL_SSL ssl.truststore.location=/var/ssl/private/kafka.client.truststore.jks ssl.truststore.password=test1234 sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \ username="client" \ password="client-secret"; ``` And configure Replicator for the following: * Top-level Replicator consumer from the origin cluster, with an additional configuration prefix `src.kafka.` Combining the configuration steps described above, the Replicator JSON properties file contains the following configuration settings: ```bash { "name":"replicator", "config":{ .... "src.kafka.security.protocol" : "SASL_SSL", "src.kafka.ssl.truststore.location" : "var/private/ssl/kafka.server.truststore.jks", "src.kafka.ssl.truststore.password" : "test1234", "src.kafka.sasl.mechanism" : "PLAIN", "src.kafka.sasl.jaas.config" : "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"replicator\" password=\"replicator-secret\";", .... } } ``` #### default.timestamp.extractor A timestamp extractor pulls a timestamp from an instance of [ConsumerRecord](/platform/current/clients/javadocs/javadoc/org/apache/kafka/clients/consumer/ConsumerRecord.html). Timestamps are used to control the progress of streams. The default extractor is [FailOnInvalidTimestamp](/platform/current/streams/javadocs/javadoc/org/apache/kafka/streams/processor/FailOnInvalidTimestamp.html). This extractor retrieves built-in timestamps that are automatically embedded into Kafka messages by the Kafka producer client since [Kafka version 0.10](https://cwiki.apache.org/confluence/x/eaSnAw). Depending on the setting of Kafka’s server-side `log.message.timestamp.type` broker and `message.timestamp.type` topic parameters, this extractor provides you with: * **event-time** processing semantics if `log.message.timestamp.type` is set to `CreateTime` aka “producer time” (which is the default). This represents the time when a Kafka producer sent the original message. If you use Kafka’s official producer client or one of Confluent’s producer clients, the timestamp represents milliseconds since the epoch. * **ingestion-time** processing semantics if `log.message.timestamp.type` is set to `LogAppendTime` aka “broker time”. This represents the time when the Kafka broker received the original message, in milliseconds since the epoch. The `FailOnInvalidTimestamp` extractor throws an exception if a record contains an invalid, that is, negative, built-in timestamp, because Kafka Streams would not process this record but silently drop it. Invalid built-in timestamps can occur for various reasons: if, for example, you consume a topic that is written to by pre-0.10 Kafka producer clients or by third-party producer clients that don’t support the new Kafka 0.10 message format yet; another situation in which this may happen is after upgrading your Kafka cluster from `0.9` to `0.10`, where all the data that was generated with `0.9` does not include the `0.10` message timestamps. If you have data with invalid timestamps and want to process it, then there are two alternative extractors available. Both work on built-in timestamps, but handle invalid timestamps differently. 
* [LogAndSkipOnInvalidTimestamp](/platform/current/streams/javadocs/javadoc/org/apache/kafka/streams/processor/LogAndSkipOnInvalidTimestamp.html): This extractor logs a warning message and returns the invalid timestamp to Kafka Streams, which will not process but silently drop the record. This log-and-skip strategy allows Kafka Streams to make progress instead of failing if there are records with an invalid built-in timestamp in your input data. * [UsePartitionTimeOnInvalidTimestamp](/platform/current/streams/javadocs/javadoc/org/apache/kafka/streams/processor/UsePartitionTimeOnInvalidTimestamp.html): This extractor returns the record’s built-in timestamp if it is valid, that is, not negative. If the record does not have a valid built-in timestamp, the extractor returns the previously extracted valid timestamp from a record of the same topic partition as the current record as a timestamp estimation. If no timestamp can be estimated, it throws an exception. Another built-in extractor is [WallclockTimestampExtractor](/platform/current/streams/javadocs/javadoc/org/apache/kafka/streams/processor/WallclockTimestampExtractor.html). This extractor does not actually “extract” a timestamp from the consumed record but rather returns the current time in milliseconds from the system clock (`System.currentTimeMillis()`), which effectively means Kafka Streams operates on the basis of the so-called **processing-time** of events. You can also provide your own timestamp extractors, for instance to retrieve timestamps embedded in the payload of messages. If you can’t extract a valid timestamp, you can either throw an exception, return a negative timestamp, or estimate a timestamp. Returning a negative timestamp results in data loss, as the corresponding record isn’t processed, but instead, it’s dropped silently. If you want to estimate a new timestamp, you can use the value provided by `previousTimestamp`, that is, a Kafka Streams timestamp estimation. Here is an example of a custom `TimestampExtractor` implementation: ```java import org.apache.kafka.clients.consumer.ConsumerRecord; import org.apache.kafka.streams.processor.TimestampExtractor; // Extracts the embedded timestamp of a record (giving you "event-time" semantics). public class MyEventTimeExtractor implements TimestampExtractor { @Override public long extract(final ConsumerRecord<Object, Object> record, final long previousTimestamp) { // `Foo` is your own custom class, which we assume has a method that returns // the embedded timestamp (milliseconds since midnight, January 1, 1970 UTC). long timestamp = -1; final Foo myPojo = (Foo) record.value(); if (myPojo != null) { timestamp = myPojo.getTimestampInMillis(); } if (timestamp < 0) { // Invalid timestamp! Attempt to estimate a new timestamp, // otherwise fall back to wall-clock time (processing-time). if (previousTimestamp >= 0) { return previousTimestamp; } else { return System.currentTimeMillis(); } } return timestamp; } } ``` You would then define the custom timestamp extractor in your Kafka Streams configuration as follows: ```java import java.util.Properties; import org.apache.kafka.streams.StreamsConfig; Properties streamsConfiguration = new Properties(); streamsConfiguration.put(StreamsConfig.DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG, MyEventTimeExtractor.class); ``` ## RBAC role bindings Kafka Streams supports role-based access control (RBAC) for controlling access to resources in your Kafka clusters. The following table shows required RBAC roles for access to cluster resources.
| Resource | Role | Command | Notes | |------------------------------------------|--------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------| | Input topic | `DeveloperRead` | ```bash confluent iam rbac role-binding create \ --principal User: \ --role DeveloperRead \ --resource Topic: \ --kafka-cluster ``` | | | Output topic | `DeveloperWrite` | ```bash confluent iam rbac role-binding create \ --principal User: \ --role DeveloperWrite \ --resource Topic: \ --kafka-cluster ``` | | | Internal topic | `ResourceOwner` | ```bash confluent iam rbac role-binding create \ --principal User: \ --role ResourceOwner \ --prefix \ --resource Topic: \ --kafka-cluster ``` | Required on all internal topics for internal topic management, for example, internal delete calls. | | Idempotent Producer | `DeveloperWrite` | ```bash confluent iam rbac role-binding create \ --principal User: \ --role DeveloperWrite \ --resource Cluster: \ --kafka-cluster ``` | The role binding is on the cluster, because no topic is involved. | | Transactional Producer | `DeveloperWrite` | ```bash confluent iam rbac role-binding create \ --principal User: \ --role DeveloperWrite \ --prefix \ --resource Transactional-Id: \ --kafka-cluster ``` | When `processing.guarantee` is set to `exactly_once` or `exactly_once_v2`. | | Consumer group | `DeveloperRead` | ```bash confluent iam rbac role-binding create \ --principal User: \ --role DeveloperRead \ --prefix \ --resource Group: \ --kafka-cluster ``` | | | Schema Registry with input/output topics | `DeveloperRead` | ```bash confluent iam rbac role-binding create \ --principal User: \ --role DeveloperRead \ --prefix \ --resource Subject: \ --kafka-cluster ``` | The resource also may be `Subject:`. | | Schema Registry with internal topics | `ResourceOwner` | ```bash confluent iam rbac role-binding create \ --principal User: \ --role ResourceOwner \ --prefix \ --resource Subject: \ --kafka-cluster ``` | If internal topic schema usage is enabled. | | Confluent Cloud governance features | `DataDiscovery` or `DataSteward` | ```bash confluent iam rbac role-binding create \ --principal User: \ --role DataDiscovery \ --prefix \ --resource Subject: \ --kafka-cluster ``` | Required for Confluent Cloud governance features, like Stream Catalog, for searching, tagging, or managing business metadata topics. | # Kafka Streams Quick Start for Confluent Platform Confluent for VS Code provides project scaffolding for many different Apache Kafka® clients, including Kafka Streams. The generated project has everything you need to compile and run a simple Kafka Streams application that you can extend with your code. This guide shows you how to build a Kafka Streams application that connects to a Kafka cluster. You’ll learn how to: - Create a Kafka Streams project using Confluent for VS Code - Process streaming data with Kafka Streams operations - Run your application in a Docker container Confluent for VS Code generates a project for a Kafka Streams application that consumes messages from an input topic and produces messages to an output topic by using the following code. 
```java builder.stream(INPUT_TOPIC, Consumed.with(stringSerde, stringSerde)) .peek((k, v) -> LOG.info("Received raw event: {}", v)) .mapValues(value -> generateEnrichedEvent()) .peek((k, v) -> LOG.info("Generated enriched event: {}", v)) .to(OUTPUT_TOPIC, Produced.with(stringSerde, stringSerde)); ``` ### Standalone REST Proxy For the next few steps, use the REST Proxy that is running as a standalone service. 1. Use the standalone REST Proxy to try to produce a message to the topic `users`, referencing schema ID `9`. This schema was registered in Schema Registry in the previous section. It should fail due to an authorization error. ```text docker compose exec restproxy curl -X POST \ -H "Content-Type: application/vnd.kafka.avro.v2+json" \ -H "Accept: application/vnd.kafka.v2+json" \ --cert /etc/kafka/secrets/restproxy.certificate.pem \ --key /etc/kafka/secrets/restproxy.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ --data '{"value_schema_id": 9, "records": [{"value": {"user":{"userid": 1, "username": "Bunny Smith"}}}]}' \ -u appSA:appSA \ https://restproxy:8086/topics/users ``` Your output should resemble: ```JSON {"offsets":[{"partition":null,"offset":null,"error_code":40301,"error":"Not authorized to access topics: [users]"}],"key_schema_id":null,"value_schema_id":9} ``` 2. Create a role binding for the client permitting it to produce to the topic `users`. Get the Kafka cluster ID: ```none KAFKA_CLUSTER_ID=$(curl -s https://localhost:8091/v1/metadata/id --tlsv1.2 --cacert scripts/security/snakeoil-ca-1.crt | jq -r ".id") ``` Create the role binding: ```text # Create the role binding for the topic users docker compose exec tools bash -c "confluent iam rbac role-binding create \ --principal User:appSA \ --role DeveloperWrite \ --resource Topic:users \ --kafka-cluster-id $KAFKA_CLUSTER_ID" ``` 3. Again try to produce a message to the topic `users`. It should pass this time. ```text docker compose exec restproxy curl -X POST \ -H "Content-Type: application/vnd.kafka.avro.v2+json" \ -H "Accept: application/vnd.kafka.v2+json" \ --cert /etc/kafka/secrets/restproxy.certificate.pem \ --key /etc/kafka/secrets/restproxy.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ --data '{"value_schema_id": 9, "records": [{"value": {"user":{"userid": 1, "username": "Bunny Smith"}}}]}' \ -u appSA:appSA \ https://restproxy:8086/topics/users ``` Your output should resemble: ```JSON {"offsets":[{"partition":1,"offset":0,"error_code":null,"error":null}],"key_schema_id":null,"value_schema_id":9} ``` 4. Create consumer instance `my_avro_consumer`. ```text docker compose exec restproxy curl -X POST \ -H "Content-Type: application/vnd.kafka.v2+json" \ --cert /etc/kafka/secrets/restproxy.certificate.pem \ --key /etc/kafka/secrets/restproxy.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ --data '{"name": "my_consumer_instance", "format": "avro", "auto.offset.reset": "earliest"}' \ -u appSA:appSA \ https://restproxy:8086/consumers/my_avro_consumer ``` Your output should resemble: ```text {"instance_id":"my_consumer_instance","base_uri":"https://restproxy:8086/consumers/my_avro_consumer/instances/my_consumer_instance"} ``` 5. Subscribe `my_avro_consumer` to the `users` topic.
```text docker compose exec restproxy curl -X POST \ -H "Content-Type: application/vnd.kafka.v2+json" \ --cert /etc/kafka/secrets/restproxy.certificate.pem \ --key /etc/kafka/secrets/restproxy.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ --data '{"topics":["users"]}' \ -u appSA:appSA \ https://restproxy:8086/consumers/my_avro_consumer/instances/my_consumer_instance/subscription ``` 6. Try to consume messages for `my_avro_consumer` subscriptions. It should fail due to an authorization error. ```text docker compose exec restproxy curl -X GET \ -H "Accept: application/vnd.kafka.avro.v2+json" \ --cert /etc/kafka/secrets/restproxy.certificate.pem \ --key /etc/kafka/secrets/restproxy.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ -u appSA:appSA \ https://restproxy:8086/consumers/my_avro_consumer/instances/my_consumer_instance/records ``` Your output should resemble: ```text {"error_code":40301,"message":"Not authorized to access group: my_avro_consumer"} ``` 7. Create a role binding for the client permitting it access to the consumer group `my_avro_consumer`. Get the Kafka cluster ID: ```none KAFKA_CLUSTER_ID=$(curl -s https://localhost:8091/v1/metadata/id --tlsv1.2 --cacert scripts/security/snakeoil-ca-1.crt | jq -r ".id") ``` Create the role binding: ```text # Create the role binding for the group my_avro_consumer docker compose exec tools bash -c "confluent iam rbac role-binding create \ --principal User:appSA \ --role ResourceOwner \ --resource Group:my_avro_consumer \ --kafka-cluster-id $KAFKA_CLUSTER_ID" ``` 8. Again try to consume messages for `my_avro_consumer` subscriptions. It should fail due to a different authorization error. ```text # Note: Issue this command twice due to https://github.com/confluentinc/kafka-rest/issues/432 docker compose exec restproxy curl -X GET \ -H "Accept: application/vnd.kafka.avro.v2+json" \ --cert /etc/kafka/secrets/restproxy.certificate.pem \ --key /etc/kafka/secrets/restproxy.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ -u appSA:appSA \ https://restproxy:8086/consumers/my_avro_consumer/instances/my_consumer_instance/records docker compose exec restproxy curl -X GET \ -H "Accept: application/vnd.kafka.avro.v2+json" \ --cert /etc/kafka/secrets/restproxy.certificate.pem \ --key /etc/kafka/secrets/restproxy.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ -u appSA:appSA \ https://restproxy:8086/consumers/my_avro_consumer/instances/my_consumer_instance/records ``` Your output should resemble: ```JSON {"error_code":40301,"message":"Not authorized to access topics: [users]"} ``` 9. Create a role binding for the client permitting it access to the topic `users`. Get the Kafka cluster ID: ```none KAFKA_CLUSTER_ID=$(curl -s https://localhost:8091/v1/metadata/id --tlsv1.2 --cacert scripts/security/snakeoil-ca-1.crt | jq -r ".id") ``` Create the role binding: ```text # Create the role binding for the topic users docker compose exec tools bash -c "confluent iam rbac role-binding create \ --principal User:appSA \ --role DeveloperRead \ --resource Topic:users \ --kafka-cluster-id $KAFKA_CLUSTER_ID" ``` 10. Again try to consume messages for `my_avro_consumer` subscriptions. It should pass this time.
```text # Note: Issue this command twice due to https://github.com/confluentinc/kafka-rest/issues/432 docker compose exec restproxy curl -X GET \ -H "Accept: application/vnd.kafka.avro.v2+json" \ --cert /etc/kafka/secrets/restproxy.certificate.pem \ --key /etc/kafka/secrets/restproxy.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ -u appSA:appSA \ https://restproxy:8086/consumers/my_avro_consumer/instances/my_consumer_instance/records docker compose exec restproxy curl -X GET \ -H "Accept: application/vnd.kafka.avro.v2+json" \ --cert /etc/kafka/secrets/restproxy.certificate.pem \ --key /etc/kafka/secrets/restproxy.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ -u appSA:appSA \ https://restproxy:8086/consumers/my_avro_consumer/instances/my_consumer_instance/records ``` Your output should resemble: ```JSON [{"topic":"users","key":null,"value":{"userid":1,"username":"Bunny Smith"},"partition":1,"offset":0}] ``` 11. Delete the consumer instance `my_avro_consumer`. ```text docker compose exec restproxy curl -X DELETE \ -H "Content-Type: application/vnd.kafka.v2+json" \ --cert /etc/kafka/secrets/restproxy.certificate.pem \ --key /etc/kafka/secrets/restproxy.key \ --tlsv1.2 \ --cacert /etc/kafka/secrets/snakeoil-ca-1.crt \ -u appSA:appSA \ https://restproxy:8086/consumers/my_avro_consumer/instances/my_consumer_instance ``` ### Configure monitoring connection 1. Define the listener configuration for the monitoring interceptors: ```yaml kafka_connect_replicator_monitoring_interceptor_listener: ssl_enabled: true sasl_protocol: kerberos ``` 2. Define the basic monitoring configuration: ```yaml kafka_connect_replicator_monitoring_interceptor_bootstrap_servers: ``` 3. Define the security configuration for the monitoring connection. ```yaml kafka_connect_replicator_monitoring_interceptor_kerberos_principal: kafka_connect_replicator_monitoring_interceptor_kerberos_keytab_path: kafka_connect_replicator_monitoring_interceptor_ssl_ca_cert_path: kafka_connect_replicator_monitoring_interceptor_ssl_cert_path: kafka_connect_replicator_monitoring_interceptor_ssl_key_path: kafka_connect_replicator_monitoring_interceptor_ssl_key_password: ``` 4. For RBAC-enabled deployment, define additional custom properties for the monitoring connection. `kafka_connect_replicator_monitoring_interceptor` configs default to match `kafka_connect_replicator` configs. The following are required only if you are producing metrics to a different cluster than where you are storing your configs. Specify either the Kafka cluster id (`kafka_connect_replicator_monitoring_interceptor_kafka_cluster_id`) or the cluster name (`kafka_connect_replicator_monitoring_interceptor_kafka_cluster_name`). 
```yaml kafka_connect_replicator_monitoring_interceptor_rbac_enabled: true kafka_connect_replicator_monitoring_interceptor_erp_tls_enabled: kafka_connect_replicator_monitoring_interceptor_erp_host: kafka_connect_replicator_monitoring_interceptor_erp_admin_user: kafka_connect_replicator_monitoring_interceptor_erp_admin_password: password kafka_connect_replicator_monitoring_interceptor_kafka_cluster_id: kafka_connect_replicator_monitoring_interceptor_kafka_cluster_name: kafka_connect_replicator_monitoring_interceptor_erp_pem_file: ``` ### Connect to Confluent Cloud Schema Registry To enable components to connect to Confluent Cloud Schema Registry, get the Schema Registry URL, the api key, and the secret, and set the following variables in the `hosts.yml` file: * `ccloud_schema_registry_enabled` * `ccloud_schema_registry_url` * `ccloud_schema_registry_key` * `ccloud_schema_registry_secret` For example: ```yaml all: vars: ccloud_schema_registry_enabled: true ccloud_schema_registry_url: https://psrc-zzzzz.europe-west3.gcp.confluent.cloud ccloud_schema_registry_key: AAAAAAAAAAAAAAAA ccloud_schema_registry_secret: bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb ``` See a sample inventory file for Confluent Cloud Kafka and Schema Registry configuration at the following location: ```bash https://github.com/confluentinc/cp-ansible/blob/8.1.0-post/docs/sample_inventories/ccloud.yml ``` ### Configure SASL/GSSAPI (Kerberos) authentication The Ansible playbook does not currently configure Key Distribution Center (KDC) and Active Directory KDC configurations. You must set up your own KDC independently of the playbook and provide your own keytabs to configure SASL/GSSAPI (SASL with Kerberos): * Create principals within your organization’s Kerberos KDC server for each component and for each host in each component. * Generate keytabs for these principals. The keytab files must be present on the Ansible control node. To install Kerberos packages and configure the client configuration file on each host, add the following configuration parameters in the `hosts.yaml` file. * Specify whether to install Kerberos packages and to configure the client configuration file. The default value is `true`. If the hosts already have the client configuration file configured, set `kerberos_configure` to `false`. ```yaml all: vars: kerberos_configure: ``` * Specify the client configuration file. The default value is `/etc/krb5.conf`. Use this variable only when you want to specify a custom location of the client configuration file. ```yaml all: vars: kerberos_client_config_file_dest: ``` If `kerberos_configure` is set to `true`, Confluent Ansible will generate the client config file at this location on the host nodes. If `kerberos_configure` is set to `false`, Confluent Ansible will expect the client configuration file to be present at this location on the host nodes. * Specify the *realm* part of the Kafka broker Kerberos principal and the hostname of machine with KDC running. ```yaml all: vars: kerberos: realm: kdc_hostname: admin_hostname: ``` The example below shows the Kerberos configuration settings for the Kerberos principal, `kafka/kafka1.hostname.com@EXAMPLE.COM`. ```yaml all: vars: kerberos_configure: true kerberos: realm: example.com kdc_hostname: ip-192-24-45-82.us-west.compute.internal admin_hostname: ip-192-24-45-82.us-west.compute.internal ``` Each host in the inventory file also needs to set variables that define their Kerberos principal and the location of the keytab on the Ansible controller. 
The `hosts.yml` inventory file should look like: ```yaml kafka_controller: hosts: ip-192-24-34-224.us-west.compute.internal: kafka_controller_kerberos_keytab_path: /tmp/keytabs/kafka-ip-192-24-34-224.us-west.compute.internal.keytab kafka_controller_kerberos_principal: kafka/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ip-192-24-37-15.us-west.compute.internal: kafka_controller_kerberos_keytab_path: /tmp/keytabs/kafka-ip-192-24-34-224.us-west.compute.internal.keytab kafka_controller_kerberos_principal: kafka/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ip-192-24-34-224.us-west.compute.internal: kafka_controller_kerberos_keytab_path: /tmp/keytabs/kafka-ip-192-24-34-224.us-west.compute.internal.keytab kafka_controller_kerberos_principal: kafka/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ``` ```yaml kafka_broker: hosts: ip-192-24-34-224.us-west.compute.internal: kafka_broker_kerberos_keytab_path: /tmp/keytabs/kafka-ip-192-24-34-224.us-west.compute.internal.keytab kafka_broker_kerberos_principal: kafka/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ip-192-24-37-15.us-west.compute.internal: kafka_broker_kerberos_keytab_path: /tmp/keytabs/kafka-ip-192-24-34-224.us-west.compute.internal.keytab kafka_broker_kerberos_principal: kafka/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ip-192-24-34-224.us-west.compute.internal: kafka_broker_kerberos_keytab_path: /tmp/keytabs/kafka-ip-192-24-34-224.us-west.compute.internal.keytab kafka_broker_kerberos_principal: kafka/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ``` ```yaml schema_registry: hosts: ip-192-24-34-224.us-west.compute.internal: schema_registry_kerberos_keytab_path: /tmp/keytabs/schemaregistry-ip-192-24-34-224.us-west.compute.internal.keytab schema_registry_kerberos_principal: schemaregistry/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ``` ```yaml kafka_connect: hosts: ip-192-24-34-224.us-west.compute.internal: kafka_connect_kerberos_keytab_path: /tmp/keytabs/connect-ip-192-24-34-224.us-west.compute.internal.keytab kafka_connect_kerberos_principal: connect/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ``` ```yaml kafka_rest: hosts: ip-192-24-34-224.us-west.compute.internal: kafka_rest_kerberos_keytab_path: /tmp/keytabs/restproxy-ip-192-24-34-224.us-west.compute.internal.keytab kafka_rest_kerberos_principal: restproxy/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ``` ```yaml ksql: hosts: ip-192-24-34-224.us-west.compute.internal: ksql_kerberos_keytab_path: /tmp/keytabs/ksql-ip-192-24-34-224.us-west.compute.internal.keytab ksql_kerberos_principal: ksql/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ``` ```yaml control_center_next_gen: hosts: ip-192-24-34-224.us-west.compute.internal: control_center_next_gen_kerberos_keytab_path: /tmp/keytabs/controlcenter-ip-192-24-34-224.us-west.compute.internal.keytab control_center_next_gen_kerberos_principal: controlcenter/ip-192-24-34-224.us-west.compute.internal@REALM.EXAMPLE.COM ``` ### Configure single sign-on authentication for Confluent Control Center and Confluent CLI In Confluent Ansible, you can configure single sign-on (SSO) authentication for Control Center using OpenID Connect (OIDC). As a prerequisite for SSO, you need to configure: * [An OIDC-compliant identity provider (IdP)](https://docs.confluent.io/platform/current/security/authentication/sso-for-c3/configure-sso-using-oidc.html#step-1-establish-a-trust-relationship-between-cp-and-identity-provider). 
For RBAC with mTLS, you can use the [file-based authentication](ansible-authorize.md#ansible-file-based-authentication) without an IdP. * [The MDS](https://docs.confluent.io/ansible/current/ansible-authorize.html#role-based-access-control). For SSO, RBAC needs to be enabled, and RBAC requires MDS. To use SSO in Control Center or Confluent CLI, specify the following variables in your inventory file. For details on these variables, refer to [Configure SSO for Confluent Control Center using OIDC](https://docs.confluent.io/platform/current/control-center/security/sso/configure-sso-using-oidc.html). * `sso_mode` To enable SSO, set to `oidc`. * `sso_groups_claim` The groups claim in JSON Web Tokens (JWT). Default: `groups` * `sso_sub_claim` The sub claim in JWT. Default: `sub` * `sso_issuer_url` The issuer URL, which is typically the authorization server’s URL. This value is compared to the issuer claim in the JWT token for verification. * `sso_jwks_uri` The JSON Web Key Set (JWKS) URI. It is used to verify any JSON Web Token (JWT) issued by the IdP. * `sso_authorize_uri` The base URI for the authorize endpoint, which initiates an OAuth authorization request. * `sso_token_uri` The IdP token endpoint, from which MDS requests a token. * `sso_client_id` The client ID for authorization and token requests to the IdP. * `sso_client_password` The client password for authorization and token requests to the IdP. * `sso_groups_scope` Optional. The name of the custom groups scope. Use this setting to handle a case where the `groups` field is not present in tokens by default, and you have configured a custom scope for issuing groups. The name of the scope could be anything, such as `groups`, `allow_groups`, `offline_access`, etc. `offline_access` is a well-defined scope used to request a refresh token. This scope can be requested when the `sso_refresh_token` setting is set to `true`. The scope is defined in the OIDC RFC and is not specific to any IdP. Possible values: `groups`, `openid`, `offline_access`, etc. Default: `groups` * `sso_refresh_token` Configures whether the `offline_access` scope can be requested in the authorization URI. Set this to `false` if offline tokens are not allowed for the user or client in the IdP. As described in [SSO Session management](https://docs.confluent.io/platform/current/control-center/security/sso/configure-sso-using-oidc.html#step-3-customize-additional-security-and-usability), for RBAC to work as expected, the default value of `true` should not be changed to `false`. Default: `true` * `sso_cli` To enable SSO in Confluent CLI, set it to `true`. When enabling SSO in the CLI, you must also provide `sso_device_authorization_uri`. Default: `false` * `sso_device_authorization_uri` The device authorization endpoint of the IdP. Required to enable SSO in Confluent CLI. * `sso_idp_cert_path` TLS certificate (full path of file on the control node) of the IdP domain for OIDC SSO in Control Center or Confluent CLI. Required when the IdP server has TLS enabled with a custom certificate.
The following is an example snippet of an inventory file for setting up Confluent Platform with RBAC, SASL/PLAIN protocol, and Control Center SSO: ```yaml all: vars: ansible_connection: ssh ansible_user: ec2-user ansible_become: true ansible_ssh_private_key_file: /home/ec2-user/guest.pem ## TLS Configuration - Custom Certificates ssl_enabled: true #### SASL Authentication Configuration #### sasl_protocol: plain ## RBAC Configuration rbac_enabled: true ## LDAP CONFIGURATION kafka_broker_custom_properties: ldap.java.naming.factory.initial: com.sun.jndi.ldap.LdapCtxFactory ldap.com.sun.jndi.ldap.read.timeout: 3000 ldap.java.naming.provider.url: ldaps://ldap1:636 ldap.java.naming.security.principal: uid=mds,OU=rbac,DC=example,DC=com ldap.java.naming.security.credentials: password ldap.java.naming.security.authentication: simple ldap.user.search.base: OU=rbac,DC=example,DC=com ldap.group.search.base: OU=rbac,DC=example,DC=com ldap.user.name.attribute: uid ldap.user.memberof.attribute.pattern: CN=(.*),OU=rbac,DC=example,DC=com ldap.group.name.attribute: cn ldap.group.member.attribute.pattern: CN=(.*),OU=rbac,DC=example,DC=com ldap.user.object.class: account ## LDAP USERS mds_super_user: mds mds_super_user_password: password kafka_broker_ldap_user: kafka_broker kafka_broker_ldap_password: password schema_registry_ldap_user: schema_registry schema_registry_ldap_password: password kafka_connect_ldap_user: connect_worker kafka_connect_ldap_password: password ksql_ldap_user: ksql ksql_ldap_password: password kafka_rest_ldap_user: rest_proxy kafka_rest_ldap_password: password control_center_next_gen_ldap_user: control_center control_center_next_gen_ldap_password: password ## Variables to enable SSO in Control Center sso_mode: oidc # necessary configs in MDS server for sso in C3 sso_groups_claim: groups sso_sub_claim: sub sso_groups_scope: groups sso_issuer_url: sso_jwks_uri: sso_authorize_uri: sso_token_uri: sso_client_id: sso_client_password: sso_refresh_token: true kafka_controller: hosts: demo-controller-0: demo-controller-1: demo-controller-2: kafka_broker: hosts: demo-broker-0: demo-broker-1: demo-broker-2: schema_registry: hosts: demo-sr-0: kafka_connect: hosts: demo-connect-0: kafka_rest: hosts: demo-rest-0: ksql: hosts: demo-ksql-0: control_center_next_gen: hosts: demo-c3-0: ``` # Specify exporter for switchover sr_switch_over_exporter_name: "cp-to-cc-exporter" password_encoder_secret: ``` **Sync Schemas to a specific context:** 1. Export the default context in Confluent Platform to the `site1` context in Confluent Cloud, and import all schemas in the `site1` context in Confluent Cloud to the default context in Confluent Platform. ```yaml # When using contexts unified_stream_manager: schema_registry_endpoint: "https://psrc-xyz.us-east-1.aws.confluent.cloud" authentication_type: basic basic_username: "your-cc-api-key" basic_password: "your-cc-api-secret" remote_context: "site1" schema_exporters: - name: "production-exporter" subjects: ["*"] context_type: "CUSTOM" context: "site1" schema_importers: - name: "production-importer" subjects: [":.site1:*"] context: "." sr_switch_over_exporter_name: "production-exporter" password_encoder_secret: ``` 2. Export the `corp` context in Confluent Platform to the `site1` context in Confluent Cloud, and import all schemas in the `site1.corp` context in Confluent Cloud to the `corp` context in Confluent Platform. 
```yaml # When using contexts unified_stream_manager: schema_registry_endpoint: "https://psrc-xyz.us-east-1.aws.confluent.cloud" authentication_type: basic basic_username: "your-cc-api-key" basic_password: "your-cc-api-secret" remote_context: "site1" schema_exporters: - name: "production-exporter-2" subjects: [":.corp:*"] context_type: "CUSTOM" context: "site1" schema_importers: - name: "production-importer-2" subjects: [":.site1.corp:*"] context: "site1" sr_switch_over_exporter_name: "production-exporter-2" password_encoder_secret: ``` ## Upgrade notes Before you start the upgrade process, review the following changes and make any necessary updates. * ZooKeeper removal in Confluent Platform 8.0 ZooKeeper was removed in Confluent Platform 8.0 and is no longer supported in that version. Follow the steps in [Upgrade ZooKeeper-based Confluent Platform deployment](#ansible-upgrade-zk) to migrate your ZooKeeper-based deployment to KRaft before you upgrade to Confluent Platform 8.0. * Upgrade Control Center from 2.0 or 2.1 to 2.2 in Confluent Ansible 8.0 Confluent Platform 8.0 does not work with Control Center 2.0 or 2.1. So, when upgrading Confluent Platform to 8.0 and Control Center to 2.2, you must upgrade Control Center before upgrading Kafka. This dependency is only specific to when you upgrade to Confluent Platform 8.0. * Upgrade Confluent Control Center (Legacy) alerts to Control Center in Confluent Ansible 8.0 Starting in the 8.0 release, Confluent Ansible and Confluent Platform no longer support Confluent Control Center (Legacy). If you have alerts you need to migrate from Confluent Control Center (Legacy) to Control Center, before you upgrade your Confluent Platform deployment to 8.0, upgrade the Confluent Platform to 7.9.1 and migrate your alerts as described in [Control Center Alert Migration](https://docs.confluent.io/control-center/current/installation/alert-migrate.html). Your Confluent Control Center (Legacy) and Control Center must be configured with the same Kafka bootstrap endpoint to point to the same Kafka cluster for alert migration. Upgrading Confluent Control Center (Legacy) to Control Center or upgrading Confluent Control Center (Legacy) metrics to Control Center is not supported. * Upgrade Log4j to Log4j 2 in Confluent Ansible 8.0 Starting in the 8.0 release, Confluent Ansible and Confluent Platform only support Log4j 2. When upgrading from Confluent Ansible 7.x to 8.x, the custom Log4j configurations on your 7.x cluster are not automatically converted to Log4j 2 configurations. You need to explicitly define the variables for Log4j 2 as described in [Configure Log4j 2](ansible-configure.md#ansible-log4j2). In 8.x, by default, Confluent Ansible sets up Log4j 2 with the default values mentioned in [VARIABLES.md](https://github.com/confluentinc/cp-ansible/blob/master/docs/VARIABLES.md). * SASL/SCRAM default version The default SASL/SCRAM version was changed from 256 to 512. If the version of SASL/SCRAM is specified as 256 in your `server.properties`, you must update your inventory and change `sasl_protocol: scram` to `sasl_protocol: scram256`. * Enable Admin REST APIs When upgrading from 5.5.x to 6.2.x, you must enable Admin REST APIs by setting the following property in your inventory file.
If Admin REST APIs is not enabled, component upgrades will fail: ```yaml kafka_broker_rest_proxy_enabled: true ``` * Disable canonicalization If canonicalization has not been enabled during the Confluent Platform cluster creation, explicitly set the following property in the `hosts.yml` inventory file. ```yaml kerberos: canonicalize: false ``` * Variable name updates in `hosts.yaml` Misspelled variable names were corrected in version `7.2.2`. If you are upgrading from a version earlier than `7.2.2` to version `7.2.2` or later, make the following updates in your inventory file: * From: `kakfa_connect_replicator_` * To: `kafka_connect_replicator_` ## Vector search with Pinecone The following example assumes a Pinecone API key as shown in [Pinecone Quick Start](https://docs.pinecone.io/guides/get-started/quickstart) and an OpenAI connection as shown in [Connection resource](../../flink/reference/statements/create-model.md#flink-sql-create-model-connection-resource). - Follow this [Pinecone notebook](https://colab.research.google.com/github/pinecone-io/examples/blob/master/docs/semantic-search.ipynb) to create an index of LangChain docs. This example shows the following steps: 1. Run the following command to create a connection resource named `azureopenai_connection` that uses your Azure API key. ```sql CREATE CONNECTION azureopenai_connection WITH ( 'type' = 'azureopenai', 'endpoint' = '<endpoint>', 'api-key' = '<api-key>' ); ``` 2. Run the following command to create a connection resource named `pinecone_connection` that uses your Pinecone credentials. ```sql CREATE CONNECTION pinecone_connection WITH ( 'type' = 'pinecone', 'endpoint' = '<endpoint>', 'api-key' = '<api-key>' ); ``` 3. Run the following statements to create the tables. ```sql CREATE TABLE text_input (input STRING); CREATE TABLE embedding_output (question STRING, embedding ARRAY<FLOAT>); -- Create the search table. CREATE TABLE pinecone ( text STRING, embeddings ARRAY<FLOAT> ) WITH ( 'connector' = 'pinecone', 'pinecone.connection' = 'pinecone_connection' ); ``` 4. Run the following statements to create and run the embedding model. ```sql -- Create the embedding model. CREATE MODEL openaiembed INPUT (input STRING) OUTPUT (embedding ARRAY<FLOAT>) WITH ( 'task' = 'embedding', 'provider' = 'azureopenai', 'azureopenai.input_format'='OPENAI-EMBED', 'azureopenai.connection' = 'azureopenai_connection' ); -- Insert testing data. INSERT INTO embedding_output SELECT * FROM text_input, LATERAL TABLE(ML_PREDICT('openaiembed', input)); INSERT INTO text_input VALUES ('what is LangChain?'), ('how do I use the LLMChain in LangChain?'), ('what is a pipeline in LangChain?'), ('how to partially format prompt templates'); ``` 5. Run the following statements to execute the vector search. ```sql -- Run the vector search. SELECT * FROM embedding_output, LATERAL TABLE(VECTOR_SEARCH_AGG(pinecone, DESCRIPTOR(embeddings), embedding, 3)); -- Or flatten the result. CREATE TABLE pinecone_result AS SELECT * FROM embedding_output, LATERAL TABLE(VECTOR_SEARCH_AGG(pinecone, DESCRIPTOR(embeddings), embedding, 3)); SELECT * FROM pinecone_result CROSS JOIN UNNEST(search_results) AS T(text, embeddings, score); ``` ## Do I get charged for internal topics created by Kafka Streams or ksqlDB? For Basic and Standard clusters using the legacy billing model, partitions for internal topics, prefixed with an underscore `_`, created by Confluent components like ksqlDB and Kafka Streams do count toward partition billing. However, topics internal to Kafka itself, like consumer offsets, do not count.
For more information, see [Partitions](overview.md#partition-billing). ### Partitions Confluent Cloud does not charge for partitions on any type of Kafka cluster, but the number of partitions you use can have an impact on eCKU usage. To determine eCKU limits for partitions, Confluent Cloud counts only pre-replication (leader) partitions across a cluster. For more information, see [eCKU/CKU comparison](../clusters/cluster-types.md#ecku-comparison-table).
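As an illustration with hypothetical numbers: a topic created with 30 partitions and a replication factor of 3 has 30 leader partitions and 60 follower replicas, and only the 30 leader partitions count toward the eCKU partition limit.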
**Legacy partition billing for Basic and Standard clusters** Confluent Cloud charges for partitions on Basic and Standard clusters. You are charged for the number of unique partitions that exist on your cluster during a given hour. - Basic clusters receive 10 partitions free of charge. - Standard clusters receive 500 partitions free of charge. - Enterprise clusters have no partition-based charges. - Dedicated clusters have no partition-based charges. For billing purposes, partitions for topics that you create and partitions for internal topics are counted. Internal topics are topics that are automatically created by Confluent components such as ksqlDB, Kafka Streams, and Connect, and prefixed with an underscore (`_`). Partitions for topics that are internal to Kafka itself and are not visible in the Cloud Console, such as consumer offsets, do not count against partition limits or toward partition billing.
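For example, using hypothetical numbers: a Standard cluster with 800 unique partitions during a given hour is billed for 300 partitions for that hour, because the first 500 partitions are free of charge.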
### Common properties The following table provides several common configuration properties for Producers and Consumers that you can review for potential modification. | Configuration property | Java default | librdkafka default | Notes | |------------------------------------------|-------------------|------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | `client.id` | empty string | rdkafka | You should set the `client.id` to something meaningful in your application, especially if you are running multiple clients or want to easily trace logs or activities to specific client instances. | | `connections.max.idle.ms` | 540000 ms (9 min) | See librdkafka `socket.timeout.ms` | You can change this when an intermediate load balancer disconnects idle connections after inactivity. For example: AWS 350 seconds, Azure 4 minutes, Google Cloud 10 minutes. | | `sasl.kerberos.service.name` | null | kafka | Changing the default service name will cause issues for those who don’t have it configured. | | `socket.connection.setup.timeout.max.ms` | 30000 ms (30 sec) | not available | librdkafka doesn’t have exponential backoff for this timeout. | | `socket.connection.setup.timeout.ms` | 10000 ms (10 sec) | 30000 ms (30 sec) | librdkafka doesn’t have exponential backoff for this timeout. | | `metadata.max.age.ms` | 300000 ms (5 min) | 900000 ms (15 min) | librdkafka has the `topic.metadata.refresh.interval.ms` property that defaults to 300000 milliseconds (5 minutes). | | `reconnect.backoff.max.ms` | 1000 ms (1 sec) | 10000 ms (10 sec) | | | `reconnect.backoff.ms` | 50 ms | 100 ms | | | `max.in.flight.requests.per.connection` | 5 | 1000000 | librdkafka produces to a single partition per batch, setting it to 5 limits producing to 5 partitions per broker. | ## Features All clusters have the following features: - [Kafka ACLs](../security/access-control/acls/overview.md#acl-manage) - [Fully-managed replica placement](resilience.md#confluent-cloud-resilience) - [User interface to manage consumer lag](../monitoring/monitor-lag.md#cloud-monitoring-lag) - [Topic management](../topics/overview.md#cloud-topics-manage) - [Fully-Managed Connectors](../connectors/overview.md#kafka-connect-cloud) - [View and consume Connect logs](../connectors/logging-cloud-connectors.md#ccloud-connector-logging) - [Stream Governance](../stream-governance/index.md#cloud-dg) - [Stream Catalog](../stream-governance/stream-catalog.md#cloud-stream-catalog) - [Stream Lineage](../stream-governance/stream-lineage.md#cloud-stream-lineage) - [Encryption-at-rest](https://confluent.safebase.us/?itemUid=ef061e5b-a2f4-469e-92bc-ab973e3d7842&source=title) - [TLS for data in transit](../security/encrypt/tls.md#manage-data-in-transit-with-tls) - [Role-based Access Control (RBAC)](../security/access-control/rbac/overview.md#cloud-rbac) (Basic clusters do not support RBAC roles for resources within the Kafka cluster) ### Feature comparison table The tables below offer comparisons of the features supported by only some Kafka cluster types. 
| Feature | [Basic](#basic-cluster) | [Standard](#standard-cluster) | [Enterprise](#enterprise-cluster) | [Dedicated](#dedicated-cluster) | [Freight](#freight-cluster) | |---------------------------------------------------------------------------------------------------------------------|---------------------------|---------------------------------|-------------------------------------|----------------------------------------------------------|-------------------------------| | [Exactly Once Semantics](/platform/current/streams/concepts.html#streams-concepts-processing-guarantees) | Yes | Yes | Yes | Yes | No | | [Key based compacted storage](/platform/current/kafka/design.html#log-compaction) | Yes | Yes | Yes | Yes | No | | [Custom Connectors](../connectors/bring-your-connector/overview.md#cc-bring-your-connector) | Yes | Yes | No | Yes | No | | [Flink](../flink/overview.md#ccloud-flink) | Yes | Yes | Yes | Yes | No | | [ksqlDB](../ksqldb/overview.md#cloud-ksqldb-create-stream-processing-apps) | Yes | Yes | No | Yes | No | | [Public networking](../networking/overview.md#cloud-networking-support-public) | Yes | Yes | No | Yes | No | | [Private networking](../networking/overview.md#cloud-networking-support-public) | No | No | Yes | Yes | Yes | | [OAuth](../security/authenticate/workload-identities/identity-providers/oauth/overview.md#oauth-overview) | No | Yes | Yes | Yes | Yes | | [Mutual TLS (mTLS)](../security/authenticate/workload-identities/identity-providers/mtls/overview.md#mtls-overview) | No | No | No | Yes | No | | [Audit logs](../monitoring/audit-logging/cloud-audit-log-concepts.md#cloud-audit-logs) | No | Yes | Yes | Yes | Yes | | [Self-managed encryption keys](../security/encrypt/byok/overview.md#byok-encrypted-clusters) | No | No | Yes | Yes | No | | [Automatic Elastic scaling](../billing/overview.md#e-cku-definition) | Yes | Yes | Yes | No | Yes | | [Stream Sharing](../stream-sharing/index.md#cloud-data-sharing) | Yes | Yes | No | Yes but all private networking options are not supported | No | | [Client Quotas](client-quotas.md#client-quotas) | No | No | No | Yes | No | | [Access Transparency](../monitoring/audit-logging/access-transparency-overview.md#access-transparency-overview) | No | No | No | Yes | No | ## Replicator Deployment Options Migrating topic data is achieved by running Replicator in one of three modes. They are functionally equivalent, but you might prefer one over the other based on your starting point. | Replicator Mode | Advantages and Scenarios | |---------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | As a connector within a distributed Connect cluster (on a VM) | Ideal if you already have a Connect cluster in use with the destination cluster. | | As a packaged executable on a VM | Isolates three easy-to-use config files (for replicator, consumer, producer), and avoids having to explicitly configure the Connect cluster. The [Quick Start](replicator-cloud-quickstart.md#cloud-replicator-quickstart) walks through an example of running Replicator as this type of executable. | | As a packaged executable on Kubernetes | Similar to the above, but might be easier to start as a single isolated task. 
Ideal if you are already managing tasks within Kubernetes. | #### Configure properties There are three config files for the executable (consumer, producer, and replication), and the minimal configuration changes for these are shown below. * `consumer.properties` ```none ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN bootstrap.servers= retry.backoff.ms=500 sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; security.protocol=SASL_SSL ``` * `producer.properties` ```none ssl.endpoint.identification.algorithm=https sasl.mechanism=PLAIN bootstrap.servers= sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="" password=""; security.protocol=SASL_SSL ``` * `replication.properties` - No special configuration is required in `replication.properties`. ### Replicator - [Replicator Quick Start to Migrate Topic Data on Confluent Cloud](replicator-cloud-quickstart.md#cloud-replicator-quickstart) - [Confluent Replicator to Confluent Cloud Configurations](/platform/current/tutorials/examples/ccloud/docs/replicator-to-cloud-configuration-types.html) - [On-Premises to Confluent Cloud example](/platform/current/tutorials/cp-demo/docs/index.html) - [Multi-DC Deployment Architectures](/platform/current/multi-dc-deployments/index.html) - [Replicator for Multi-Datacenter Replication](/platform/current/multi-dc-deployments/replicator/index.html) - [Tutorial: Replicating Data Between Clusters](/platform/current/multi-dc-deployments/replicator/replicator-quickstart.html#replicator-quickstart) - [Configure and Run Replicator](/platform/current/multi-dc-deployments/replicator/replicator-run.html#replicator-run) - [Disaster Recovery for Multi-Datacenter Apache Kafka Deployments](https://www.confluent.io/white-paper/disaster-recovery-for-multi-datacenter-apache-kafka-deployments/). #### Create a custom connector Use the following command to create a custom connector. Command syntax: ```bash confluent connect cluster create [flags] ``` For example: ```bash confluent connect cluster create --config-file connector-config.json --cluster lkc-abcd123 --environment env-a12b34 ``` Example output: ```bash +------+---------------------+ | ID | clcc-wzxp69 | | Name | my-custom-connector | +------+---------------------+ ``` Note that the ID of a Custom Connector starts with a prefix of `clcc` and the ID of a Managed Connector starts with a prefix of `lcc`. The JSON payload file consists of the following configuration properties. Note that the connector name used in both instances of `"name"` in the payload must be consistent. ```json { "name": "my-custom-connector", "config": { "name": "my-custom-connector", "kafka.api.key": "********", "kafka.api.secret": "********", "confluent.connector.type": "CUSTOM", "confluent.custom.plugin.id": "custom-plugin-l65664", "tasks.max": "1", "interval.ms": "10000", "kafka.topic": "my-kafka-topic" } } ``` #### Update a custom connector configuration Use the following command to update a custom connector configuration. You use a JSON payload file that contains all the configuration properties used to create the original connector, with any changes needed for the update. In the example JSON used in this example, the `tasks.max` is updated from `1` to `2`. 
Command syntax:

```bash
confluent connect cluster update [flags]
```

For example:

```bash
confluent connect cluster update clcc-wzxp69 --config-file connector-config.json --cluster lkc-abcd123 --environment env-a12b34
```

Example output:

```bash
Updated connector "clcc-wzxp69"
```

The JSON payload file consists of the following configuration properties. Note that the connector name used in both instances of `"name"` in the payload must be consistent.

```json
{
  "name": "my-custom-connector",
  "config": {
    "name": "my-custom-connector",
    "kafka.api.key": "********",
    "kafka.api.secret": "********",
    "confluent.connector.type": "CUSTOM",
    "confluent.custom.plugin.id": "custom-plugin-l65664",
    "tasks.max": "2",
    "interval.ms": "10000",
    "kafka.topic": "my-kafka-topic"
  }
}
```

### Export log messages

The connector stores log messages in a Kafka topic. You can export log data using any of the following options:

* Export logs using a Confluent connector: For example, the [Elasticsearch Service Sink connector for Confluent Cloud](../cc-elasticsearch-service-sink.md#cc-elasticsearch-service-sink) or the [Elasticsearch Service Sink connector for Confluent Platform](https://docs.confluent.io/kafka-connectors/elasticsearch/current/overview.html) can export logs to Elasticsearch. Several other connectors can also be used to export logs.
* Create a custom integration using the [Kafka REST API for topics](https://docs.confluent.io/cloud/current/api.html#tag/Topic-(v3)) to deliver log messages to a destination logging service.

To manually configure a destination service to capture logs, you need the following:

* Bootstrap server endpoint: This is provided on the **Cluster Settings** page. For example, `pkc-abc123..aws.confluent.cloud:9092`. You can also get this information using the following Confluent CLI command:

```bash
confluent kafka cluster describe
```

* Log topic name: Get this from the topics page. For example, `clcc--app-logs`. You can also get this information using the following Confluent CLI command:

```bash
confluent kafka topic list
```

This information is also provided in the UI in **Cluster settings**.

![View cluster settings](images/ccloud-byoc-log-cluster-settings.png)

#### NOTE

Configuration properties that are not shown in the Cloud Console use the default values. See [Configuration Properties](#cc-alloydb-sink-config-properties) for all property values and definitions.

1. Select an **Input Kafka record value format** (data coming from the Kafka topic): AVRO, JSON_SR (JSON Schema), or PROTOBUF. A valid schema must be available in [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) to use a schema-based message format.
2. Select an **insert mode** to use:
   - `INSERT`: Use the standard `INSERT` row function. An error occurs if the row already exists in the table.
   - `UPSERT`: This mode is similar to `INSERT`. However, if the row already exists, the `UPSERT` function overwrites column values with the new values provided.

### **Show advanced configurations**

- **Schema context**: Select a schema context to use for this connector, if using a schema-based data format. This property defaults to the **Default** context, which configures the connector to use the default schema set up for Schema Registry in your Confluent Cloud environment. A schema context allows you to use separate schemas (like schema sub-registries) tied to topics in different Kafka clusters that share the same Schema Registry environment.
For example, if you select a non-default context, a **Source** connector uses only that schema context to register a schema, and a **Sink** connector reads only from that schema context. For more information about setting up a schema context, see [What are schema contexts and when should you use them?](../sr/faqs-cc.md#faq-schema-contexts).
- **Auto create table**: Whether to automatically create the destination table if it is missing.
- **Auto add columns**: Whether to automatically add columns in the table if they are missing.

#### NOTE

Auto create tables and Auto add columns are optional. These properties set whether to automatically create tables or columns if they are missing relative to the input record schema. If not used, both default to `false`. When Auto create tables is set to `true`, the connector creates a table name using `${topic}` (that is, the Kafka topic name). For more information, see [Table names and Kafka topic names](#cc-alloydb-sink-truncation-behavior) and the [AlloyDB Sink configuration properties](#cc-alloydb-sink-config-properties).

- **Database timezone**: Name of the timezone used in the connector when querying with time-based criteria. Defaults to `UTC`.
- **Table name format**: A format string for the destination table name, which may contain `${topic}` as a placeholder for the originating topic name.
- **Table types**: The comma-separated types of database tables to which the sink connector can write.
- **Fields included**: List of comma-separated record value field names. If empty, all fields from the record value are used.
- **PK mode**: The primary key mode. Options are:
  - `kafka`: Kafka coordinates are used as the primary key. Must be used with the **PK Fields** property.
  - `none`: No primary keys are used.
  - `record_key`: Fields from the record key are used. May be a primitive or a struct.
  - `record_value`: Fields from the Kafka record value are used. Must be a struct type.
- **PK Fields**: List of comma-separated primary key field names. Options are:
  - `kafka`: Must be three values representing the Kafka coordinates. If left empty, the coordinates default to `__connect_topic,__connect_partition,__connect_offset`.
  - `none`: PK Fields are not used.
  - `record_key`: If left empty, all fields from the key struct are used. Otherwise, the fields listed in this property are extracted from the key struct. A single field name must be configured for a primitive key.
  - `record_value`: Used to extract fields from the record value. If left empty, all fields from the value struct are used.
- **When to quote SQL identifiers**: When to quote table names, column names, and other identifiers in SQL statements.
- **Max rows per batch**: Maximum number of rows to include in a single batch when polling for new data. This setting can be used to limit the amount of data buffered internally in the connector.
- **Input Kafka record key format**: Sets the input Kafka record key format. This must be set to an appropriate format if `pk.mode=record_key` is used. Valid entries are AVRO, JSON_SR, PROTOBUF, or STRING. Note that you must have Confluent Cloud Schema Registry configured if using a schema-based message format such as AVRO, JSON_SR, or PROTOBUF.
- **Delete on null**: Whether to treat null record values as deletes. Requires `pk.mode` to be `record_key`.

**Auto-restart policy**

- **Enable Connector Auto-restart**: Control the auto-restart behavior of the connector and its task in the event of user-actionable errors.
Defaults to `true`, enabling the connector to automatically restart in case of user-actionable errors. Set this property to `false` to disable auto-restart for failed connectors. In such cases, you would need to manually restart the connector. **Consumer configuration** - **Max poll interval(ms)**: Set the maximum delay between subsequent consume requests to Kafka. Use this property to improve connector performance in cases when the connector cannot send records to the sink system. The default is 300,000 milliseconds (5 minutes). - **Max poll records**: Set the maximum number of records to consume from Kafka in a single request. Use this property to improve connector performance in cases when the connector cannot send records to the sink system. The default is 500 records. **Transforms** - **Single Message Transforms**: To add a new SMT, see [Add transforms](single-message-transforms.md#cc-single-message-transforms-ui). For more information about unsupported SMTs, see [Unsupported transformations](single-message-transforms.md#cc-single-message-transforms-unsupported-transforms). See [Configuration Properties](#cc-alloydb-sink-config-properties) for all property values and definitions. 3. Click **Continue**. #### NOTE Configuration properties that are not shown in the Cloud Console use the default values. See [Configuration Properties](#cc-amazon-cloudwatch-metrics-sink-config-properties) for all property values and definitions. 1. Select the **Input Kafka record value** format (data coming from the Kafka topic): AVRO, JSON_SR, or PROTOBUF. A valid schema must be available in [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) to use a schema-based message format. ### **Show advanced configurations** - **Schema context**: Select a schema context to use for this connector, if using a schema-based data format. This property defaults to the **Default** context, which configures the connector to use the default schema set up for Schema Registry in your Confluent Cloud environment. A schema context allows you to use separate schemas (like schema sub-registries) tied to topics in different Kafka clusters that share the same Schema Registry environment. For example, if you select a non-default context, a **Source** connector uses only that schema context to register a schema and a **Sink** connector uses only that schema context to read from. For more information about setting up a schema context, see [What are schema contexts and when should you use them?](../sr/faqs-cc.md#faq-schema-contexts). - **Behavior on malformed metric**: The connector’s behavior if the Kafka record does not contain an expected field. Valid options are `LOG` and `FAIL`. `LOG` will log and skip the malformed records, and `FAIL` will fail the connector. **Auto-restart policy** - **Enable Connector Auto-restart**: Control the auto-restart behavior of the connector and its task in the event of user-actionable errors. Defaults to `true`, enabling the connector to automatically restart in case of user-actionable errors. Set this property to `false` to disable auto-restart for failed connectors. In such cases, you would need to manually restart the connector. **Consumer configuration** - **Max poll interval(ms)**: Set the maximum delay between subsequent consume requests to Kafka. Use this property to improve connector performance in cases when the connector cannot send records to the sink system. The default is 300,000 milliseconds (5 minutes). 
- **Max poll records**: Set the maximum number of records to consume from Kafka in a single request. Use this property to improve connector performance in cases when the connector cannot send records to the sink system. The default is 500 records. **Transforms** - **Single Message Transforms**: To add a new SMT, see [Add transforms](single-message-transforms.md#cc-single-message-transforms-ui). For more information about unsupported SMTs, see [Unsupported transformations](single-message-transforms.md#cc-single-message-transforms-unsupported-transforms). See [Configuration Properties](#cc-amazon-cloudwatch-metrics-sink-config-properties) for all property values and definitions. 2. Click **Continue**. #### NOTE Configuration properties that are not shown in the Cloud Console use the default values. See [Configuration Properties](#cc-amazon-dynamodb-sink-config-properties) for all property values and definitions. 1. Select the **Input Kafka record value** (data coming from the Kafka topic): AVRO, JSON_SR, PROTOBUF, or JSON. A valid schema must be available in [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) to use a schema-based message format (for example, Avro, JSON Schema, or Protobuf). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information. 2. In the **DynamoDB hash key** and **DynamoDB sort key** fields, enter the hash key and sort key, respectively. By default, the Kafka partition number is used for the hash key and the record offset is used as the sort key. For a few examples of how these keys work with other record references, see [DynamoDB hash keys and sort keys](#cc-amazon-dynamodb-sink-hash-sort). Note that the maximum size of a partition using the default configuration is limited to 10 GB (defined by Amazon DynamoDB). ### **Show advanced configurations** - **Schema context**: Select a schema context to use for this connector, if using a schema-based data format. This property defaults to the **Default** context, which configures the connector to use the default schema set up for Schema Registry in your Confluent Cloud environment. A schema context allows you to use separate schemas (like schema sub-registries) tied to topics in different Kafka clusters that share the same Schema Registry environment. For example, if you select a non-default context, a **Source** connector uses only that schema context to register a schema and a **Sink** connector uses only that schema context to read from. For more information about setting up a schema context, see [What are schema contexts and when should you use them?](../sr/faqs-cc.md#faq-schema-contexts). - **Table name format**: A format string for the destination table name, which may contain `${topic}` as a placeholder for the originating topic name. For example, to create a table named `kafka-orders` based on a Kafka topic named `orders`, you would enter `kafka-${topic}` in this field. **Auto-restart policy** - **Enable Connector Auto-restart**: Control the auto-restart behavior of the connector and its task in the event of user-actionable errors. Defaults to `true`, enabling the connector to automatically restart in case of user-actionable errors. Set this property to `false` to disable auto-restart for failed connectors. In such cases, you would need to manually restart the connector. **Transforms** - **Single Message Transforms**: To add a new SMT, see [Add transforms](single-message-transforms.md#cc-single-message-transforms-ui). 
For more information about unsupported SMTs, see [Unsupported transformations](single-message-transforms.md#cc-single-message-transforms-unsupported-transforms). See [Configuration Properties](#cc-amazon-dynamodb-sink-config-properties) for all property values and definitions.

3. Click **Continue**.

#### NOTE

Configuration properties that are not shown in the Cloud Console use the default values. See [Configuration Properties](#cc-amazon-redshift-sink-config-properties) for all property values and definitions.

1. Select the **Input Kafka record value** format (data coming from the Kafka topic): AVRO, JSON_SR (JSON Schema), or PROTOBUF. A valid schema must be available in [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) to use a schema-based message format (for example, Avro, JSON_SR (JSON Schema), or Protobuf). See [Schema Registry Enabled Environments](limits.md#connect-ccloud-environment-limits) for additional information.

### **Show advanced configurations**

- **Schema context**: Select a schema context to use for this connector, if using a schema-based data format. This property defaults to the **Default** context, which configures the connector to use the default schema set up for Schema Registry in your Confluent Cloud environment. A schema context allows you to use separate schemas (like schema sub-registries) tied to topics in different Kafka clusters that share the same Schema Registry environment. For example, if you select a non-default context, a **Source** connector uses only that schema context to register a schema, and a **Sink** connector reads only from that schema context. For more information about setting up a schema context, see [What are schema contexts and when should you use them?](../sr/faqs-cc.md#faq-schema-contexts).
- **Table name format**: A format string for the destination table name, which may contain `${topic}` as a placeholder for the originating topic name. For example, to create a table named `kafka-orders` based on a Kafka topic named `orders`, you would enter `kafka-${topic}` in this field.
- **Database timezone**: Name of the JDBC timezone that should be used in the connector when inserting time-based values.
- **Batch size**: Specifies how many records to attempt to batch together for insertion into the destination table.
- **Auto create table**: Whether to automatically create the destination table if it is missing.
- **Auto add columns**: Whether to automatically add columns in the table if they are missing.

**Auto-restart policy**

- **Enable Connector Auto-restart**: Control the auto-restart behavior of the connector and its task in the event of user-actionable errors. Defaults to `true`, enabling the connector to automatically restart in case of user-actionable errors. Set this property to `false` to disable auto-restart for failed connectors. In such cases, you would need to manually restart the connector.

**Consumer configuration**

- **Max poll interval(ms)**: Set the maximum delay between subsequent consume requests to Kafka. Use this property to improve connector performance in cases when the connector cannot send records to the sink system. The default is 300,000 milliseconds (5 minutes).
- **Max poll records**: Set the maximum number of records to consume from Kafka in a single request. Use this property to improve connector performance in cases when the connector cannot send records to the sink system. The default is 500 records.
**Transforms**

- **Single Message Transforms**: To add a new SMT, see [Add transforms](single-message-transforms.md#cc-single-message-transforms-ui). For more information about unsupported SMTs, see [Unsupported transformations](single-message-transforms.md#cc-single-message-transforms-unsupported-transforms).

See [Configuration Properties](#cc-amazon-redshift-sink-config-properties) for all property values and definitions.

2. Click **Continue**.

#### NOTE

* This Quick Start is for the fully-managed Confluent Cloud connector. If you are installing the connector locally for Confluent Platform, see [Amazon SQS Source Connector for Confluent Platform](https://docs.confluent.io/kafka-connectors/sqs/current/).
* If you require private networking for fully-managed connectors, make sure to set up the proper networking beforehand. For more information, see [Manage Networking for Confluent Cloud Connectors](networking/internet-resource.md#clusters-connect-cloud).

The connector converts an Amazon SQS message into a Kafka record, with the following structure:

* The key encodes the SQS queue name and message ID in a struct. For FIFO queues, it also includes the message group ID.
* The value encodes the body of the SQS message and various message attributes in a struct.
* Each header encodes message attributes that may be present in the SQS message.

For record schema details, see [Record Schemas](#cc-amazon-sqs-record-schemas).

For **standard queues**, the connector supports best-effort ordering guarantees, which means there is a chance that records will end up in a different order in Kafka. For **FIFO queues**, the connector guarantees that records are inserted into Kafka in the order they were inserted into Amazon SQS, as long as the destination Kafka topic has exactly one partition. If the destination topic has more than one partition, you can use a [Single Message Transform (SMT)](single-message-transforms.md#cc-single-message-transforms) to set the partition based on the `MessageGroupId` field in the key.

Note that the connector provides **at-least-once delivery**, which means the connector can introduce duplicate records in Kafka for both standard and FIFO queues.

#### NOTE

Configuration properties that are not shown in the Cloud Console use the default values. See [Configuration Properties](#cc-amazon-lambda-sink-config-properties) for all property values and definitions.

1. Select the **Input Kafka record value** format (data coming from the Kafka topic): AVRO, JSON_SR, PROTOBUF, JSON, or BYTES. A valid schema must be available in [Schema Registry](../get-started/schema-registry.md#cloud-sr-config) to use a schema-based message format.

### **Show advanced configurations**

- **Schema context**: Select a schema context to use for this connector, if using a schema-based data format. This property defaults to the **Default** context, which configures the connector to use the default schema set up for Schema Registry in your Confluent Cloud environment. A schema context allows you to use separate schemas (like schema sub-registries) tied to topics in different Kafka clusters that share the same Schema Registry environment.
- **AWS Lambda invocation type**: The mode in which the AWS Lambda function is invoked. Two modes are supported: **sync** and **async**. For more details about Lambda invocation, see [Synchronous invocation](https://docs.aws.amazon.com/lambda/latest/dg/invocation-sync.html) or [Asynchronous invocation](https://docs.aws.amazon.com/lambda/latest/dg/invocation-async.html). - **Batch size**: The maximum number of Kafka records to combine in a single AWS Lambda function invocation. You should set this as high as possible, without exceeding AWS Lambda invocation payload limits. To disable batching of records, set this value to 1. - **Record Converter Class**: Record converter class to convert Kafka records to AWS Lambda payload. **Auto-restart policy** - **Enable Connector Auto-restart**: Control the auto-restart behavior of the connector and its task in the event of user-actionable errors. Defaults to `true`, enabling the connector to automatically restart in case of user-actionable errors. Set this property to `false` to disable auto-restart for failed connectors. In such cases, you would need to manually restart the connector. **Consumer configuration** - **Max poll interval(ms)**: Set the maximum delay between subsequent consume requests to Kafka. Use this property to improve connector performance in cases when the connector cannot send records to the sink system. The default is 300,000 milliseconds (5 minutes). - **Max poll records**: Set the maximum number of records to consume from Kafka in a single request. Use this property to improve connector performance in cases when the connector cannot send records to the sink system. The default is 500 records. **Transforms** - **Single Message Transforms**: To add a new SMT, see [Add transforms](single-message-transforms.md#cc-single-message-transforms-ui). For more information about unsupported SMTs, see [Unsupported transformations](single-message-transforms.md#cc-single-message-transforms-unsupported-transforms). See [Configuration Properties](#cc-amazon-lambda-sink-config-properties) for all property values and definitions. 2. Click **Continue**. #### JDBC-based Source Connectors and the MongoDB Atlas Source Connector The [Source connector service account](#cloud-service-account-source-connectors) section provides basic ACL entries for source connector service accounts. Several source connectors allow a topic prefix. When a prefix is used and the following connectors are created using the CLI or API, you need to add ACL entries. 
* [MySQL Source (JDBC) Connector for Confluent Cloud](cc-mysql-source.md#cc-mysql-source) * [PostgreSQL Source (JDBC) Connector for Confluent Cloud](cc-postgresql-source.md#cc-postgresql-source) * [Microsoft SQL Server Source (JDBC) Connector for Confluent Cloud](cc-microsoft-sql-server-source.md#cc-microsoft-sql-server-source) * [Oracle Database Source (JDBC) Connector for Confluent Cloud](cc-oracle-db-source.md#cc-oracle-db-source) * [Get Started with the MongoDB Atlas Source Connector for Confluent Cloud](cc-mongo-db-source.md#cc-mongo-db-source) * [Snowflake Source Connector for Confluent Cloud](cc-snowflake-source/cc-snowflake-source.md#cc-snowflake-source) Add the following ACL entries for these source connectors: ```none confluent kafka acl create --allow --service-account "" --operations create --prefix --topic "" ``` ```none confluent kafka acl create --allow --service-account "" --operations write --prefix --topic "" ``` ### Datagen Source ```json { "connector.class": "DatagenSource", "kafka.api.key": "${KEY}", "kafka.api.secret": "${SECRET}", "kafka.topic": "datagen-source-smt-insert-field", "max.interval": "3000", "name": "DatagenSourceSmtInsertField", "output.data.format": "JSON", "quickstart": "ORDERS", "tasks.max": "1", "transforms": "insert", "transforms.insert.type": "org.apache.kafka.connect.transforms.InsertField$Value", "transforms.insert.partition.field": "PartitionField", "transforms.insert.static.field": "InsertedStaticField", "transforms.insert.static.value": "SomeValue", "transforms.insert.timestamp.field": "TimestampField", "transforms.insert.topic.field": "TopicField" } ``` ### Datagen Source ```json { "connector.class": "DatagenSource", "kafka.api.key": "${KEY}", "kafka.api.secret": "${SECRET}", "kafka.topic": "datagen-source-smt-set-schema-metadata", "max.interval": "3000", "name": "DatagenSourceSmtSetSchemaMetadata", "output.data.format": "AVRO", "quickstart": "ORDERS", "tasks.max": "1", "transforms": "setSchemaMetadata", "transforms.setSchemaMetadata.type": "org.apache.kafka.connect.transforms.SetSchemaMetadata$Value", "transforms.setSchemaMetadata.schema.name": "schema_name", "transforms.setSchemaMetadata.schema.version": "12" } ``` ### Routes configuration Routes are Confluent Gateway endpoints where client applications connect to stream data. Confluent Gateway uses routes to define how client applications connect to Kafka clusters. Clients connect to the Gateway as if it were a Kafka cluster, while the Gateway handles routing and governance. ```yaml gateway: routes: - name: --- [1] endpoint: --- [2] brokerIdentificationStrategy: --- [3] type: --- [4] pattern: --- [5] streamingDomain: --- [6] name: --- [7] bootstrapServerId: --- [8] security: --- [9] ``` * [1] The unique name for the route. * [2] The `host:port` combination that Confluent Gateway will listen on. This is the external address clients use to bootstrap to the Kafka cluster. * [3] Specifies the strategy for mapping client requests to a specific Kafka broker. * [4] The type of broker identification strategy. Set to `port` (default) or `host`. * `port` strategy: Each Kafka broker is identified using a unique port number. This is the default strategy. Clients connect to different ports to reach specific brokers (for example, port 9092 to connect with broker-0, port 9093 to connect with broker-1). The `nodeIdRanges` for the streaming domain you set in [Streaming domains configuration](#gateway-config-streaming-domains-docker) is used. 
`nodeIdRanges` should be present in all of the clusters associated with route’s streaming domain. * `host` strategy: Each Kafka broker is represented using a unique hostname. Clients use different host names to reach specific brokers (for example, `broker-0.kafka.company.com`, `broker-1.kafka.company.com`), and the gateway routes based on the SNI header. The `pattern` setting ([5]) is used. * [5] The pattern for the broker identification strategy. Required if the type ([4]) is `host`. For example, `broker-$(nodeId).eu-gw.sales.example.com:9092`. * [6] The reference to a streaming domain. * [7] The name of the streaming domain. Must be a valid name from the `gateway.streamingDomains[].name`. * [8] The bootstrap server ID. Must match `kafkaCluster.bootstrapServers[].id`. * [9] The security configuration. See [Configure Security for Confluent Cloud Gateway](gateway-security.md#gateway-security-docker) section for details. An example configuration for Confluent Gateway routes: ```yaml routes: - name: eu-sales endpoint: eu-gw.sales.example.com:9092 brokerIdentificationStrategy: type: host pattern: broker-$(nodeId).eu-gw.sales.example.com:9092 streamingDomain: name: sales bootstrapServerId: SASL_SSL-1 ``` ## Client switchover between Kafka clusters To perform client switchover: 1. Have your Confluent Gateway configured with two Streaming Domains for two Kafka clusters. For example: ```yaml streamingDomains: - name: kafka1-domain type: kafka kafkaCluster: name: kafka-cluster-1 bootstrapServers: - id: internal-plaintext-listener endpoint: "kafka-1:44444" - name: kafka2-domain type: kafka kafkaCluster: name: kafka-cluster-2 bootstrapServers: - id: internal-plaintext-listener endpoint: "kafka-2:22222" ``` 2. Reconfigure the Route to point to the destination Kafka cluster by updating the Streaming Domain and corresponding bootstrap server ID in the Route. For example, the following configuration points the `switchover-route` to the `kafka1-domain` streaming domain, and the clients send and receive messages from the source Kafka cluster, `kafka-cluster-1`. ```yaml streamingDomains: - name: kafka1-domain type: kafka kafkaCluster: name: kafka-cluster-1 bootstrapServers: - id: internal-plaintext-listener endpoint: "kafka-1:44444" - name: kafka2-domain type: kafka kafkaCluster: name: kafka-cluster-2 bootstrapServers: - id: internal-plaintext-listener endpoint: "kafka-2:22222" routes: - name: switchover-route endpoint: "host.docker.internal:19092" streamingDomain: name: kafka1-domain bootstrapServerId: internal-plaintext-listener ``` When you update the `switchover-route` to point to the `kafka2-domain` streaming domain, the clients will start sending and receiving new messages from the destination Kafka cluster, `kafka-cluster-2`. ```yaml streamingDomains: - name: kafka1-domain type: kafka kafkaCluster: name: kafka-cluster-1 bootstrapServers: - id: internal-plaintext-listener endpoint: "kafka-1:44444" - name: kafka2-domain type: kafka kafkaCluster: name: kafka-cluster-2 bootstrapServers: - id: internal-plaintext-listener endpoint: "kafka-2:22222" routes: - name: switchover-route endpoint: "host.docker.internal:19092" streamingDomain: name: kafka2-domain bootstrapServerId: internal-plaintext-listener ``` 3. Stop and restart the Confluent Gateway container. When the Confluent Gateway container is restarted, the clients continue sending and receiving new messages from the destination Kafka cluster. No changes are required on the producer or consumer side. 
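If you run Confluent Gateway with Docker Compose, the stop-and-restart in step 3 can be done with standard Compose commands. The following is a minimal sketch that assumes the Gateway service in your Compose file is named `gateway`; adjust the service name to match your setup.

```bash
# Restart the Gateway service so it picks up the updated route configuration.
# Assumes the Confluent Gateway service in docker-compose.yml is named "gateway".
docker compose restart gateway

# Or stop and start the service explicitly.
docker compose stop gateway
docker compose up -d gateway
```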
# Configure Security for Confluent Cloud Gateway This section provides details on the following security configurations for Confluent Cloud Gateway (Confluent Gateway) using Docker Compose. * [Authentication](#gateway-auth-docker) * [TLS/SSL](#gateway-ssl-docker) * [Secret stores](#gateway-secret-stores-docker) * [Passwords](#gateway-password-docker) For the security configuration steps using Confluent for Kubernetes (CFK), see [Configure Security for Confluent Gateway using CFK](https://docs.confluent.io/operator/current/gateway/co-gateway-security.html). The top-level layout for the Confluent Gateway security configuration is as follows: ```yaml gateway: secretStores: streamingDomains: kafkaCluster: bootstrapServers: - id: endpoint: ssl: routes: - name: security: auth: ssl: swapConfig: ``` For `streamingDomains.kafkaCluster.bootstrapServers.ssl` and `routes.security.ssl`, see the [SSL configuration](#gateway-ssl-docker) section. #### **Cluster authentication for authentication swapping** Configure how Confluent Gateway authenticates to the Kafka cluster for authentication swapping. **SASL authentication** ```yaml gateway: routes: - name: security: auth: swap swapConfig: clusterAuth: sasl: mechanism: --- [1] callbackHandlerClass: --- [2] jaasConfig: file: --- [3] oauth: tokenEndpointUri: --- [4] connectionsMaxReauthMs: --- [5] ``` * [1] The SASL mechanism to use. Set to `PLAIN` for SASL/PLAIN authentication, or set to `OAUTHBEARER` for SASL/OAUTHBEARER authentication. * [2] The callback handler class to use. Set to `org.apache.kafka.common.security.plain.PlainServerCallbackHandler` for SASL/PLAIN authentication. * [3] The path to the JAAS configuration file. * [4] The URI for the OAuth token endpoint. * [5] The maximum re-authentication time in milliseconds. **JAAS configuration file content for SASL/PLAIN authentication** ```properties org.apache.kafka.common.security.plain.PlainLoginModule required username="%s" password="%s"; ``` **JAAS configuration file content for SASL/OAUTHBEARER authentication** ```properties org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required clientId="%s" clientSecret="%s"; ``` ## Access a ksqlDB cluster by using an API key Confluent Cloud ksqlDB supports authentication with a Confluent Cloud API key. You can use an API key to access the hosted ksqlDB cluster by using the ksqlDB CLI or HTTPS requests. Run the `confluent ksql cluster list` command to get the URL of the ksqlDB endpoint. ```bash confluent ksql cluster list ``` Your output should resemble: ```none ID | Name | Topic Prefix | Kafka Cluster | Storage | Endpoint | Status ---------------+-------------+--------------+--------------------+---------+----------------------------------------------------------+--------- lksqlc-ab123 | ksqldb-app1 | pksqlc-zz321 | lkc-bc456j | 500 | https://pksqlc-zz321.us-central1.gcp.confluent.cloud:443 | UP ``` Follow these guidelines for both the ksqlDB CLI and REST API commands: - For ``, use the endpoint value provided by the `confluent ksql cluster list` command, for example, `https://pksqlc-zz321.us-central1.gcp.confluent.cloud:443`. - For `` and ``, use an API key provided by the `confluent api-key create --resource ` command. #### IMPORTANT You must use a resource-specific key created for the ksqlDB cluster. API keys for Confluent Cloud or the Kafka cluster don’t work and cause an authorization error. 
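As an illustration, a minimal HTTPS request to the ksqlDB endpoint with `curl` might look like the following. The endpoint shown is the example value from the output above, and the `SHOW STREAMS;` statement and the placeholder key and secret are assumptions for this sketch; substitute your own endpoint and a resource-specific ksqlDB API key and secret.

```bash
# Send a ksqlDB statement to the cluster endpoint using basic authentication
# with a resource-specific ksqlDB API key and secret (placeholders shown).
curl --silent --request POST \
  --url 'https://pksqlc-zz321.us-central1.gcp.confluent.cloud:443/ksql' \
  --user '<ksqldb-api-key>:<ksqldb-api-secret>' \
  --header 'Content-Type: application/vnd.ksql.v1+json; charset=utf-8' \
  --data '{"ksql": "SHOW STREAMS;", "streamsProperties": {}}'
```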
## Set up the cluster

In the following steps, you install the Confluent CLI and use it to sign in to Confluent Cloud, get the cluster endpoint, and create a topic and an API key that you will use to configure the MQTT proxy.

1. Install the Confluent CLI as described in [the Confluent CLI installation guide](https://docs.confluent.io/confluent-cli/current/install.html). For a list of all of the Confluent CLI commands, see [Confluent CLI Command Reference](https://docs.confluent.io/confluent-cli/current/command-reference/index.html).

2. Sign in to your Confluent Cloud cluster.

```bash
confluent login
```

Your output should resemble:

```none
Enter your Confluent credentials:
Email: jdoe@myemail.io
Password: ***********************

Logged in as "jdoe@myemail.io"
Using environment "t118" ("default")
```

3. Run the `confluent kafka cluster list` command to get the Kafka cluster ID.

```bash
confluent kafka cluster list
```

Your output should resemble:

```bash
     Id     | Name |   Type    | Cloud |  Region  | Availability | Status
------------+------+-----------+-------+----------+--------------+---------
  lkc-m1234 | Dev  | BASIC     | gcp   | us-west4 | single-zone  | UP
  lkc-r1234 | Test | BASIC     | gcp   | us-east4 | single-zone  | UP
  lkc-g1234 | Prod | DEDICATED | gcp   | us-west4 | single-zone  | UP
```

4. Set the active Kafka cluster. In this example, the cluster ID is `lkc-m1234`.

```bash
confluent kafka cluster use lkc-m1234
```

5. Run the `confluent kafka cluster describe` command to get the endpoint for your Confluent Cloud cluster.

```bash
confluent kafka cluster describe
```

Your output should resemble:

```text
+--------------+--------------------------------------------------------+
| Id           | lkc-m1234                                              |
| Name         | mqtt-proxy-quickstart                                  |
| Type         | BASIC                                                  |
| Ingress      | 100                                                    |
| Egress       | 100                                                    |
| Storage      | 5000                                                   |
| Cloud        | gcp                                                    |
| Availability | single-zone                                            |
| Region       | us-west2                                               |
| Status       | UP                                                     |
| Endpoint     | SASL_SSL://pkc-12345.us-west2.gcp.confluent.cloud:9092 |
| ApiEndpoint  | https://pkac-12345.us-west2.gcp.confluent.cloud        |
+--------------+--------------------------------------------------------+
```

Save the `Endpoint` value, which you’ll use to configure the bootstrap server for the MQTT Proxy.

6. Create a Kafka topic that the MQTT proxy will produce to. Use the Confluent CLI to create a topic named `temperature`.

```bash
confluent kafka topic create temperature
```

7. Create a Kafka API key and secret that the MQTT proxy can use to access Confluent Cloud. You must specify the cluster with the `resource` flag for this step.

```bash
confluent api-key create --resource lkc-m1234
```

Your output should resemble:

```text
It may take a couple of minutes for the API key to be ready.
Save the API key and secret. The secret is not retrievable later.
+---------+------------------------------------------------------------------+
| API Key | ABCXQHYDZXMMUDEF                                                 |
| Secret  | aBCde3s54+4Xv36YKPLDKy2aklGr6x/ShUrEX5D1Te4AzRlphFlr6eghmPX81HTF |
+---------+------------------------------------------------------------------+
```

#### IMPORTANT

**Save the API key and secret.** You need this information to configure your applications that communicate with Confluent Cloud. This is the *only* time that you can access, view, and save the key and secret.

## Set up your environment and run a Table API program

Use [uv](https://docs.astral.sh/uv/) to create a virtual environment that contains all required dependencies and project files.

1. Use one of the following commands to install uv.
```bash curl -LsSf https://astral.sh/uv/install.sh | sh # or brew install uv # or pip install uv ``` 2. Create a new virtual environment. ```bash uv venv --python 3.11 ``` 3. Copy the following code into a file named `hello_table_api.py`. ```python # /// script # requires-python = ">=3.9,<3.12" # dependencies = [ # "confluent-flink-table-api-python-plugin>=2.1-8", # ] # /// from pyflink.table.confluent import ConfluentSettings, ConfluentTools from pyflink.table import TableEnvironment, Row from pyflink.table.expressions import col, row def run(): # Set up the connection to Confluent Cloud settings = ConfluentSettings.from_global_variables() env = TableEnvironment.create(settings) # Run your first Flink statement in Table API env.from_elements([row("Hello world!")]).execute().print() # Or use SQL env.sql_query("SELECT 'Hello world!'").execute().print() # Structure your code with Table objects - the main ingredient of Table API. table = env.from_path("examples.marketplace.clicks") \ .filter(col("user_agent").like("Mozilla%")) \ .select(col("click_id"), col("user_id")) table.print_schema() print(table.explain()) # Use the provided tools to test on a subset of the streaming data expected = ConfluentTools.collect_materialized_limit(table, 50) actual = [Row(42, 500)] if expected != actual: print("Results don't match!") if __name__ == "__main__": run() ``` 4. Run the following command to execute the Table API program from the directory where you created `hello_table_api.py`. ```bash uv run hello_table_api.py ``` ## Step 5. Deploy a Flink SQL statement To use Flink, you must create a Flink compute pool. A compute pool represents a set of compute resources that are bound to a region and are used to run your Flink SQL statements. For more information, see [Compute Pools](../concepts/compute-pools.md#flink-sql-compute-pools). 1. Create a new compute pool by adding the following code to “main.tf”. ```terraform # Create a Flink compute pool to execute a Flink SQL statement. resource "confluent_flink_compute_pool" "my_compute_pool" { display_name = "my_compute_pool" cloud = local.cloud region = local.region max_cfu = 10 environment { id = confluent_environment.my_env.id } depends_on = [ confluent_environment.my_env ] } ``` 2. Create a Flink-specific API key, which is required for submitting statements to Confluent Cloud, by adding the following code to “main.tf”. ```terraform # Create a Flink-specific API key that will be used to submit statements. data "confluent_flink_region" "my_flink_region" { cloud = local.cloud region = local.region } resource "confluent_api_key" "my_flink_api_key" { display_name = "my_flink_api_key" owner { id = confluent_service_account.my_service_account.id api_version = confluent_service_account.my_service_account.api_version kind = confluent_service_account.my_service_account.kind } managed_resource { id = data.confluent_flink_region.my_flink_region.id api_version = data.confluent_flink_region.my_flink_region.api_version kind = data.confluent_flink_region.my_flink_region.kind environment { id = confluent_environment.my_env.id } } depends_on = [ confluent_environment.my_env, confluent_service_account.my_service_account ] } ``` 3. Deploy a Flink SQL statement on Confluent Cloud by adding the following code to “main.tf”. The statement consumes data from `examples.marketplace.orders`, aggregates in 1 minute windows and ingests the filtered data into `sink_topic`. Because you’re using a Service Account, the statement runs in Confluent Cloud continuously until manually stopped. 
```terraform
# Deploy a Flink SQL statement to Confluent Cloud.
resource "confluent_flink_statement" "my_flink_statement" {
  organization {
    id = data.confluent_organization.my_org.id
  }
  environment {
    id = confluent_environment.my_env.id
  }
  compute_pool {
    id = confluent_flink_compute_pool.my_compute_pool.id
  }
  principal {
    id = confluent_service_account.my_service_account.id
  }
  # This SQL reads data from source_topic, filters it, and ingests the filtered data into sink_topic.
  statement = <<EOT
  ...
  EOT
}
```

```bash
export ENV_REGION_ID="." # example: "env-z3y2x1.aws.us-east-1"
```

The `ENV_REGION_ID` variable is a concatenation of your environment ID and the cloud provider region of your Kafka cluster, separated by a `.` character. To see the available regions, run the `confluent flink region list` command.

3. Run the following command to send a POST request to the `api-keys` endpoint. The REST API uses basic authentication, which means that you provide a base64-encoded string made from your Cloud API key and secret in the request header.

```bash
curl --request POST \
  --url 'https://api.confluent.cloud/iam/v2/api-keys' \
  --header "Authorization: Basic $(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)" \
  --header 'content-type: application/json' \
  --data "{\"spec\":{\"display_name\":\"flinkapikey\",\"owner\":{\"id\":\"${PRINCIPAL_ID}\"},\"resource\":{\"api_version\":\"fcpm/v2\",\"id\":\"${ENV_REGION_ID}\"}}}"
```

Your output should resemble:

```json
{
  "api_version": "iam/v2",
  "id": "KJDYFDMBOBDNQEIU",
  "kind": "ApiKey",
  "metadata": {
    "created_at": "2023-12-15T23:10:20.406556Z",
    "resource_name": "crn://api.confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/user=u-lq1dr3/api-key=KJDYFDMBOBDNQEIU",
    "self": "https://api.confluent.cloud/iam/v2/api-keys/KJDYFDMBOBDNQEIU",
    "updated_at": "2023-12-15T23:10:20.406556Z"
  },
  "spec": {
    "description": "",
    "display_name": "flinkapikey",
    "owner": {
      "api_version": "iam/v2",
      "id": "u-lq1dr3",
      "kind": "User",
      "related": "https://api.confluent.cloud/iam/v2/users/u-lq2dr7",
      "resource_name": "crn://api.confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/user=u-lq2dr7"
    },
    "resource": {
      "api_version": "fcpm/v2",
      "id": "env-z3q9rd.aws.us-east-1",
      "kind": "Region",
      "related": "https://api.confluent.cloud/fcpm/v2/regions?cloud=aws",
      "resource_name": "crn://api.confluent.cloud/organization=b0b21724-4586-4a07-b787-d0bb5aacbf87/environment=env-z3q9rd/flink-region=aws.us-east-1"
    },
    "secret": "B0BYFzyd0bb5Q58ZZJJYV52mbwDDHnZx21f0gOTz2k6Qv2V9I4KraVztwFOlQx6z"
  }
}
```
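If you want to confirm that the key was created, you can read it back by ID from the same `api-keys` endpoint. This is a minimal sketch; the key ID below is the example `id` value from the response above, so substitute your own.

```bash
# Read the newly created API key by ID (the secret is only returned at creation time).
curl --request GET \
  --url 'https://api.confluent.cloud/iam/v2/api-keys/KJDYFDMBOBDNQEIU' \
  --header "Authorization: Basic $(echo -n "${CLOUD_API_KEY}:${CLOUD_API_SECRET}" | base64 -w 0)"
```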
## AI_TOOL_INVOKE

Invoke a registered tool, either externally by using an MCP server or locally by using a [UDF](../../concepts/user-defined-functions.md#flink-sql-udfs), as part of an AI workflow.

Syntax:

```sql
AI_TOOL_INVOKE(model_name, input_prompt, remote_udf_descriptor, mcp_tool_descriptor [, invocation_config]);
```

Description:

The AI_TOOL_INVOKE function enables large language models (LLMs) to access various tools. The LLM decides which tools should be accessed, then the AI_TOOL_INVOKE function invokes the tools, gets the responses, and returns the responses to the LLM. The function returns a map that includes all the tools that were accessed, along with their responses and the status of each call, indicating whether it was a SUCCESS or FAILURE.

The following models are supported:

- Anthropic
- AzureOpenAI
- Gemini
- OpenAI

#### NOTE

The AI_TOOL_INVOKE function is available for preview. A Preview feature is a Confluent Cloud component that is being introduced to gain early feedback from developers. Preview features can be used for evaluation and non-production testing purposes or to provide feedback to Confluent. The warranty, SLA, and Support Services provisions of your agreement with Confluent do not apply to Preview features. Confluent may discontinue providing preview releases of the Preview features at any time in Confluent’s sole discretion.

Configuration:

- `model_name`: Name of the model entity to call [STRING].
- `input_prompt`: Input prompt to pass to the LLM [STRING].
- `remote_udf_descriptor`: Map to pass UDF names as keys and function descriptions as values [MAP]. A maximum of 3 UDFs can be passed.
- `mcp_tool_descriptor`: Map to pass MCP tool names as keys and tool descriptions as values [MAP]. A maximum of 5 tools can be passed. This additional description is passed to the LLM as an “Additional description” if the MCP server already has a description for the tool; if the server doesn’t have a description, `mcp_tool_descriptor` is added as the description. You can leave it empty, in which case no changes are made to the description provided by the server.
- `invocation_config` (optional): Map to pass the config to manage function behavior, for example, `MAP['debug', true, 'on_error', 'continue']`.

Example:

The following example shows how to invoke a UDF and a registered external tool or API as part of an AI workflow.

When you create an MCP server connection, specify the following options:

- `endpoint`: Defines the base URL for all non-SSE communications with the MCP server, including other HTTP calls and general data exchange.
- `sse-endpoint`: Specifies the explicit URL endpoint used to establish a Server-Sent Events (SSE) connection with the MCP server. If omitted, the client defaults to constructing the SSE endpoint by appending `/sse` to the domain specified in `endpoint`.
- `transport-type`: Specifies the transport type to use for the connection. Valid values are `SSE` and `STREAMABLE_HTTP`. The default is `SSE`.

```sql
-- Create an MCP server connection.
CREATE CONNECTION claims_mcp_server WITH (
  'type' = 'mcp_server',
  'endpoint' = 'https://mcp.deepwiki.com',
  'sse-endpoint' = 'https://mcp.deepwiki.com/sse',
  'api-key' = 'api_key'
);
```

```sql
-- Create a model that uses the MCP server connection.
CREATE MODEL tool_invoker
INPUT (input_message STRING)
OUTPUT (tool_calls STRING)
WITH(
  'provider' = 'openai',
  'openai.connection' = 'openai_connection',
  'openai.system_prompt' = 'Select the best tools to complete the task',
  'mcp.connection' = 'claims_mcp_server'
);

-- Create a table that contains the input prompts.
CREATE TABLE claims_verified (
  id INT,
  customer_id INT
);

-- Run the AI_TOOL_INVOKE function.
SELECT
  id,
  customer_id,
  AI_TOOL_INVOKE(
    'tool_invoker',
    customer_id,
    MAP['udf_1', 'udf_1 description', 'udf_2', 'udf_2 description'],
    MAP['tool_1', 'tool_1_description', 'tool_2', 'tool_2_description']
  ) AS verified_result
FROM claims_verified;
```

## Step 4: Query Iceberg tables from Spark

In this step, you read Iceberg tables created by Tableflow by using [PySpark](https://spark.apache.org/docs/latest/api/python/index.html).

- Ensure that Docker is installed and running in your development environment.

1. Run the following command to start PySpark in a Docker container. In this command, the AWS_REGION option must match your Kafka cluster region, for example, `us-west-2`.
```bash docker run -d \ --name spark-iceberg \ -v $(pwd)/warehouse:/home/iceberg/warehouse \ -v $(pwd)/notebooks:/home/iceberg/notebooks/notebooks \ -e AWS_REGION=${YOUR_CLUSTER_REGION} \ -p 8888:8888 \ -p 8080:8080 \ -p 10000:10000 \ -p 10001:10001 \ tabulario/spark-iceberg ``` Once the container has started successfully, you can access Jupyter notebooks in your browser by going to [http://localhost:8888](http://localhost:8888). ![Screenshot of Jupyter notebooks in PySpark](topics/tableflow/images/tableflow-iceberg-reader.png) 2. Upload the following `ipynb` file by clicking **Upload**. This file pre-populates the notebook that you use to test Tableflow.
tableflow-quickstart.ipynb ```json { "cells": [ { "cell_type": "markdown", "id": "2b3b8256-432a-46a8-8542-837777aada52", "metadata": {}, "source": [ "## Register rest catalog as default catalog for Spark" ] }, { "cell_type": "code", "execution_count": 1, "id": "e4d27656-867c-464e-a8c0-4b590fd7aae2", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "24/05/18 07:27:44 WARN SparkSession: Using an existing Spark session; only runtime SQL configurations will take effect.\n" ] } ], "source": [ "from pyspark.sql import SparkSession\n", "\n", "conf = (\n", " pyspark.SparkConf()\n", " .setAppName('Jupyter')\n", " .set(\"spark.sql.extensions\", \"org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions\")\n", " .set(\"spark.sql.catalog.tableflowdemo\", \"org.apache.iceberg.spark.SparkCatalog\")\n", " .set(\"spark.sql.catalog.tableflowdemo.type\", \"rest\")\n", " .set(\"spark.sql.catalog.tableflowdemo.uri\", \"\")\n", " .set(\"spark.sql.catalog.tableflowdemo.credential\", \":\")\n", " .set(\"spark.sql.catalog.tableflowdemo.io-impl\", \"org.apache.iceberg.aws.s3.S3FileIO\")\n", " .set(\"spark.sql.catalog.tableflowdemo.rest-metrics-reporting-enabled\", \"false\")\n", " .set(\"spark.sql.defaultCatalog\", \"tableflowdemo\")\n", " .set(\"spark.sql.catalog.tableflowdemo.s3.remote-signing-enabled\", \"true\")\n", ")\n", "spark = SparkSession.builder.config(conf=conf).getOrCreate()\n" ] }, { "cell_type": "markdown", "id": "3f7f0ed8-39bf-4ad1-ad72-d2f6e010c4b5", "metadata": {}, "source": [ "## List all the tables in the db" ] }, { "cell_type": "code", "execution_count": null, "id": "89fc7044-4f9e-47f7-8fca-05b15da88a9c", "metadata": {}, "outputs": [], "source": [ "%%sql \n", "SHOW TABLES in ``" ] }, { "cell_type": "markdown", "id": "9282572f-557d-4bb7-9f3e-511e86889304", "metadata": {}, "source": [ "## Query all records in the table" ] }, { "cell_type": "code", "execution_count": null, "id": "345f8fef-9d1f-4cc5-8015-babdf4102988", "metadata": {}, "outputs": [], "source": [ "%%sql \n", "SELECT *\n", "FROM ``.`stock-trades`;" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.18" } }, "nbformat": 4, "nbformat_minor": 5 } ```
A new notebook named **tableflow-quickstart** appears.

3. Double-click the **tableflow-quickstart** notebook to open it.

   ![Screenshot of Jupyter notebooks in PySpark showing the Confluent Tableflow Playground notebook](topics/tableflow/images/tableflow-playground-notebook.png)

4. Update the following properties of the Spark configuration with the values from [Step 3](#cloud-tableflow-quick-start-managed-storage-credentials):

   - `spark.sql.catalog.tableflowdemo.uri`
   - `spark.sql.catalog.tableflowdemo.credential`

5. Update the queries with the information corresponding to your cluster and topics, and then run each cell individually from the **Run** menu.

   #### NOTE
   Query with the cluster ID, not the cluster name.

   You can see the list of tables and table data in the cells’ output.

## How do I connect Confluent Cloud for Flink SQL to a Confluent Cloud Kafka topic?

Connect using Apache Flink® connectors with proper Kafka properties and Schema Registry integration. For more information, see [Stream Processing with Confluent Cloud for Apache Flink](../flink/overview.md#ccloud-flink).

### Create a new service account with an API key/secret pair

1. Run the following command to create a new service account:

   ```bash
   confluent iam service-account create demo-app-1 --description "Service account for demo application" -o json
   ```

2. Verify that your output resembles:

   ```text
   {
     "id": "sa-123456",
     "name": "demo-app-1",
     "description": "Service account for demo application"
   }
   ```

   The value of the service account ID, in this case `sa-123456`, will differ in your output.

3. Create an API key and secret for the service account `sa-123456` for the Kafka cluster `lkc-x6m01` by running the following command:

   ```bash
   confluent api-key create --service-account sa-123456 --resource lkc-x6m01 -o json
   ```

4. Verify that your output resembles:

   ```text
   {
     "key": "ESN5FSNDHOFFSUEV",
     "secret": "nzBEyC1k7zfLvVON3vhBMQrNRjJR7pdMc2WLVyyPscBhYHkMwP6VpPVDTqhctamB"
   }
   ```

   The value of the service account’s API key, in this case `ESN5FSNDHOFFSUEV`, and API secret, in this case `nzBEyC1k7zfLvVON3vhBMQrNRjJR7pdMc2WLVyyPscBhYHkMwP6VpPVDTqhctamB`, will differ in your output.

5. Create a local configuration file `/tmp/client.config` with Confluent Cloud connection information using the newly created Kafka cluster and the API key and secret for the service account. Substitute your values for the bootstrap server and the credentials you just created.

   ```text
   sasl.mechanism=PLAIN
   security.protocol=SASL_SSL
   bootstrap.servers=pkc-4kgmg.us-west-2.aws.confluent.cloud:9092
   sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='ESN5FSNDHOFFSUEV' password='nzBEyC1k7zfLvVON3vhBMQrNRjJR7pdMc2WLVyyPscBhYHkMwP6VpPVDTqhctamB';
   ```

6. Wait about 90 seconds for the Confluent Cloud cluster to be ready and for the service account credentials to propagate.

## Clean up Confluent Cloud resources

1. Complete the following steps to delete the managed connector:

   1. Find the connector ID:

      ```bash
      confluent connect list
      ```

      The output should display something similar to the following. Locate your connector ID; in this case, the connector ID is `lcc-zno83`.

      ```text
            ID    |           Name            | Status  |  Type  | Trace
      ------------+---------------------------+---------+--------+--------
        lcc-zno83 | datagen_ccloud_pageviews  | RUNNING | source |
      ```

   2. Delete the connector, referencing the connector ID from the previous step:

      ```bash
      confluent connect delete lcc-zno83
      ```

      You should see: `Deleted connector "lcc-zno83".`

2.
Run the following command to delete the service account: ```bash confluent iam service-account delete sa-123456 ``` 3. Complete the following steps to delete all the Kafka topics: 1. Delete `demo-topic-1`: ```bash confluent kafka topic delete demo-topic-1 ``` You should see: `Deleted topic "demo-topic-1"`. 2. Delete `demo-topic-2`: ```bash confluent kafka topic delete demo-topic-2 ``` You should see: `Deleted topic "demo-topic-2"`. 3. Delete `demo-topic-3`: ```bash confluent kafka topic delete demo-topic-3 ``` You should see: `Deleted topic "demo-topic-3"`. 4. Run the following command to delete the user API key: ```bash confluent api-key delete QX7X4VA4DFJTTOIA ``` Note that the service account API key was deleted when you deleted the service account. 5. Delete the Kafka cluster: ```bash confluent kafka cluster delete lkc-x6m01 ``` 6. Delete the environment: ```bash confluent environment delete env-5qz2q ``` You should see: `Deleted environment "env-5qz2q"`. If the tutorial ends prematurely, you may receive the following error message when trying to run the example again (`confluent environment create ccloud-stack-000000-beginner-cli`): ```text Error: 1 error occurred: * error creating account: Account name is already in use Failed to create environment ccloud-stack-000000-beginner-cli. Please troubleshoot and run again ``` In this case, run the following script to delete the example’s topics, Kafka cluster, and environment: ```bash ./cleanup.sh ``` ## Flags ```none --file string REQUIRED: Input filename. --overwrite Overwrite existing topics with the same name. --kafka-api-key string Kafka cluster API key. --schema-registry-endpoint string The URL of the Schema Registry cluster. --kafka-endpoint string Endpoint to be used for this Kafka cluster. ``` ## Examples Create a Java client configuration file. ```none confluent kafka client-config create java --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret ``` Create a Java client configuration file with arguments. ```none confluent kafka client-config create java --environment env-123 --cluster lkc-123456 --api-key my-key --api-secret my-secret --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret ``` Create a Java client configuration file, redirecting the configuration to a file and the warnings to a separate file. ```none confluent kafka client-config create java --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret 1> my-client-config-file.config 2> my-warnings-file ``` Create a Java client configuration file, redirecting the configuration to a file and keeping the warnings in the console. ```none confluent kafka client-config create java --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret 1> my-client-config-file.config 2>&1 ``` ## Examples Create a Ktor client configuration file. ```none confluent kafka client-config create ktor --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret ``` Create a Ktor client configuration file with arguments. ```none confluent kafka client-config create ktor --environment env-123 --cluster lkc-123456 --api-key my-key --api-secret my-secret --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret ``` Create a Ktor client configuration file, redirecting the configuration to a file and the warnings to a separate file. 
```none confluent kafka client-config create ktor --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret 1> my-client-config-file.config 2> my-warnings-file ``` Create a Ktor client configuration file, redirecting the configuration to a file and keeping the warnings in the console. ```none confluent kafka client-config create ktor --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret 1> my-client-config-file.config 2>&1 ``` ## Examples Create a Python client configuration file. ```none confluent kafka client-config create python --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret ``` Create a Python client configuration file with arguments. ```none confluent kafka client-config create python --environment env-123 --cluster lkc-123456 --api-key my-key --api-secret my-secret --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret ``` Create a Python client configuration file, redirecting the configuration to a file and the warnings to a separate file. ```none confluent kafka client-config create python --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret 1> my-client-config-file.config 2> my-warnings-file ``` Create a Python client configuration file, redirecting the configuration to a file and keeping the warnings in the console. ```none confluent kafka client-config create python --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret 1> my-client-config-file.config 2>&1 ``` ## Examples Create a REST API client configuration file. ```none confluent kafka client-config create restapi --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret ``` Create a REST API client configuration file with arguments. ```none confluent kafka client-config create restapi --environment env-123 --cluster lkc-123456 --api-key my-key --api-secret my-secret --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret ``` Create a REST API client configuration file, redirecting the configuration to a file and the warnings to a separate file. ```none confluent kafka client-config create restapi --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret 1> my-client-config-file.config 2> my-warnings-file ``` Create a REST API client configuration file, redirecting the configuration to a file and keeping the warnings in the console. ```none confluent kafka client-config create restapi --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret 1> my-client-config-file.config 2>&1 ``` ## Examples Create a Spring Boot client configuration file. ```none confluent kafka client-config create springboot --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret ``` Create a Spring Boot client configuration file with arguments. ```none confluent kafka client-config create springboot --environment env-123 --cluster lkc-123456 --api-key my-key --api-secret my-secret --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret ``` Create a Spring Boot client configuration file, redirecting the configuration to a file and the warnings to a separate file. ```none confluent kafka client-config create springboot --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret 1> my-client-config-file.config 2> my-warnings-file ``` Create a Spring Boot client configuration file, redirecting the configuration to a file and keeping the warnings in the console. 
```none confluent kafka client-config create springboot --schema-registry-api-key my-sr-key --schema-registry-api-secret my-sr-secret 1> my-client-config-file.config 2>&1 ``` ### On-Premises ```none --destination-cluster string Destination cluster ID. --destination-bootstrap-server string Bootstrap server address of the destination cluster. Can alternatively be set in the configuration file using key "bootstrap.servers". --remote-cluster string Remote cluster ID for bidirectional cluster links. --remote-bootstrap-server string Bootstrap server address of the remote cluster for bidirectional links. Can alternatively be set in the configuration file using key "bootstrap.servers". --source-api-key string An API key for the source cluster. For links at destination cluster, this is used for remote cluster authentication. For links at source cluster, this is used for local cluster authentication. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --source-api-secret string An API secret for the source cluster. For links at destination cluster, this is used for remote cluster authentication. For links at source cluster, this is used for local cluster authentication. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --destination-api-key string An API key for the destination cluster. This is used for remote cluster authentication links at the source cluster. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --destination-api-secret string An API secret for the destination cluster. This is used for remote cluster authentication for links at the source cluster. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --remote-api-key string An API key for the remote cluster for bidirectional links. This is used for remote cluster authentication. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --remote-api-secret string An API secret for the remote cluster for bidirectional links. This is used for remote cluster authentication. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --local-api-key string An API key for the local cluster for bidirectional links. This is used for local cluster authentication if remote link's connection mode is Inbound. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. 
--local-api-secret string An API secret for the local cluster for bidirectional links. This is used for local cluster authentication if remote link's connection mode is Inbound. If specified, the cluster will use SASL_SSL with PLAIN SASL as its mechanism for authentication. If you wish to use another authentication mechanism, do not specify this flag, and add the security configurations in the configuration file. --config strings A comma-separated list of "key=value" pairs, or path to a configuration file containing a newline-separated list of "key=value" pairs. --dry-run Validate a link, but do not create it. --no-validate Create a link even if the source cluster cannot be reached. --url string Base URL of REST Proxy Endpoint of Kafka Cluster (include "/kafka" for embedded Rest Proxy). Must set flag or CONFLUENT_REST_URL. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent REST Proxy. --client-cert-path string Path to client cert to be verified by Confluent REST Proxy. Include for mTLS authentication. --client-key-path string Path to client private key, include for mTLS authentication. --no-authentication Include if requests should be made without authentication headers and user will not be prompted for credentials. --prompt Bypass use of available login credentials and prompt for Kafka Rest credentials. --context string CLI context name. ``` ### On-Premises ```none --url string Base URL of REST Proxy Endpoint of Kafka Cluster (include "/kafka" for embedded Rest Proxy). Must set flag or CONFLUENT_REST_URL. --certificate-authority-path string Path to a PEM-encoded Certificate Authority to verify the Confluent REST Proxy. --client-cert-path string Path to client cert to be verified by Confluent REST Proxy. Include for mTLS authentication. --client-key-path string Path to client private key, include for mTLS authentication. --no-authentication Include if requests should be made without authentication headers and user will not be prompted for credentials. --prompt Bypass use of available login credentials and prompt for Kafka Rest credentials. --partitions uint32 Number of topic partitions. --replication-factor uint32 Number of replicas. --config strings A comma-separated list of "key=value" pairs, or path to a configuration file containing a newline-separated list of "key=value" pairs. --if-not-exists Exit gracefully if topic already exists. ``` ### On-Premises | Command | Description | |---------------------------------------------------------------------------------|------------------------------------------------------------| | [confluent audit-log](audit-log/index.md#confluent-audit-log) | Manage audit log configuration. | | [confluent cloud-signup](confluent_cloud-signup.md#confluent-cloud-signup) | Sign up for Confluent Cloud. | | [confluent cluster](cluster/index.md#confluent-cluster) | Retrieve metadata about Confluent Platform clusters. | | [confluent completion](confluent_completion.md#confluent-completion) | Print shell completion code. | | [confluent configuration](configuration/index.md#confluent-configuration) | Configure the Confluent CLI. | | [confluent connect](connect/index.md#confluent-connect) | Manage Kafka Connect. | | [confluent context](context/index.md#confluent-context) | Manage CLI configuration contexts. | | [confluent flink](flink/index.md#confluent-flink) | Manage Apache Flink. | | [confluent iam](iam/index.md#confluent-iam) | Manage RBAC, ACL and IAM permissions. 
| | [confluent kafka](kafka/index.md#confluent-kafka) | Manage Apache Kafka. | | [confluent ksql](ksql/index.md#confluent-ksql) | Manage ksqlDB. | | [confluent local](local/index.md#confluent-local) | Manage a local Confluent Platform development environment. | | [confluent login](confluent_login.md#confluent-login) | Log in to Confluent Cloud or Confluent Platform. | | [confluent logout](confluent_logout.md#confluent-logout) | Log out of Confluent Platform. | | [confluent plugin](plugin/index.md#confluent-plugin) | Manage Confluent plugins. | | [confluent prompt](confluent_prompt.md#confluent-prompt) | Add Confluent CLI context to your terminal prompt. | | [confluent schema-registry](schema-registry/index.md#confluent-schema-registry) | Manage Schema Registry. | | [confluent secret](secret/index.md#confluent-secret) | Manage secrets for Confluent Platform. | | [confluent shell](confluent_shell.md#confluent-shell) | Start an interactive shell. | | [confluent update](confluent_update.md#confluent-update) | Update the Confluent CLI. | | [confluent version](confluent_version.md#confluent-version) | Show version of the Confluent CLI. | ### Produce and consume with Schema Registry Confluent CLI supports producing and consuming with Schema Registry functionalities. You can register a schema with a local file as you produce (writing data in that schema), and then read data from the schema as you consume. 1. In the Schema Registry cluster you registered, retrieve the following information the CLI requires: - Schema Registry endpoint. - The value format. - A local schema file. 2. When using Schema Registry, you must log in to Kafka as the MDS token will be used to authenticate the Schema Registry client: ```text confluent login --ca-cert-path \ --url https://: ``` 3. Produce and consume using the Confluent CLI commands. An example CLI command to produce to `test-topic`: ```text confluent kafka topic produce test-topic \ --protocol SASL_SSL \ --bootstrap ":19091" \ --username admin --password secret \ --value-format avro \ --schema ~/schema.avsc \ --sr-endpoint https://localhost:8085 \ --ca-location scripts/security/snakeoil-ca-1.crt ``` An example CLI command to consume from `test-topic`: ```text confluent kafka topic consume test-topic -b \ --protocol SASL_SSL \ --bootstrap ":19091" \ --username admin --password secret \ --value-format avro \ --sr-endpoint https://localhost:8085 \ --ca-location scripts/security/snakeoil-ca-1.crt ``` - `--schema` is the path to your local schema file. - Specify `--value-format` according to the format of the schema file: `avro`, `json` or `protobuf`. When later consuming, it should also be set to the same value. - `--sr-endpoint` is the endpoint to the Schema Registry cluster. - `--ca-location` is required flag when working with schemas. It’s used to authenticate the Schema Registry client. It might be the same file that you use for SSL verification. ## Use case 1: Confluent Cloud cluster with public networking For users with a simple setup–with Confluent CLI installed and with internet connectivity –you want to do two tasks: 1. List the API keys: ```text confluent api-key list ``` 2. List the topics on the cluster: ```text confluent kafka topic list --environment env-639yqq --cluster lkc-nykw7z ``` In this use case, `lkc-nykw7z` is a basic cluster with public/internet endpoints. Both of the previous commands will egress using your workstation’s default gateway to the internet, but to two different internet endpoints. 
The API Key request goes to an `api.confluent.cloud` endpoint (Control Plane), and the topic list request goes to the broker’s Kafka REST API endpoint (Data Plane). Here are the two requests:

```text
GET https://api.confluent.cloud/iam/v2/api-keys?page_size=100&spec.owner=&spec.resource= HTTP/2.0
```

```text
GET https://pkc-ldvj1.ap-southeast-2.aws.confluent.cloud/kafka/v3/clusters/lkc-nykw7z/topics
```

If network administration and firewall rules prevent direct outbound internet connections and an outbound forward proxy server is required, you must configure the Confluent CLI to use the proxy server. The workstation user must add the following to the `.bashrc` file. Note that you must supply the proxy server host and port, for example, `http://:`:

```text
export HTTPS_PROXY=http://localhost:8080
```

Both of the previous connections will be routed using the proxy.

## Cluster overview page

The overview page for a single Apache Kafka® cluster provides a summary view of the cluster and its connected services.

![Normal mode and Reduced infrastructure mode](images/basics-c3-cluster-overview.png)

The following table describes the panels found on the Clusters page by mode. All of the panels are clickable and navigate you directly to the relevant sections.

| Section | Normal mode | Reduced infrastructure mode |
|---------|-------------|-----------------------------|
| [Brokers overview](brokers.md#c3-brokers-overview-metrics) | Total brokers with production and consumption throughput. | Not visible in Reduced infrastructure mode. |
| [Topics overview](topics/overview.md#c3-all-topics) | Total topics, total partitions, under replicated partitions, out of sync replicas. | Total topics and total partitions. |
| [Connect overview](connect.md#c3-all-connect-clusters-page) | Number of Connect clusters and connector status. | Same as Normal mode. |
| [ksqlDB overview](ksql.md#c3-ksql-clusters-page) | Number of ksqlDB clusters and persistent queries. | Same as Normal mode. |

You can view and edit cluster properties and broker configurations in the Cluster settings pages. When you click the Cluster settings sub-menu for a cluster, the **General** tab appears by default.

### confluent.controlcenter.auth.restricted.roles

Specify a list of roles with limited read-only access. You must include roles added here in `confluent.controlcenter.rest.authentication.roles`. For users that are members of roles included in this list, the following features and options are unavailable:

* Add, delete, pause, or resume connectors
* Browse connectors
* View connector settings
* Upload connector configs
* Create, delete, or edit alerts (triggers or actions)
* Edit a license
* Edit brokers
* Press submit on cluster forms
* Edit, create, or delete schemas
* Edit data flow queries
* [Inspect topics](../topics/messages.md#c3-topic-message-browser)
* Type in the KSQL editor
* [Run or stop ksqlDB queries](../ksql.md#controlcenter-userguide-ksql)
* Add ksqlDB streams or tables

For fine-grained access control, consider configuring [role-based access control (RBAC)](../security/c3-rbac.md#controlcenter-security-rbac).

* Type: list
* Default: “”
* Importance: low

### ZooKeeper Considerations

- You must use a special command to start Prometheus on macOS.

1.
Download the Confluent Platform archive (7.7 to 7.9 supported) and run these commands: ```bash wget https://packages.confluent.io/archive/7.9/confluent-7.9.0.tar.gz ``` ```bash tar -xvf confluent-7.9.0.tar.gz ``` ```bash cd confluent-7.9.0 ``` ```bash export CONFLUENT_HOME=`pwd` ``` 2. Update broker configurations to emit metrics to Prometheus by adding the following configurations to: `etc/kafka/server.properties` ```bash metric.reporters=io.confluent.telemetry.reporter.TelemetryReporter confluent.telemetry.exporter._c3.type=http confluent.telemetry.exporter._c3.enabled=true confluent.telemetry.exporter._c3.metrics.include=io.confluent.kafka.server.request.(?!.*delta).*|io.confluent.kafka.server.server.broker.state|io.confluent.kafka.server.replica.manager.leader.count|io.confluent.kafka.server.request.queue.size|io.confluent.kafka.server.broker.topic.failed.produce.requests.rate.1.min|io.confluent.kafka.server.tier.archiver.total.lag|io.confluent.kafka.server.request.total.time.ms.p99|io.confluent.kafka.server.broker.topic.failed.fetch.requests.rate.1.min|io.confluent.kafka.server.broker.topic.total.fetch.requests.rate.1.min|io.confluent.kafka.server.partition.caught.up.replicas.count|io.confluent.kafka.server.partition.observer.replicas.count|io.confluent.kafka.server.tier.tasks.num.partitions.in.error|io.confluent.kafka.server.broker.topic.bytes.out.rate.1.min|io.confluent.kafka.server.request.total.time.ms.p95|io.confluent.kafka.server.controller.active.controller.count|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.total|io.confluent.kafka.server.request.total.time.ms.p999|io.confluent.kafka.server.controller.active.broker.count|io.confluent.kafka.server.request.handler.pool.request.handler.avg.idle.percent.rate.1.min|io.confluent.kafka.server.session.expire.listener.zookeeper.disconnects.rate.1.min|io.confluent.kafka.server.controller.unclean.leader.elections.rate.1.min|io.confluent.kafka.server.replica.manager.partition.count|io.confluent.kafka.server.controller.unclean.leader.elections.total|io.confluent.kafka.server.partition.replicas.count|io.confluent.kafka.server.broker.topic.total.produce.requests.rate.1.min|io.confluent.kafka.server.controller.offline.partitions.count|io.confluent.kafka.server.socket.server.network.processor.avg.idle.percent|io.confluent.kafka.server.partition.under.replicated|io.confluent.kafka.server.log.log.start.offset|io.confluent.kafka.server.log.tier.size|io.confluent.kafka.server.log.size|io.confluent.kafka.server.tier.fetcher.bytes.fetched.total|io.confluent.kafka.server.request.total.time.ms.p50|io.confluent.kafka.server.tenant.consumer.lag.offsets|io.confluent.kafka.server.session.expire.listener.zookeeper.expires.rate.1.min|io.confluent.kafka.server.log.log.end.offset|io.confluent.kafka.server.broker.topic.bytes.in.rate.1.min|io.confluent.kafka.server.partition.under.min.isr|io.confluent.kafka.server.partition.in.sync.replicas.count|io.confluent.telemetry.http.exporter.batches.dropped|io.confluent.telemetry.http.exporter.items.total|io.confluent.telemetry.http.exporter.items.succeeded|io.confluent.telemetry.http.exporter.send.time.total.millis|io.confluent.kafka.server.controller.leader.election.rate.(?!.*delta).*|io.confluent.telemetry.http.exporter.batches.failed confluent.telemetry.exporter._c3.client.base.url=http://localhost:9090/api/v1/otlp confluent.telemetry.exporter._c3.client.compression=gzip confluent.telemetry.exporter._c3.api.key=dummy confluent.telemetry.exporter._c3.api.secret=dummy 
confluent.telemetry.exporter._c3.buffer.pending.batches.max=80 confluent.telemetry.exporter._c3.buffer.batch.items.max=4000 confluent.telemetry.exporter._c3.buffer.inflight.submissions.max=10 confluent.telemetry.metrics.collector.interval.ms=60000 confluent.telemetry.remoteconfig._confluent.enabled=false confluent.consumer.lag.emitter.enabled=true ``` 3. Download the Control Center archive and run these commands: ```bash wget https://packages.confluent.io/confluent-control-center-next-gen/archive/confluent-control-center-next-gen-2.3.0.tar.gz ``` ```bash tar -xvf confluent-control-center-next-gen-2.3.0.tar.gz ``` ```bash cd confluent-control-center-next-gen-2.3.0 ``` 4. Start Control Center. To start Control Center, you must have three dedicated command windows: one for Prometheus, another for the Control Center process, and a third for Alertmanager. Run the following commands from `CONTROL_CENTER_HOME` in all command windows. 1. Start Prometheus. ```bash bin/prometheus-start ``` 2. Start Alertmanager. ```bash bin/alertmanager-start ``` 3. Start Control Center. ```bash bin/control-center-start etc/confluent-control-center/control-center-dev.properties ``` 5. Start Confluent Platform. Start ZooKeeper. ```bash bin/zookeeper-server-start etc/kafka/zookeeper.properties ``` Start Kafka. ```bash bin/kafka-server-start etc/kafka/server.properties ``` ### Bad security configuration * Check the security configuration for all brokers, Telemetry Reporter, and Control Center (see [debugging check configuration](#check-configurations)). For example, is it SASL_SSL, SASL_PLAINTEXT, SSL? * Possible errors include: ```bash ERROR SASL authentication failed using login context 'Client'. (org.apache.zookeeper.client.ZooKeeperSaslClient) ``` ```bash Caused by: org.apache.kafka.common.KafkaException: java.lang.IllegalArgumentException: No serviceName defined in either JAAS or Kafka configuration ``` ```bash org.apache.kafka.common.errors.IllegalSaslStateException: Unexpected handshake request with client mechanism GSSAPI, enabled mechanisms are [GSSAPI] ``` * Verify that the correct Java Authentication and Authorization Service (JAAS) configuration was detected. * If ACLs are enabled, check them. * To verify that you can communicate with the cluster, try to produce and consume using `console-*` with the same security settings. ### HTTP Basic authentication enabled for Schema Registry Whenever you have HTTP Basic authentication configured for Schema Registry, you must provide a username and password for Control Center to communicate correctly with Schema Registry. For a single cluster or the first cluster in a multi-cluster deployment, set the following properties, where the `user.info` contains a `:` that you have configured for Schema Registry. 
```bash confluent.controlcenter.schema.registry.basic.auth.credentials.source=USER_INFO confluent.controlcenter.schema.registry.basic.auth.user.info=: ``` For multi-cluster deployment, to set the remaining clusters, use: ```bash confluent.controlcenter.schema.registry..basic.auth.credentials.source=USER_INFO confluent.controlcenter.schema.registry..basic.auth.user.info=: ``` A multi-cluster deployment Schema Registry might look like the following: ```bash // first Schema Registry cluster confluent.controlcenter.schema.registry.url= confluent.controlcenter.schema.registry.basic.auth.credentials.source=USER_INFO confluent.controlcenter.schema.registry.basic.auth.user.info=: // additional Schema Registry clusters confluent.controlcenter.schema.registry..url= confluent.controlcenter.schema.registry..basic.auth.credentials.source=USER_INFO confluent.controlcenter.schema.registry..basic.auth.user.info=: ``` See [Schema Registry](/platform/current/security/authentication/http-basic-auth/overview.html#basic-auth-sr) for steps to configure HTTP Basic authentication for Schema Registry. ### HTTP Basic authentication enabled for Connect Whenever you have HTTP Basic authentication configured for Connect, you must provide a username and password for Control Center to communicate correctly with Connect. Set the `confluent.controlcenter.connect..basic.auth.user.info` property to a value that contains `:` that you have configured for Connect. ```bash confluent.controlcenter.connect..basic.auth.user.info=: ``` See [Connect REST API](/platform/current/security/authentication/http-basic-auth/overview.html#basic-auth-kconnect) for steps to configure HTTP Basic authentication for Connect. ## View My Role Assignments To access the Assignments page: 1. [Log in](c3-rbac-login.md#c3-rbac-login) to Control Center. 2. From the Control Center **Administration menu**, click the **View my role assignments** option. 3. Click the **Assignments** tab. ![View My Role Assignments Cluster Level page](images/c3-rbac-my-role-assign.png) Use this page to: - View the clusters for which you have role assignments. - Search for clusters by name and ID. - Filter the cluster view by cluster type: Connect, Kafka, ksqlDB, Schema Registry. #### NOTE If there is only one type of cluster you are authorized for, the **Cluster type** (All clusters) list does not appear. Only the cluster types that you have role permissions for appear in the list. - Drill into a cluster to access the Cluster roles and Resource roles pages where you can view your role assignments for a cluster and its resources. ![View My Connect Cluster Resource Role Assignments page](images/c3-rbac-view-connect-cluster-role.png) - Click the relevant tab to navigate to the appropriate resource scope page for a cluster, such as: - Consumer **Group**, **Topic**, or **Transactional ID** tab for a Kafka cluster. - **Subject** tab for a Schema Registry cluster. - **Connector** tab for a Connect cluster. # Configure Control Center to work with Kafka ACLs on Confluent Platform Before attempting to create and use Access Control Lists (ACLs), you should familiarize yourself with [ACL concepts](/platform/current/security/authorization/acls/overview.html#acl-concepts). Doing so can help you avoid common pitfalls that can occur when creating and using ACLs to manage access to components and cluster data. Standard Apache Kafka® authorization and encryption options are available for [control center](../installation/configuration.md#kafka-encryption-authentication-authorization-settings). 
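As a companion to the ACL concepts referenced above, the following sketch shows what a single ACL binding looks like programmatically, using the confluent-kafka Python `AdminClient`. This is only an illustration of the binding fields (resource, principal, host, operation, permission), not a Control Center-specific requirement, and the broker address, principal, and topic name are placeholders.

```python
# Illustration only: create one ALLOW/Read ACL for a placeholder principal on a
# placeholder topic with the confluent-kafka Python AdminClient.
from confluent_kafka.admin import (
    AdminClient, AclBinding, AclOperation, AclPermissionType,
    ResourcePatternType, ResourceType,
)

admin = AdminClient({"bootstrap.servers": "localhost:9092"})  # placeholder broker

binding = AclBinding(
    ResourceType.TOPIC, "example-topic",          # resource type and name (placeholders)
    ResourcePatternType.LITERAL,                  # match this topic name exactly
    "User:example-user", "*",                     # principal and host (placeholders)
    AclOperation.READ, AclPermissionType.ALLOW,   # what the principal is allowed to do
)

# create_acls() returns one future per binding; result() raises if creation failed.
for acl, future in admin.create_acls([binding]).items():
    future.result()
    print(f"Created ACL: {acl}")
```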
### Asynchronous writes

With librdkafka, you first need to create a `rd_kafka_topic_t` handle for the topic you want to write to. Then you can use `rd_kafka_produce` to send messages to it. For example:

```c
rd_kafka_topic_t *rkt = rd_kafka_topic_new(rk, topic, topic_conf);

if (rd_kafka_produce(rkt, RD_KAFKA_PARTITION_UA,
                     RD_KAFKA_MSG_F_COPY,
                     payload, payload_len,
                     key, key_len,
                     NULL) == -1) {
    fprintf(stderr, "%% Failed to produce to topic %s: %s\n",
            topic, rd_kafka_err2str(rd_kafka_errno2err(errno)));
}
```

You can pass topic-specific configuration in the third argument to `rd_kafka_topic_new`. The previous example passed `topic_conf`, which was seeded with a configuration for acknowledgments. Passing `NULL` causes the producer to use the default configuration.

The second argument to `rd_kafka_produce` can be used to set the desired partition for the message. If set to `RD_KAFKA_PARTITION_UA`, as in this case, librdkafka uses the default partitioner to select the partition for this message. The third argument indicates that librdkafka should copy the payload and key, which lets you free them after the call returns.

If you want to invoke some code after the write has completed, you have to configure it on initialization:

```c
static void on_delivery(rd_kafka_t *rk,
                        const rd_kafka_message_t *rkmessage,
                        void *opaque) {
    if (rkmessage->err)
        fprintf(stderr, "%% Message delivery failed: %s\n",
                rd_kafka_message_errstr(rkmessage));
}

void init_rd_kafka() {
    rd_kafka_conf_t *conf = rd_kafka_conf_new();
    rd_kafka_conf_set_dr_msg_cb(conf, on_delivery);

    // initialization omitted
}
```

The delivery callback in librdkafka is invoked in the user’s thread by calling `rd_kafka_poll`. A common pattern is to call this function after every call to the produce API, but this may not be sufficient to ensure regular delivery reports if the message produce rate is not steady. However, this API does not provide a direct way to block for the result of any particular message delivery. If you need to do this, then see the synchronous write example below.

#### Asynchronous Commits

```python
def consume_loop(consumer, topics):
    try:
        consumer.subscribe(topics)

        msg_count = 0
        while running:
            msg = consumer.poll(timeout=1.0)
            if msg is None: continue

            if msg.error():
                if msg.error().code() == KafkaError._PARTITION_EOF:
                    # End of partition event
                    sys.stderr.write('%% %s [%d] reached end at offset %d\n' %
                                     (msg.topic(), msg.partition(), msg.offset()))
                elif msg.error():
                    raise KafkaException(msg.error())
            else:
                msg_process(msg)
                msg_count += 1
                if msg_count % MIN_COMMIT_COUNT == 0:
                    consumer.commit(asynchronous=True)
    finally:
        # Close down consumer to commit final offsets.
        consumer.close()
```

In this example, the consumer sends the request and returns immediately by using asynchronous commits. The `asynchronous` parameter to `commit()` is changed to `True`. The value is passed in explicitly, but asynchronous commits are the default if the parameter is not included. The API gives you a callback which is invoked when the commit either succeeds or fails. The commit callback can be any callable and can be passed as a configuration parameter to the consumer constructor.
```python from confluent_kafka import Consumer def commit_completed(err, partitions): if err: print(str(err)) else: print("Committed partition offsets: " + str(partitions)) conf = {'bootstrap.servers': "host1:9092,host2:9092", 'group.id': "foo", 'default.topic.config': {'auto.offset.reset': 'smallest'}, 'on_commit': commit_completed} consumer = Consumer(conf) ``` ### Override Default Configuration Properties You can override the replication factor using `confluent.topic.replication.factor`. For example, when using a Kafka cluster as a destination with less than three brokers (for development and testing) you should set the `confluent.topic.replication.factor` property to `1`. You can override producer-specific properties by using the `producer.override.*` prefix (for source connectors) and consumer-specific properties by using the `consumer.override.*` prefix (for sink connectors). You can use the defaults or customize the other properties as well. For example, the `confluent.topic.client.id` property defaults to the name of the connector with `-licensing` suffix. You can specify the configuration settings for brokers that require SSL or SASL for client connections using this prefix. You cannot override the cleanup policy of a topic because the topic always has a single partition and is compacted. Also, do not specify serializers and deserializers using this prefix; they are ignored if added. ## Distributed This configuration is used typically along with [distributed mode](/platform/current/connect/concepts.html#distributed-workers). 1. Create a file named `connector.json` using the following JSON configuration example: ```bash { "name": "connector1", "config": { "connector.class": "io.confluent.connect.activemq.ActiveMQSourceConnector", "kafka.topic":"MyKafkaTopicName", "activemq.url":"tcp://localhost:61616", "jms.destination.name":"testing", "confluent.license":"", "confluent.topic.bootstrap.servers":"localhost:9092" } } ``` You can change the `confluent.topic.*` properties to fit your specific environment. If running on a single-node Kafka cluster, you must include the following: `"confluent.topic.replication.factor":"1"`. Leave the `confluent.license` property blank for a 30-day trial. For more details, see the [configuration options](source_connector_config.md#activemq-source-connector-license-config). To explore other options when connecting to ActiveMQ, see the [Configuration Reference for ActiveMQ Source Connector for Confluent Platform](source_connector_config.md#activemq-source-connector-config) page. For details about the ActiveMQ URL parameters, see the [Apache ActiveMQ](https://activemq.apache.org/connection-configuration-uri.html) documentation. 2. Use `curl` to post the configuration to one of the Kafka Connect Workers. Change `http://localhost:8083/` the endpoint of one of your Kafka Connect worker(s). ```bash curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors ``` ### Source Connector Configuration Start the services using the Confluent CLI: ```bash confluent local start ``` Create a configuration file named aws-cloudwatch-logs-source-config.json with the following contents. 
```text
{
  "name": "aws-cloudwatch-logs-source",
  "config": {
    "connector.class": "io.confluent.connect.aws.cloudwatch.logs.AwsCloudWatchSourceConnector",
    "tasks.max": "1",
    "aws.cloudwatch.logs.url": "https://logs.us-east-2.amazonaws.com",
    "aws.cloudwatch.log.group": "my-log-group",
    "aws.cloudwatch.log.streams": "my-log-stream",
    "name": "aws-cloudwatch-logs-source",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.topic.replication.factor": "1"
  }
}
```

The important configuration parameters used here are:

- **aws.cloudwatch.logs.url**: The endpoint URL that the source connector connects to for pulling the specified logs.
- **aws.cloudwatch.log.group**: The AWS CloudWatch log group under which the log streams are contained.
- **aws.cloudwatch.log.streams**: A list of AWS CloudWatch log streams from which the logs are pulled. The default value is to use all log streams from the configured log group.
- **tasks.max**: The maximum number of tasks that should be created for this connector.
- You may pass your [AWS Credentials](https://docs.confluent.io/kafka-connect-kinesis/current/index.html#aws-credentials) to the AWS CloudWatch Logs Connector through your source connector configuration. To pass AWS credentials in the source configuration, set the **aws.access.key.id** and **aws.secret.access.key** parameters:

  ```text
  "aws.access.key.id":
  "aws.secret.access.key":
  ```

Run this command to start the AWS CloudWatch Logs Source connector.

```bash
confluent local load aws-cloudwatch-logs-source --config aws-cloudwatch-logs-source-config.json
```

To check that the connector started successfully, view the Connect worker’s log by running:

```bash
confluent local services connect log
```

Start a Kafka consumer in a separate terminal session to view the data exported by the connector into the Kafka topic.

```text
path/to/confluent/bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic my-log-group.my-log-stream --from-beginning
```

Finally, stop the Confluent services using the command:

```bash
confluent local stop
```

### REST-based example

This configuration is used typically along with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `connector.json`, configure all of the required values, and use the command below to post the configuration to one of the distributed Connect workers. Check here for more information about the Kafka Connect [REST API](/platform/current/connect/references/restapi.html).

```bash
{
  "name" : "aws-cloudwatch-logs-source-connector",
  "config" : {
    "name" : "aws-cloudwatch-logs-source-connector",
    "connector.class" : "io.confluent.connect.aws.cloudwatch.logs.AwsCloudWatchSourceConnector",
    "tasks.max" : "1",
    "aws.access.key.id" : "< Optional Configuration >",
    "aws.secret.access.key" : "< Optional Configuration >",
    "aws.cloudwatch.log.group" : "< Required Configuration >",
    "aws.cloudwatch.log.streams" : "< Optional Configuration - defaults to all log streams in the log group >"
  }
}
```

Use curl to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect workers.
```bash curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors ``` ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json \ http://localhost:8083/connectors/aws-cloudwatch-logs-source-connector/config ``` ### REST-based example This configuration is used typically along with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `connector.json`, configure all of the required values, and use the command below to post the configuration to one the distributed connect workers. Check here for more information about the Kafka Connect [Kafka Connect REST Interface](/platform/current/connect/references/restapi.html). ```bash { "name" : "aws-cloudwatch-metrics-sink-connector", "config" : { "name": "aws-cloudwatch-metrics-sink", "connector.class": "io.confluent.connect.aws.cloudwatch.metrics.AwsCloudWatchMetricsSinkConnector", "tasks.max": "1", "aws.cloudwatch.metrics.url": "https://monitoring.us-east-2.amazonaws.com", "aws.cloudwatch.metrics.namespace": "service-namespace", "behavior.on.malformed.metric": "fail", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1" } } ``` Use curl to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect workers. ```bash curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors ``` ```bash curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json \ http://localhost:8083/connectors/aws-cloudwatch-metrics-sink-connector/config ``` ### REST-based example This configuration is typically used with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `connector.json`, configure all of the required values. Use the command below to post the configuration to one of the distributed Kafka Connect worker(s). See Kafka Connect [REST API](/platform/current/connect/references/restapi.html) for more information. ```bash { "name": "LambdaSinkConnector", "config" : { "connector.class" : "io.confluent.connect.aws.lambda.AwsLambdaSinkConnector", "tasks.max" : "1", "topics" : "< Required Configuration >", "aws.lambda.function.name" : "< Required Configuration >", "aws.lambda.invocation.type" : "sync", "aws.lambda.batch.size" : "50", "behavior.on.error" : "fail", "confluent.topic.bootstrap.servers" : "localhost:9092", "confluent.topic.replication.factor" : "1" } } ``` Use curl to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` the endpoint of one of your Kafka Connect workers. ```bash curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors ``` For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter). ## REST-based Example This configuration is used typically along with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `connector.json`, configure all of the required values, and use the command below to post the configuration to one the distributed connect workers. Refer to [REST API](/platform/current/connect/references/restapi.html) for more information about the Kafka Connect. 
**Connect Distributed REST example:**

```json
{
  "name": "EventHubsSourceConnector1",
  "config": {
    "confluent.topic.bootstrap.servers": "< Required Configuration >",
    "connector.class": "io.confluent.connect.azure.eventhubs.EventHubsSourceConnector",
    "kafka.topic": "< Required Configuration >",
    "tasks.max": "1",
    "max.events": "< Optional Configuration >",
    "azure.eventhubs.sas.keyname": "< Required Configuration >",
    "azure.eventhubs.sas.key": "< Required Configuration >",
    "azure.eventhubs.namespace": "< Required Configuration >",
    "azure.eventhubs.hub.name": "< Required Configuration >"
  }
}
```

Use curl to post the configuration to one of the Kafka Connect workers. Change http://localhost:8083/ to the endpoint of one of your Kafka Connect workers.

**Create a new connector:**

```bash
curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors
```

**Update an existing connector:**

```bash
curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors/EventHubsSourceConnector1/config
```

#### NOTE
Provide `datadog.api.key`, `datadog.domain`, and `behavior.on.error`, and then start the Datadog metrics connector by loading its configuration with the following command.

```bash
confluent local load datadog-metrics-sink --config datadog-metrics-sink-connector.properties
{
  "name": "datadog-metrics-sink",
  "config": {
    "connector.class": "io.confluent.connect.datadog.metrics.DatadogMetricsSinkConnector",
    "tasks.max": "1",
    "topics": "datadog-metrics-topic",
    "datadog.api.key": "< your-api-key >",
    "datadog.domain": "COM",
    "behavior.on.error": "fail",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://localhost:8081",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.topic.replication.factor": "1",
    "reporter.bootstrap.servers": "localhost:9092"
  },
  "tasks": []
}
```

#### NOTE
Change the `confluent.topic.bootstrap.servers` property to include your broker address(es) and change the `confluent.topic.replication.factor` to `3` for staging or production use.

Use curl to post a configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect workers.

```bash
curl -sS -X POST -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors
```

Use the following command to update the configuration of an existing connector.

```bash
curl -s -X PUT -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors/FirebaseSinkConnector/config
```

Confirm that the connector is in a `RUNNING` state by running the following command:

```bash
curl http://localhost:8083/connectors/FirebaseSinkConnector/status | jq
```

The output should resemble:

```bash
{
  "name":"FirebaseSinkConnector",
  "connector":{
    "state":"RUNNING",
    "worker_id":"127.0.1.1:8083"
  },
  "tasks":[
    {
      "id":0,
      "state":"RUNNING",
      "worker_id":"127.0.1.1:8083"
    }
  ],
  "type":"sink"
}
```

When you query the `/connectors/FirebaseSinkConnector/status` endpoint, the state of the connector and its tasks should be `RUNNING`. To produce Avro data to the Kafka topic `artists`, use the following command.
```bash ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic artists \ --property parse.key=true \ --property key.schema='{"type":"string"}' \ --property "key.separator=:" \ --property value.schema='{"type":"record","name":"artists","fields":[{"name":"name","type":"string"},{"name":"genre","type":"string"}]}' ``` While the console is waiting for the input, use the following three records and paste each of them on the console. ```bash "artistId1":{"name":"Michael Jackson","genre":"Pop"} "artistId2":{"name":"Bob Dylan","genre":"American folk"} "artistId3":{"name":"Freddie Mercury","genre":"Rock"} ``` To produce Avro data to Kafka topic: `songs`, use the following command. ```bash ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic songs \ --property parse.key=true \ --property key.schema='{"type":"string"}' \ --property "key.separator=:" \ --property value.schema='{"type":"record","name":"songs","fields":[{"name":"title","type":"string"},{"name":"artist","type":"string"}]}' ``` While the console is waiting for the input, paste the following three records on the Firebase console. ```bash "songId1":{"title":"billie jean","artist":"Michael Jackson"} "songId2":{"title":"hurricane","artist":"Bob Dylan"} "songId3":{"title":"bohemian rhapsody","artist":"Freddie Mercury"} ``` Finally, check the Firebase console to ensure that the collections named `artists` and `songs` were created and the records are in the format defined in the [Firebase database structure](#firebase-data-format). ### REST-based example Use this setting with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `config.json`, configure all of the required values, and use the following command to post the configuration to one of the distributed connect workers. For more information about the Kafka Connect REST API, see [this documentation](/platform/current/connect/references/restapi.html). ```json { "name" : "FirebaseSourceConnector", "config" : { "connector.class" : "io.confluent.connect.firebase.FirebaseSourceConnector", "tasks.max" : "1", "gcp.firebase.credentials.path" : "file-path-to-your-gcp-service-account-json-file", "gcp.firebase.database.reference": "https://.firebaseio.com", "gcp.firebase.snapshot" : "true", "confluent.topic.bootstrap.servers": "localhost:9092", "confluent.topic.replication.factor": "1", "confluent.license": " Omit to enable trial mode " } } ``` #### NOTE For staging or production use: - Change the `confluent.topic.bootstrap.servers` property to include your broker address(es). - Change the `confluent.topic.replication.factor` to `3` for staging or production use. - Change `http://localhost:8083/` to the endpoint of one of your Connect worker(s). Use curl to post a configuration to one of the Connect workers. 
```bash curl -sS -X POST -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors ``` Confirm that the connector is in a `RUNNING` state by running the following command: ```bash curl http://localhost:8083/connectors/MyGithubConnector/status ``` The output should resemble the example below: ```bash { "name":"MyGithubConnector", "connector":{ "state":"RUNNING", "worker_id":"127.0.1.1:8083" }, "tasks":[ { "id":0, "state":"RUNNING", "worker_id":"127.0.1.1:8083" } ], "type":"source" } ``` Enter the following command to consume records written by the connector to the Kafka topic: ```bash ./kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic github-stargazers --from-beginning ``` The output should resemble the example below: ```bash { "type": { "string": "STARGAZERS" }, "createdAt": null, "data": { "data": { "login": { "string": "User.Name" }, "id": { "int": 1234 }, "node_id": { "string": "MDQ6VXNlcjM0OTE3MTE=" }, "avatar_url": { "string": "https://avatars2.githubusercontent.com/u/1234?v=4" }, "gravatar_id": { "string": "" }, "url": { "string": "https://api.github.com/users/User.Name" }, "html_url": { "string": "https://github.com/User.Name" }, "followers_url": { "string": "https://api.github.com/users/User.Name/followers" }, "following_url": { "string": "https://api.github.com/users/User.Name/following{/other_user}" }, "gists_url": { "string": "https://api.github.com/users/User.Name/gists{/gist_id}" }, "starred_url": { "string": "https://api.github.com/users/User.Name/starred{/owner}{/repo}" }, "subscriptions_url": { "string": "https://api.github.com/users/User.Name/subscriptions" }, "organizations_url": { "string": "https://api.github.com/users/User.Name/orgs" }, "repos_url": { "string": "https://api.github.com/users/User.Name/repos" }, "events_url": { "string": "https://api.github.com/users/User.Name/events{/privacy}" }, "received_events_url": { "string": "https://api.github.com/users/User.Name/received_events" }, "type": { "string": "User" }, "site_admin": { "boolean": false } } }, "id": { "string": "1234" } } ``` ### Template parameters The HTTP Sink connector forwards the message (record) value to the HTTP API. You can add parameters to have the connector construct a unique HTTP API URL containing the record key and topic name. For example, you enter `http://eshost1:9200/api/messages/${topic}/${key}` to have the HTTP API URL contain the topic name and record key. In addition to the `${topic}` and `${key}` parameters, you can also refer to fields from the Kafka record. As shown in the following example, you may want the connector to construct a URL that uses the Order ID and Customer ID. 
The following example shows the Avro format the producer uses to generate records in the Apache Kafka® topic `order`: ```json { "name": "MyClass", "type": "record", "namespace": "com.acme.avro", "fields": [ { "name": "customerId", "type": "int" }, { "name": "order", "type": { "name": "order", "type": "record", "fields": [ { "name": "id", "type": "int" }, { "name": "amount", "type": "int" } ] } } ] } ``` To send the Order ID and Customer ID, you would use the following URL in the HTTP API URL (`http.api.url`) configuration property: ```properties "http.api.url" : "http://eshost1:9200/api/messages/order/${order.id}/customer/${customerId}/" ``` Assuming the data in the Kafka topic contains the following values: ```json { "customerId": 123, "order": { "id": 1, "amount": 12345 } } ``` The connector constructs the following URL: ```bash http://eshost1:9200/api/messages/order/1/customer/123/ ``` ## Distributed This configuration is used typically along with [distributed mode](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to connector.json, configure all of the required values, and use the command below to post the configuration to one the distributed connect worker(s). ```bash { "name": "connector1", "config": { "connector.class": "io.confluent.connect.ibm.mq.IbmMQSourceConnector", "kafka.topic":"MyKafkaTopicName", "mq.hostname":"localhost", "mq.transport.type":"client", "mq.queue.manager":"QMA", "mq.channel":"SYSTEM.DEF.SVRCONN", "jms.destination.name":"testing", "confluent.license":"", "confluent.topic.bootstrap.servers":"localhost:9092" } } ``` Change the `confluent.topic.*` properties as required to suit your environment. If running on a single-node Kafka cluster you will need to include `"confluent.topic.replication.factor":"1"`. Leave the `confluent.license` property blank for a 30 day trial. See the [configuration options](source_connector_config.md#ibmmq-source-connector-license-config) for more details. Use curl to post the configuration to one of the Kafka Connect Workers. Change `http://localhost:8083/` the endpoint of one of your Kafka Connect worker(s). ```bash curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors ``` #### Exactly once delivery The connector supports exactly once semantics when the following conditions are met: - All the connect workers in the cluster have the [exactly.once.support](https://docs.confluent.io/platform/current/installation/configuration/connect/index.html#exactly-once-source-support) property set to `enabled`. For more information, see [exactly once source worker](https://kafka.apache.org/documentation.html#connect_exactlyoncesource) . - The connect worker is running in a distributed mode. Exactly once delivery cannot be supported in standalone mode. - The connect worker principal should have the required ACLs. For more information on the required ACLs, see [ACLs for exactly once source](https://kafka.apache.org/documentation.html#connect_exactlyonce) . - The connector is configured with the `state.topic.name` property. When these conditions are met, the connector processes each record exactly once, even through failures or restarts. It uses the state topic to track progress of the records it has processed, allowing it to resume from the last processed record in case of a failure. You must set the state topic only when you first create the connector. Changing the topic name after the connector creation can result in duplicates. 
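Putting the conditions above together, the following hedged sketch shows one way such a connector might be submitted to a distributed Connect cluster with the state topic set at creation time. It assumes the Python `requests` library, a worker already running in distributed mode with exactly-once source support enabled, and placeholder values for the MQ and topic settings (reusing the property names from the distributed example earlier in this section).

```python
# Hedged sketch: create an IBM MQ source connector with a state topic for
# exactly-once delivery. All connection values are placeholders; the Connect
# worker must run in distributed mode with exactly-once source support enabled.
import json
import requests

connector = {
    "name": "ibmmq-source-eos",
    "config": {
        "connector.class": "io.confluent.connect.ibm.mq.IbmMQSourceConnector",
        "tasks.max": "1",                              # only a single task/consumer is supported
        "kafka.topic": "MyKafkaTopicName",
        "mq.hostname": "localhost",
        "mq.transport.type": "client",
        "mq.queue.manager": "QMA",
        "mq.channel": "SYSTEM.DEF.SVRCONN",
        "jms.destination.name": "testing",
        "state.topic.name": "ibmmq-source-eos-state",  # set only when the connector is first created
        "confluent.topic.bootstrap.servers": "localhost:9092",
    },
}

# Post to a Connect worker; change the URL to one of your worker endpoints.
response = requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
response.raise_for_status()
print(response.json())
```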
For exactly-once semantics, the connector requires only one consumer of the MQ destination. Hence, it does not support more than one task or receiver thread. The connector uses a transactional producer for writing records to the Kafka topic, guaranteeing exactly-once delivery. Any Kafka consumer reading from the topic must also set the [isolation.level](https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html#isolation-level) property to `read_committed`.

#### NOTE

Change the `confluent.topic.bootstrap.servers` property to include your broker address(es), and change the `confluent.topic.replication.factor` to `3` for staging or production use.

Use curl to post a configuration to one of the Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Connect workers.

```bash
curl -sS -X POST -H 'Content-Type: application/json' --data @config.json http://localhost:8083/connectors
```

Enter the following command to confirm that the connector is in a `RUNNING` state:

```bash
curl http://localhost:8083/connectors/MyJiraConnector/status
```

The output should resemble the example below:

```json
{
  "name": "MyJiraConnector",
  "connector": {
    "state": "RUNNING",
    "worker_id": "127.0.1.1:8083"
  },
  "tasks": [
    {
      "id": 0,
      "state": "RUNNING",
      "worker_id": "127.0.1.1:8083"
    }
  ],
  "type": "source"
}
```

Enter the following command to consume records written by the connector to the Kafka topic:

```bash
./bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic jira-topic-roles --from-beginning
```

The output should resemble the example below:

```json
{
  "type": "roles",
  "data": {
    "self": "/rest/api/2/role/10100",
    "name": "Project_Name",
    "id": 10111,
    "description": "A test role added to the project",
    "scope": null,
    "actors": {
      "array": [
        {
          "id": 10012,
          "displayName": "Jira_Actor_Name",
          "type": "user-role-actor",
          "actorUser": {
            "accountId": "101"
          }
        }
      ]
    }
  }
}
```

#### Override Default Configuration Properties

You can override the replication factor using `confluent.topic.replication.factor`. For example, when using a Kafka cluster as a destination with fewer than three brokers (for development and testing), set the `confluent.topic.replication.factor` property to `1`.

You can override producer-specific properties by using the `producer.override.*` prefix (for source connectors) and consumer-specific properties by using the `consumer.override.*` prefix (for sink connectors).

You can use the defaults or customize the other properties as well. For example, the `confluent.topic.client.id` property defaults to the name of the connector with a `-licensing` suffix. You can specify the configuration settings for brokers that require SSL or SASL for client connections using this prefix. You cannot override the cleanup policy of a topic because the topic always has a single partition and is compacted. Also, do not specify serializers and deserializers using this prefix; they are ignored if added.
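The following sketch shows what such overrides can look like in a connector configuration; the values are illustrative only, and producer overrides are subject to the worker's client configuration override policy:

```properties
# License topic settings for a development cluster with a single broker
confluent.topic.replication.factor=1
# Client settings for brokers that require SSL, passed through the confluent.topic. prefix
confluent.topic.security.protocol=SSL
# Producer override for a source connector (illustrative value)
producer.override.linger.ms=100
```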
## REST-based example

This configuration is typically used with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `connector.json`, configure all of the required values, and use the command below to post the configuration to one of the distributed Connect workers. For more information, see the Kafka Connect [REST API](/platform/current/connect/references/restapi.html).

**Connect Distributed REST example:**

```json
{
  "config": {
    "name": "MapRDBSinkConnector1",
    "connector.class": "io.confluent.connect.mapr.db.MapRDBSinkConnector",
    "tasks.max": "1",
    "mapr.table.map.": ""
  }
}
```

Use curl to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect workers.

**Create a new connector:**

```bash
curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors
```

**Update an existing connector:**

```bash
curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors/MapRDBSinkConnector1/config
```

### REST-based example

This configuration is typically used with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `connector.json`, configure all of the required values, and use the command below to post the configuration to one of the distributed Connect workers. For more information, see the Kafka Connect [REST API](/platform/current/connect/references/restapi.html).

**Connect Distributed REST example:**

```json
{
  "config": {
    "name": "MqttSinkConnector1",
    "connector.class": "io.confluent.connect.mqtt.MqttSinkConnector",
    "tasks.max": "1",
    "topics": "< Required Configuration >",
    "mqtt.server.uri": "< Required Configuration >"
  }
}
```

Use `curl` to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect workers.

- **Create a new connector**:

```bash
curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors
```

- **Update an existing connector**:

```bash
curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors/MqttSinkConnector1/config
```

### REST-based example

This configuration is typically used with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `connector.json`, configure all of the required values, and use the command below to post the configuration to one of the distributed Connect workers. For more information, see the Kafka Connect [REST API](/platform/current/connect/references/restapi.html).

**Connect Distributed REST example:**

```json
{
  "config": {
    "name": "MqttSourceConnector1",
    "connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
    "tasks.max": "1",
    "mqtt.server.uri": "< Required Configuration >",
    "mqtt.topics": "< Required Configuration >"
  }
}
```

Use curl to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect workers.

- **Create a new connector**:

```bash
curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors
```

- **Update an existing connector**:

```bash
curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors/MqttSourceConnector1/config
```
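After posting the configuration, you can optionally confirm that the connector started by querying its status through the same REST API (shown here for the connector name used above):

```bash
curl -s http://localhost:8083/connectors/MqttSourceConnector1/status
```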
### REST-based example

This configuration is typically used for [distributed workers](/platform/current/connect/concepts.html#distributed-workers). See the Kafka Connect [REST API](/platform/current/connect/references/restapi.html) for details.

1. Write the following JSON sample code to `connector.json` and set all of the required parameters.

```json
{
  "name": "NetezzaSinkConnector",
  "config": {
    "connector.class": "io.confluent.connect.netezza.NetezzaSinkConnector",
    "tasks.max": "1",
    "topics": "orders",
    "connection.host": "192.168.24.74",
    "connection.port": "5480",
    "connection.database": "SYSTEM",
    "connection.user": "admin",
    "connection.password": "password",
    "batch.size": "10000",
    "auto.create": "true"
  }
}
```

2. Use curl to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect workers.

```bash
curl -s -X POST -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors
```

```bash
curl -s -X PUT -H 'Content-Type: application/json' --data @connector.json http://localhost:8083/connectors/NetezzaSinkConnector/config
```

3. To verify the data in Netezza, log in to Netezza and connect to the Netezza database with the following command:

```bash
[nz@netezza ~]$nzsql
```

4. Run the following SQL query to verify the records:

```bash
SYSTEM.ADMIN(ADMIN)=> select * from orders;
foo|50.0|100|999
```

### The connector fails with “Redo Log consumer failed to subscribe…”

This may occur if the connector can’t read from the redo log topic because security is configured on the Kafka cluster. When security is enabled on a Kafka cluster, you must configure the `redo.log.consumer.*` properties accordingly. For example, for an SSL-secured (non-Confluent Cloud) cluster, you can configure the following properties:

```json
"redo.log.consumer.security.protocol": "SSL",
"redo.log.consumer.ssl.truststore.location": "",
"redo.log.consumer.ssl.truststore.password": "",
"redo.log.consumer.ssl.keystore.location": "",
"redo.log.consumer.ssl.keystore.password": "",
"redo.log.consumer.ssl.key.password": "",
"redo.log.consumer.ssl.truststore.type": "",
"redo.log.consumer.ssl.keystore.type": "",
```

If you configure the connector to send data to a Confluent Cloud cluster, you can configure the following properties:

```json
"redo.log.consumer.bootstrap.servers": "XXXXXXXX",
"redo.log.consumer.security.protocol": "SASL_SSL",
"redo.log.consumer.ssl.endpoint.identification.algorithm": "https",
"redo.log.consumer.sasl.mechanism": "PLAIN",
"redo.log.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username='XXXXXXXXX' password='XXXXXXXXXX';"
```

### REST-based example

This configuration is typically used with [distributed workers](/platform/current/connect/concepts.html#distributed-workers). Write the following JSON to `kafka-connect-redis.json`, configure all of the required values, and use the command below to post the configuration to one of the distributed Connect workers. For more information, see the Kafka Connect [REST API](/platform/current/connect/references/restapi.html).

```json
{
  "name": "kafka-connect-redis",
  "config": {
    "name": "kafka-connect-redis",
    "connector.class": "com.github.jcustenborder.kafka.connect.redis.RedisSinkConnector",
    "topics": "users",
    "tasks.max": "1",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter"
  }
}
```

Use curl to post the configuration to one of the Kafka Connect workers. Change `http://localhost:8083/` to the endpoint of one of your Kafka Connect workers.
```bash
curl -s -X POST -H 'Content-Type: application/json' --data @kafka-connect-redis.json http://localhost:8083/connectors
```

## Write JSON message values into ServiceNow

The example settings file is shown below.

1. Create a `servicenow-sink-json.json` file with the following contents.

#### NOTE

All user-defined tables in ServiceNow start with `u_`.

```bash
// substitute <> with your config
{
  "name": "ServiceNowSinkJSONConnector",
  "config": {
    "connector.class": "io.confluent.connect.servicenow.ServiceNowSinkConnector",
    "topics": "test_table_json",
    "servicenow.url": "https://.service-now.com/",
    "tasks.max": "1",
    "servicenow.table": "u_test_table",
    "servicenow.user": "",
    "servicenow.password": "",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "true",
    "confluent.topic.bootstrap.servers": "localhost:9092",
    "confluent.license": "", // leave it empty for evaluation license
    "confluent.topic.replication.factor": "1",
    "reporter.bootstrap.servers": "localhost:9092",
    "reporter.error.topic.name": "test-error",
    "reporter.error.topic.replication.factor": 1,
    "reporter.error.topic.key.format": "string",
    "reporter.error.topic.value.format": "string",
    "reporter.result.topic.name": "test-result",
    "reporter.result.topic.key.format": "string",
    "reporter.result.topic.value.format": "string",
    "reporter.result.topic.replication.factor": 1
  }
}
```

#### NOTE

For details about using this connector with Kafka Connect Reporter, see [Connect Reporter](/kafka-connectors/self-managed/userguide.html#userguide-connect-reporter).

2. Load the ServiceNow Sink connector by posting the configuration to the Connect REST server.

```bash
confluent local load ServiceNowSinkJSONConnector --config servicenow-sink-json.json
```

3. Confirm that the connector is in a `RUNNING` state.

```bash
confluent local status ServiceNowSinkJSONConnector
```

4. To produce some records into the `test_table_json` topic, first start a Kafka producer.

#### NOTE

All user-defined columns in ServiceNow start with `u_`.

```bash
kafka-console-producer \
  --broker-list localhost:9092 \
  --topic test_table_json
```

5. The console producer is now waiting for input, so you can go ahead and insert some records into the topic.

```json
{"schema": {"type": "struct", "fields": [{"type": "string", "optional": false, "field": "u_name"},{"type": "float", "optional": false, "field": "u_price"}, {"type": "int64","optional":false,"field": "u_quantity"}],"optional": false,"name": "products"}, "payload": {"u_name": "laptop", "u_price": 999.50, "u_quantity": 3}}
{"schema": {"type": "struct", "fields": [{"type": "string", "optional": false, "field": "u_name"},{"type": "float", "optional": false, "field": "u_price"}, {"type": "int64","optional":false,"field": "u_quantity"}],"optional": false,"name": "products"}, "payload": {"u_name": "pencil", "u_price": 0.99, "u_quantity": 10}}
{"schema": {"type": "struct", "fields": [{"type": "string", "optional": false, "field": "u_name"},{"type": "float", "optional": false, "field": "u_price"}, {"type": "int64","optional":false,"field": "u_quantity"}],"optional": false,"name": "products"}, "payload": {"u_name": "pen", "u_price": 1.99, "u_quantity": 5}}
```
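As an optional check that is not part of the original steps, you can consume the reporter result topic configured above to see the connector's write results; this assumes the reporter topics exist and use the string formats shown in the example configuration:

```bash
kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic test-result \
  --from-beginning
```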
## Override Default Configuration Properties

You can override the replication factor using `confluent.topic.replication.factor`. For example, when using a Kafka cluster as a destination with fewer than three brokers (for development and testing), set the `confluent.topic.replication.factor` property to `1`.

You can override producer-specific properties by using the `producer.override.*` prefix (for source connectors) and consumer-specific properties by using the `consumer.override.*` prefix (for sink connectors).

You can use the defaults or customize the other properties as well. For example, the `confluent.topic.client.id` property defaults to the name of the connector with a `-licensing` suffix. You can specify the configuration settings for brokers that require SSL or SASL for client connections using this prefix. You cannot override the cleanup policy of a topic because the topic always has a single partition and is compacted. Also, do not specify serializers and deserializers using this prefix; they are ignored if added.

### CSV with Schema Example

This example reads CSV files and writes them to Kafka. It parses them using the schema specified in `key.schema` and `value.schema`.

1. Create a data directory and generate test data.

```bash
curl "https://api.mockaroo.com/api/58605010?count=1000&key=25fd9c80" > "data/csv-spooldir-source.csv"
```

2. Create a `spooldir.properties` file with the following contents:

```properties
name=CsvSchemaSpoolDir
tasks.max=1
connector.class=com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector
input.path=/path/to/data
input.file.pattern=csv-spooldir-source.csv
error.path=/path/to/error
finished.path=/path/to/finished
halt.on.error=false
topic=spooldir-testing-topic
csv.first.row.as.header=true
key.schema={\n \"name\" : \"com.example.users.UserKey\",\n \"type\" : \"STRUCT\",\n \"isOptional\" : false,\n \"fieldSchemas\" : {\n \"id\" : {\n \"type\" : \"INT64\",\n \"isOptional\" : false\n }\n }\n}
value.schema={\n \"name\" : \"com.example.users.User\",\n \"type\" : \"STRUCT\",\n \"isOptional\" : false,\n \"fieldSchemas\" : {\n \"id\" : {\n \"type\" : \"INT64\",\n \"isOptional\" : false\n },\n \"first_name\" : {\n \"type\" : \"STRING\",\n \"isOptional\" : true\n },\n \"last_name\" : {\n \"type\" : \"STRING\",\n \"isOptional\" : true\n },\n \"email\" : {\n \"type\" : \"STRING\",\n \"isOptional\" : true\n },\n \"gender\" : {\n \"type\" : \"STRING\",\n \"isOptional\" : true\n },\n \"ip_address\" : {\n \"type\" : \"STRING\",\n \"isOptional\" : true\n },\n \"last_login\" : {\n \"type\" : \"STRING\",\n \"isOptional\" : true\n },\n \"account_balance\" : {\n \"name\" : \"org.apache.kafka.connect.data.Decimal\",\n \"type\" : \"BYTES\",\n \"version\" : 1,\n \"parameters\" : {\n \"scale\" : \"2\"\n },\n \"isOptional\" : true\n },\n \"country\" : {\n \"type\" : \"STRING\",\n \"isOptional\" : true\n },\n \"favorite_color\" : {\n \"type\" : \"STRING\",\n \"isOptional\" : true\n }\n }\n}
```

3. Load the SpoolDir CSV Source connector.

```bash
confluent local load spooldir --config spooldir.properties
```

#### IMPORTANT

Don’t use the [Confluent CLI](https://docs.confluent.io/confluent-cli/current/index.html) in production environments.

4. Validate that messages are sent to Kafka serialized with Avro.

```bash
kafka-avro-console-consumer --topic spooldir-testing-topic --from-beginning --bootstrap-server localhost:9092
```
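For readability, the escaped `key.schema` value in the properties file above corresponds to the following JSON; `value.schema` follows the same pattern, with one entry per CSV column:

```json
{
  "name": "com.example.users.UserKey",
  "type": "STRUCT",
  "isOptional": false,
  "fieldSchemas": {
    "id": {
      "type": "INT64",
      "isOptional": false
    }
  }
}
```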
## JSON Source Connector Example

This example follows the same steps as the Quick Start. Review the Quick Start for help running the Confluent Platform and installing the Spool Dir connectors.

1. Generate a JSON dataset using the command below:

```bash
curl "https://api.mockaroo.com/api/17c84440?count=500&key=25fd9c80" > "json-spooldir-source.json"
```

2. Create a `spooldir.properties` file with the following contents:

```properties
name=JsonSpoolDir
tasks.max=1
connector.class=com.github.jcustenborder.kafka.connect.spooldir.SpoolDirJsonSourceConnector
input.path=/path/to/data
input.file.pattern=json-spooldir-source.json
error.path=/path/to/error
finished.path=/path/to/finished
halt.on.error=false
topic=spooldir-json-topic
```

3. Load the SpoolDir JSON Source connector using the Confluent CLI [confluent local services connect connector load](https://docs.confluent.io/confluent-cli/current/command-reference/local/services/connect/connector/confluent_local_services_connect_connector_load.html) command.

```bash
confluent local load spooldir --config spooldir.properties
```

#### IMPORTANT

Don’t use the [confluent local](https://docs.confluent.io/confluent-cli/current/command-reference/local/index.html) commands in production environments.

### Passwordless OAuth/OIDC authentication with client assertion

Starting with version 8.0, Confluent Platform supports OAuth client assertion, a secure, passwordless approach to credential management. It uses asymmetric encryption-based authentication, extending Confluent Platform OAuth, and allows you to:

* Avoid deploying usernames and passwords while securing Confluent Platform.
* Streamline and automate periodic client credential rotation without manual intervention for client applications.

A client assertion is a JSON Web Token (JWT) that carries identity and security information and is presented as proof of the client’s identity. The following client assertion flows are supported in CFK:

* JSON Web Token (JWT) assertion retrieval from file flow. This flow is not recommended for production use cases; use the local client assertion flow in production instead.
* Local client assertion flow.

In CFK 3.0, OAuth client assertion is supported for the following resources:

* Day 1 components: Kafka, KRaft, MDS, Schema Registry
* Day 2 application resources: KafkaTopic, Kafka REST Class, ConfluentRoleBinding, Schema, SchemaExporter, ClusterLinking

## Customize Confluent Platform pods with Pod Overlay

Confluent for Kubernetes (CFK) supports a subset of the Kubernetes PodTemplateSpec in the CFK API (`spec.podTemplate` in the component custom resource), where you configure the StatefulSet PodTemplate for Confluent Platform components. To set and use additional Kubernetes features that are not supported by the CFK API, you can use the Pod Overlay feature.

Example use cases for the Pod Overlay feature include:

* Deploying a Confluent Platform cluster with a custom init container that runs alongside the CFK init container. In this case, the custom init container runs before the CFK init container.
* Using a newly introduced Kubernetes feature that has not yet been added to the CFK API.

Make sure that you do not have conflicting values between what’s set in the CFK podTemplate API and in Pod Overlay. For example, if you specify `podSecurityContext` in `kafka.spec.podTemplate`, you cannot use Pod Overlay to specify different values in `spec.template.spec.securityContext`, as illustrated in the sketch below.
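The following is a hypothetical sketch of the kind of conflict to avoid; the `runAsUser` values are illustrative only:

```yaml
# In the Kafka component CR (CFK podTemplate API) -- illustrative values
spec:
  podTemplate:
    podSecurityContext:
      runAsUser: 1001
---
# In the Pod Overlay template -- do not also set a conflicting security context here
spec:
  template:
    spec:
      securityContext:
        runAsUser: 2000   # conflicts with the podSecurityContext above; avoid this
```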
To use Pod Overlay:

1. Create a template file (``) with the settings you want to add:

```yaml
spec:
  template:
```

The template file must start with `spec: template:` and must follow the [Kubernetes StatefulSetSpec API](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#statefulsetspec-v1-apps). You can configure fields only inside `spec.template`. Fields specified outside of `spec.template` are considered invalid.

The following example is for a custom init container:

```yaml
spec:
  template:
    spec:
      initContainers:
        - name: busybox
          image: busybox:1.28
          command: ["echo", "I am a custom init-container"]
          imagePullPolicy: IfNotPresent
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
```

Note that when `hostNetwork` is set to `true`, `dnsPolicy` must be set to `ClusterFirstWithHostNet`.

2. Create a ConfigMap (``) using the file created in the previous step (``). You must use `pod-template.yaml` as the key with the `--from-file` option.

```bash
kubectl create configmap --from-file=pod-template.yaml= -n
```

3. Add the `platform.confluent.io/pod-overlay-configmap-name` annotation on the Confluent Platform component resource CR. For example:

```yaml
kind: Kafka
metadata:
  name: kafka
  namespace: operator
  annotations:
    platform.confluent.io/pod-overlay-configmap-name:
```

Following the Kubernetes convention, a ConfigMap can only be referenced by pods residing in the same namespace, so CFK looks for `` within the same namespace as the component CR object.

For configuration examples, see [the tutorial for Pod Overlay](https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/advanced-configuration/pod-overlay).
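As an optional verification that is not part of the original steps, you can inspect the generated StatefulSet to confirm the overlay fields were merged; this assumes the StatefulSet is named after the component CR (`kafka` in the `operator` namespace in the example above):

```bash
# Look for the custom init container in the StatefulSet pod template
kubectl get statefulset kafka -n operator -o yaml | grep -A 5 initContainers
```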