Confluent Platform 5.4.0 Release Notes¶
5.4.0 is a major release of Confluent Platform that provides you with Apache Kafka® 2.4.0, the latest stable version of Kafka.
The technical details of this release are summarized below.
- The Confluent CLI is now included in the full Confluent Platform tarball. Confluent Community software users must download the CLI separately and configure it to point at their installation manually. Updates are disabled for the CLI included in the full Confluent Platform tarball to ensure compatibility.
- Added support for Windows, except for
- Added support for centralized ACLs. For more information, see Kafka Access Control List (ACLs) improvements.
- Added more information when listing role bindings. For more information, see Security.
Confluent Control Center¶
- Cluster overview - You can quickly navigate from the home page to an overview of your cluster and some high-level metrics, including brokers, topics, and connected services. Red and orange labels denote areas of your cluster that may have issues. For more information, see Cluster overview.
- Broker metrics dashboard - A new time-series chart lets you compare metrics for Production, Consumption, Broker uptime, Partition replicas, System, and Disk over a unified time frame. You can rearrange the metrics on this dashboard to put the most important ones first or to place related metrics side by side. For more information, see Brokers metrics page.
- RBAC management - You can now view your own RBAC role assignment and permissions, and manage subordinate role bindings using Control Center. For more information, see Manage RBAC roles with Control Center.
- Replicator monitoring - Previously in Control Center, Replicator monitoring was done from within a Connect cluster. With 5.4.0, you can navigate directly to the Replicators menu to see all replicators and their status. For more information, see Replicators.
- Auto-updates - You can now enable Control Center to automatically install the most current version, including new features, security enhancements, and critical bug fixes. For more information, see Auto-updating the Control Center user interface.
- Topic schema validation - You can now enable key and value schema validation when creating a topic. Under Expert mode, two settings are now visible:
- Topics page enhancements - The All Topics page is updated to include metrics on under-replicated partitions, out-of-sync replicas, out-of-sync observers, and production/consumption throughput for each topic in the cluster.
- User behavior analytics - In an effort to continuously improve our products, user behavior data is now collected by default in Control Center starting in 5.4. You can disable collection of user behavior analytics by setting false in the configuration. For more information, see Control Center Usage Data Collection.
- Known issues
- When RBAC is enabled, front-end updates are not available to Control Center users.
- If RBAC is enabled, Control Center cannot monitor data streams that are running on Confluent Cloud. For more information on how to connect a non-RBAC Control Center to Confluent Cloud, see Connecting Control Center to Confluent Cloud.
- Some connector configurations are not visible in Control Center, for example, the avro.codec when selecting Avro as the output format in the S3 sink connector. You can instead supply these configurations by using the Connect REST API.
Multi-Region Clusters (Production Available)¶
Multi-Region Clusters is now Production Available in CP 5.4.0. This feature allows a single cluster to be stretched across multiple datacenters. It centralizes Disaster Recovery (DR) operations, making DR events easier to manage, monitor, and secure. For more information, see Multi-Region Clusters or check out the example: here.
Tiered Storage (Preview)¶
Tiered Storage is currently in preview and should not be run in production environments because its interfaces are not yet stable.
Tiered Storage is a major evolution of the architecture and capability of Confluent Platform. This feature makes Confluent Platform more elastic, while giving platform operators the ability to cost-effectively increase retention periods, all on a per-topic basis. For more information, see Tiered Storage.
Schema Validation (Production Available)¶
Schema Validation is now certified for Production Environments. Schema Validation gives operators a centralized location to enforce data format correctness at the topic level. To enable schema validation:
confluent.key.schema.validation=true, or both, on the topic.
For more information, see Schema Validation.
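As a rough illustration of the topic-level settings involved, the sketch below assembles the overrides as plain data. The helper names are hypothetical, and the value-side property name `confluent.value.schema.validation` is an assumption that mirrors the key-side property named above; check the Schema Validation documentation before relying on either.

```python
# Minimal sketch: build topic-level config overrides for Schema Validation.
# The helper names are hypothetical and only illustrate the shape of the settings.

KEY_PROP = "confluent.key.schema.validation"      # named in the release notes
VALUE_PROP = "confluent.value.schema.validation"  # assumed value-side counterpart


def schema_validation_overrides(validate_key=False, validate_value=False):
    """Return the topic config overrides that enable the requested validation."""
    overrides = {}
    if validate_key:
        overrides[KEY_PROP] = "true"
    if validate_value:
        overrides[VALUE_PROP] = "true"
    return overrides


def as_config_args(overrides):
    """Render overrides as repeated `--config name=value` style arguments."""
    args = []
    for name, value in sorted(overrides.items()):
        args += ["--config", f"{name}={value}"]
    return args


both = schema_validation_overrides(validate_key=True, validate_value=True)
print(as_config_args(both))
```

The rendered arguments are meant to be appended to whatever topic creation or alteration tooling you already use; the sketch itself does not contact a cluster.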
This version of Confluent Platform incorporates the fix for a long-standing Avro issue that allowed incompatible schemas with records in unions. If a client has published an incompatible schema and transitivity is enabled, that client will be unable to publish new schemas after the platform is upgraded to 5.4.
For more information about Avro and other supported formats in Schema Registry, see Schema Formats, Serializers, and Deserializers.
For current Avro documentation, see https://avro.apache.org/docs/current/.
- Enables configuration of schema validation. See Schema Validation.
- Supports installation using Helm 3 for a more secure workflow compared to Helm 2. See Upgrading Confluent Operator, Confluent Platform, and Helm.
- A Quick Start deployment utility is available. See Confluent Operator Quick Start.
- For more information, see Confluent Operator. For full deployment instructions, see Deploying Confluent Operator and Confluent Platform.
- The minimum supported Kubernetes version is 1.11. Make sure that your underlying Kubernetes environment is upgraded and conforms to the documented supported environments for Confluent Platform 5.4.
- The default ports for external load balancers changed to 443 for HTTPS and 80 for HTTP traffic.
- For information about upgrading to Operator for 5.4, see Upgrading Confluent Platform version 5.3.x to 5.4.
- Due to a known issue in ZooKeeper, single-node ZooKeeper clusters may experience data loss if not upgraded carefully. For more information, see Upgrading Confluent Platform version 5.3.x to 5.4.
- By default, upgrading to Operator 5.4 will automatically perform a rolling update of all Confluent Platform clusters (e.g. ZooKeeper) that are managed by Operator. For more information about how to control when individual clusters get updated, see Upgrading Confluent Platform version 5.3.x to 5.4.
- The default ports for the load balancers for KSQL, Connect, Replicator, Schema Registry, and Confluent Control Center have changed in Operator 5.4. After upgrading to Operator 5.4, the load balancer ports are now the standard HTTPS or HTTP port (443 and 80, respectively). You need to update the clients that access the components through the load balancer to use the new ports. See Upgrading Confluent Platform version 5.3.x to 5.4 for more information on port changes.
- Simplifies troubleshooting for Operator by consolidating two Operator pods (“operator” and “manager”) into a single pod.
- Reduces the total size of Docker images needed to deploy the product.
- Fixes configuration of init-containers deployed by Operator so that they can no longer access data volumes of Kafka and ZooKeeper pods.
For supported Confluent Platform features in this version of Operator, see Interoperability.
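The load balancer port change described above means client endpoints need to be rewritten after the upgrade. The following is a minimal sketch of such a rewrite, assuming clients store the endpoint as a URL. The function name, the example host, and the old port `8083` are hypothetical; only the target ports (443 for HTTPS, 80 for HTTP) come from the notes.

```python
from urllib.parse import urlsplit, urlunsplit

# Target ports stated in the release notes for Operator 5.4 load balancers.
NEW_PORTS = {"https": 443, "http": 80}


def update_load_balancer_url(url):
    """Return `url` with its port replaced by the standard port for its scheme."""
    parts = urlsplit(url)
    if parts.scheme not in NEW_PORTS:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # re-bracket IPv6 literals
    # Preserve any user:password@ prefix that was present in the original URL.
    userinfo, at, _ = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{at}{host}:{NEW_PORTS[parts.scheme]}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Hypothetical pre-upgrade endpoint for a Connect load balancer.
print(update_load_balancer_url("https://connect.example.com:8083/connectors"))
```

Writing the port explicitly, even though it is the scheme default, keeps the rewritten URL easy to compare against the old one during a rollout.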
Role-based access control (RBAC)¶
- RBAC is now Production Available, so you can run this feature in production. In 5.3, RBAC was only available in Preview mode for test and development clusters.
- RBAC provides centralized and platform-wide authorization for all the Confluent Platform components, including Kafka, Connect, Schema Registry, KSQL, and Control Center.
- Added support for public HTTP APIs for viewing and managing role bindings.
- Added Swagger UI tooling for development testing against public RBAC and ACL APIs. For more information, see Confluent Metadata API Reference.
- Added support in Control Center to view and manage role bindings. For more information, see Manage RBAC roles with Control Center.
- RBAC CLI improvements:
- It is now easier to discover cluster ID information.
- When listing role bindings for a user, transitive role bindings for that user are now listed through their group role bindings.
- Privileged users can now view a list of principals (users/groups) who have a role on specific resources.
- For more information, see confluent iam.
Kafka Access Control List (ACLs) improvements¶
- Centralized ACLs that create a consistent user experience for RBAC and ACLs.
- A new option to persist all ACLs centrally on a topic, similar to RBAC metadata. You can continue to use ZooKeeper to store ACLs; however, it is recommended that you migrate to the new centralized ACLs.
- Centralized ACLs can be managed through public APIs or through Confluent CLI.
- For more information, see Authorization using centralized ACLs.
- With this feature, you can capture structured event logs that identify all of the authorization requests across the entire platform. You can monitor access, detect abnormal activity, and meet compliance requirements.
- Audit logs are set using the
- Configuration using API, CLI, and GUI is planned for future releases.
- For more information, see Audit Logs.
FIPS Operational readiness¶
- You can now use FIPS crypto libraries, ciphers, and operating systems.
- FIPS broker connections are enforced. All connections to the broker must use FIPS-compliant ciphers.
- If a client attempts to connect using a non-FIPS-compliant cipher, the broker rejects the connection.
- For more information, see Confluent Platform FIPS operational readiness.
- When basic authentication fails, the user password is printed in the logs.
- RBAC-enabled brokers fail to process the
This version of Confluent Platform includes Kafka 2.4.0. For a full list of Kafka KIPs, features, and bug fixes, see the Kafka release notes.
Kafka C++ Client¶
- KIP-392 - Fetch messages from closest replica / follower
- Transaction aware consumer (
- Consumer property isolation.level=read_committed ensures the consumer will only read messages from successfully committed producer transactions. The default is read_committed. To get the previous behavior, set the property to read_uncommitted, which will read all messages produced to a topic, regardless of whether the message was part of an aborted or not-yet-committed transaction.
- Improved authentication errors (KIP-152).
- Sub-millisecond buffering (linger.ms) on the producer.
- Optimize varint decoding, increasing consume performance by ~15%.
- Upgrade builtin lz4 to 1.9.2 (CVE-2019-17543, #2598).
- Messages were not timed out for leader-less partitions (.NET issue #1027).
- Fix msgq (re)insertion code to avoid O(N^2) insert sort operations on retry (#2508). The msgq insert code now properly handles interleaved and overlapping message range inserts, which may occur during Producer retries for high-throughput applications.
- Rate limit IO-based queue wakeups to linger.ms, which reduces CPU load and lock contention for high-throughput producer applications (#2509).
For the full librdkafka release notes, see the librdkafka GitHub project.
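To make the difference between the two isolation.level values concrete, here is a toy model in Python. It is not librdkafka code: the `Record` type, the `visible_records` function, and the transaction IDs are hypothetical. It assumes a simplified log in which a read_committed consumer skips aborted records and stops at the first record of a still-open transaction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    offset: int
    value: str
    txn_id: Optional[str] = None  # None means a non-transactional message


def visible_records(log, committed, aborted, isolation_level="read_committed"):
    """Toy model of which records a consumer would see for an isolation level."""
    if isolation_level == "read_uncommitted":
        return list(log)  # everything, including aborted and open transactions
    if isolation_level != "read_committed":
        raise ValueError(f"unknown isolation level: {isolation_level!r}")
    out = []
    for rec in log:
        if rec.txn_id is None or rec.txn_id in committed:
            out.append(rec)
        elif rec.txn_id in aborted:
            continue  # aborted transactional messages are filtered out
        else:
            break  # open transaction: assume reading stops here until it resolves
    return out


log = [
    Record(0, "a"),
    Record(1, "b", txn_id="t1"),
    Record(2, "c", txn_id="t2"),
    Record(3, "d", txn_id="t3"),
    Record(4, "e"),
]
seen = visible_records(log, committed={"t1"}, aborted={"t2"})
print([r.value for r in seen])
```

In a real client the behavior is selected purely through configuration, for example by including `"isolation.level": "read_uncommitted"` in the consumer configuration to restore the previous behavior.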
Python, .NET, and Go Clients¶
These clients are built on librdkafka and include all the improvements from its latest version.
- Writes log messages to a file by default in addition to writing them to the console (KIP-521).
- Enables you to change log levels dynamically (KIP-495).
- Adds more metrics for connectors (KIP-475).
- Secures internal Connect REST endpoints (KIP-507).
- Connect Converter implementations can use record headers when (de)serializing record keys and values (KIP-440), although the Avro Converter and JSON Converter were not changed to use headers.
- The JSON Converter supports Decimal type in JSON properly (KIP-481).
- SQL Server’s DateTimeOffset type is handled as a Connect TIMESTAMP logical type.
- Includes the most recent bugfix release of the PostgreSQL JDBC driver (9.4.1212).
- Notable fixes: a source task now fails when it cannot read and convert a column; improved log messages; corrected PostgreSQL upsert statements when writing records to tables with no primary key columns.
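For the dynamic log level change listed earlier (KIP-495), the sketch below builds, but does not send, the kind of HTTP request involved. It assumes the endpoint shape described in KIP-495 (`PUT /admin/loggers/{name}` with a JSON body containing `level`); the helper name, the worker address `http://localhost:8083`, and the logger name are illustrative assumptions.

```python
import json
from urllib.request import Request


def build_log_level_request(base_url, logger_name, level):
    """Build (without sending) a PUT request that sets a Connect logger's level."""
    body = json.dumps({"level": level}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/admin/loggers/{logger_name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical worker address and logger name, for illustration only.
req = build_log_level_request("http://localhost:8083", "org.apache.kafka.connect", "DEBUG")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running worker; that step is left out so the sketch runs without a Connect cluster.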
S3 Sink Connector¶
- Now supports Parquet format, set with
- New configuration behavior.on.null.values specifies how the connector should behave when a record with a null value is consumed by this connector.
- Notable fixes: improved retry logic when failing to close files; improved behavior when evolving schemas to different names.
HDFS v2 Sink Connector¶
- Add support for Connect logical types (time, timestamp, date, decimal) when using Parquet formatted files. When Parquet format is used with Hive, only the decimal LogicalType is supported.
- File caching is disabled to be more compatible with recent HDFS versions.
- Notable fixes: improved retry logic when failing to close files; improved behavior when evolving schemas to different names.
JMS Source Connector¶
- Supports permissive schemas. When use.permissive.schemas=true, the messageID becomes optional in the key and value.
- Notable fixes: Some versions of dependent libraries were changed to avoid versions with known CVEs.
Elasticsearch Sink Connector¶
Notable fixes: Improved INFO and DEBUG logging; improved support for Elasticsearch 7; fixed performance regression related to index checks.
- Greatly improves cluster lifecycle management with new Ansible playbooks for in-place upgrades of Confluent Platform.
- Enables configuration of schema validation.
- Enables FIPS-compliant configuration for Confluent Platform.
- Enables configuration of SASL SCRAM for greater freedom to choose secure authentication configurations for Confluent Platform.
- Simplifies configuration management by making all properties configurable through variable files during initial deployment on a per-component basis.
- Improves monitoring for Confluent Platform by enabling installation of monitoring-interceptor JARs.
- Standardizes logging across components for simpler troubleshooting and auditing.
For supported Confluent Platform features in this version of Ansible Playbooks for Confluent Platform, see Interoperability.
- KIP-213: Support non-key joining in KTables - It’s often the case that two events have a relationship in common besides their partition key. Previously, the Kafka Streams DSL didn’t provide a mechanism to support joining two tables on anything but their key. This KIP adds support for non-key table/table joins, including updates from both sides of the join.
- KIP-307: Allow defining custom processor names with the KStreams DSL - In the past, Kafka Streams would automatically assign names to the processor nodes in a topology. This made complex topologies hard to understand or debug when you examined their contents. This KIP introduces the .withName() method, allowing you to assign a more meaningful name to each node in the topology.
- KIP-470: TopologyTestDriver test input and output usability improvements - The TopologyTestDriver class is a useful utility for testing Kafka Streams applications in a lightweight, deterministic way. Its existing API, however, made it rather verbose to feed it input and draw output from the program being tested. This KIP adds new utilities for interacting with the input and output of topologies being tested in a more concise manner.
- KIP-471: Expose RocksDB Metrics in Kafka Streams - RocksDB, which Kafka Streams leverages internally, has functionality to collect statistics about operations over its running instances. These statistics enable users to find bottlenecks and to accordingly tune RocksDB, which can be critical for operating Kafka Streams in performance-sensitive environments. Previously, Kafka Streams had not surfaced these metrics. This KIP exposes a subset of RocksDB’s statistics in Kafka Streams metrics.
- Connect integration - Create and manage external Connect data sources and sinks entirely from within KSQL. External sources and sinks may be created and configured using the KSQL CREATE … CONNECTOR command. KSQL can integrate with Connect in one of these ways:
- Embedded mode - Connect workers run embedded within the KSQL JVM; no external Connect cluster is required. KSQL will automatically launch embedded Connect workers when configured for embedded mode.
- External mode - KSQL is pointed to an external Connect cluster. KSQL executes all connector management operations against this external cluster.
- Pull queries - Pull queries have been added to support point-in-time lookup queries against materialized views. Only single-row lookups are currently supported but query coverage will expand over time.
- Support for quoted identifiers - Column names containing unparseable characters can now be
used by enclosing them within backticks (
- explode() - User-defined table functions (UDTF) are now supported by KSQL, and explode() has been added as a new built-in UDTF. explode() takes an array as an argument and outputs a row for each element in the array (essentially
- New built-ins - The following built-in functions have been added to KSQL:
- avg(v) - Compute the average of all input values (aggregate)
- exp (v) - Compute the exponential of
- initcap(str) - Capitalize the first letter of each word in the input string
- ln(v) - Compute the natural logarithm of
- replace(col, find, replace) - Replace all occurrences of
- round(v [, scale]) - Round the given value optionally using the given scale (default scale is 0)
- sign(v) - Compute the sign of v (returns 1 for positive, -1 for negative)
- sqrt(v) - Compute the square root of
- KAFKA serde format - The KAFKA serialization format has been added to support Kafka-serialized keys and values. KAFKA may be assigned as a value_format. For more information, see KAFKA.
- Client output formatting - Output formatting improvements have been made to the KSQL CLI, and column names are now included in the output of transient queries.
- Custom type registry - Type declarations can now be saved and reused using the CREATE TYPE command. This is particularly useful for complex, verbose type declarations that are cumbersome to reuse. Type declarations may now be named and referred to as their own types.
- User-specified delimiter characters - You can now specify a custom delimiter when using the
value_format. The delimiter is set with the
- Decimal datatype - A Decimal type has been introduced. Decimals offer significantly increased precision relative to KSQL’s Double type, and are thus suitable for high-precision use cases such as currency.
- Beginning with the next major release of Confluent Platform, Connect connectors will no longer be packaged natively with Confluent Platform. Instead, all connectors must be downloaded directly from Confluent Hub.
- Starting with Confluent Platform 6.0, TLS 1.0 and 1.1 will be disabled and no longer supported in Confluent Platform.
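Returning to the KSQL explode() table function described earlier, its row-per-element behavior can be modeled in a few lines of Python. This is only a sketch of the semantics, not KSQL code: the `explode` helper, the dict-based rows, and the column names are hypothetical, and the handling of empty arrays (no output rows) is an assumption.

```python
def explode(rows, array_column):
    """Yield one output row per element of each row's array column."""
    for row in rows:
        for element in row[array_column]:
            out = {k: v for k, v in row.items() if k != array_column}
            out[array_column] = element  # replace the array with a single element
            yield out


rows = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": []},        # assumed to produce no output rows
    {"id": 3, "tags": ["c"]},
]
for out_row in explode(rows, "tags"):
    print(out_row)
```

One input row can therefore produce zero, one, or many output rows, which is what distinguishes a table function from the scalar built-ins listed above.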
How to Download¶
To upgrade Confluent Platform to a newer version, check the Upgrade Confluent Platform documentation.
Supported Versions and Interoperability¶
For the supported versions and interoperability of Confluent Platform and its components, see Supported Versions and Interoperability.