5.0.0 is a major release of the Confluent Platform that provides Confluent users with Apache Kafka 2.0.0, the latest stable version of Kafka.
Confluent Platform users are encouraged to upgrade to 5.0.0. The technical details of this release are summarized below.
Confluent Control Center
Confluent Control Center introduces several new features, including:
- A new UI for KSQL: Control Center now includes a brand new graphical UI for KSQL. With this new UI, you can create KSQL streams and tables in Control Center, as well as run, view, terminate, and explain queries and statements.
- Broker configuration: Making sure your brokers are set up just the way you want is now easy with Control Center. You can now view broker configurations across multiple Kafka clusters, quickly check specific brokers, and easily spot differences in properties.
- Topic inspection: Control Center now makes it easy to see the actual streaming messages coming out of Kafka topics using the new topic inspection feature.
- Consumer lag: Wondering how your consumers are performing? You can now see whether any consumers are lagging using the new consumer lag feature. You can also set alerts on consumer lag, so you can monitor performance and address issues proactively.
- Schema Registry support: Control Center now supports viewing the schema associated with a topic, along with its version history, and allows comparing older versions with the current one.
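Consumer lag is commonly computed per partition as the difference between the log end offset and the consumer group's committed offset. The following is a minimal Python sketch of that arithmetic only. The function name, the offset maps, and the choice to treat a missing committed offset as 0 are all assumptions made for illustration; this is not a Control Center API.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Return per-partition lag: log end offset minus committed offset.

    Both arguments are hypothetical dicts keyed by (topic, partition).
    Assumption: a partition with no committed offset counts from offset 0.
    """
    lag = {}
    for topic_partition, end_offset in log_end_offsets.items():
        committed = committed_offsets.get(topic_partition, 0)
        lag[topic_partition] = max(end_offset - committed, 0)
    return lag


# Illustrative data: partition 1 has never committed an offset.
end_offsets = {("orders", 0): 120, ("orders", 1): 95}
committed = {("orders", 0): 100}

lags = consumer_lag(end_offsets, committed)
total_lag = sum(lags.values())
```

An alert on consumer lag would then amount to comparing `total_lag`, or an individual partition's lag, against a threshold you choose.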
Replicator already automates the replication of topic messages and related metadata across Kafka clusters. With Confluent Platform 5.0, Replicator and the new consumer timestamps interceptor also ensure that consumer offsets are available on the secondary cluster. When consumer client applications fail over to the secondary cluster, Replicator handles consumer offset translation so that they can resume consuming near the point where they stopped in the original cluster. This minimizes the reprocessing that consumers need to do in a disaster scenario without skipping messages. See the Replicator and Cross-Cluster Failover section for more detail.
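To make the idea of offset translation concrete, here is a toy Python model that maps the timestamp of a consumer's last committed record to an offset on the secondary cluster. It is a sketch under stated assumptions, not Replicator's implementation: the log representation, the function name, and the rule "resume at the first record whose timestamp is not earlier than the committed timestamp" are all hypothetical, and timestamps are assumed to be non-decreasing.

```python
from bisect import bisect_left


def translate_offset(secondary_log, committed_timestamp):
    """Return the offset on the secondary cluster at which to resume.

    secondary_log is a hypothetical list of (offset, timestamp) pairs,
    sorted by offset with non-decreasing timestamps.
    """
    timestamps = [timestamp for _, timestamp in secondary_log]
    index = bisect_left(timestamps, committed_timestamp)
    if index == len(secondary_log):
        # Everything on the secondary is older: resume at the log end.
        return secondary_log[-1][0] + 1 if secondary_log else 0
    return secondary_log[index][0]


# Offsets differ between clusters, so the timestamp is the common reference.
secondary = [(0, 1000), (1, 1010), (2, 1020), (3, 1030)]
resume_at = translate_offset(secondary, 1015)
```

Choosing the earliest record at or after the timestamp errs on the side of reprocessing a few records rather than skipping any, which matches the trade-off described above.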
To facilitate application failover and switch back, Replicator in Confluent Platform 5.0 adds protection against circular replication. If two Replicator instances are configured, one replicating from DC1 to DC2 and the other from DC2 to DC1, Replicator ensures that messages replicated to DC2 are not replicated back to DC1, and vice versa. As a result, Replicator can safely run in both directions.
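One way to picture protection against circular replication is to tag each replicated message with the clusters it has already passed through and to skip any message that has already visited the destination. The Python sketch below models that idea only. The `provenance` field and the `replicate` function are hypothetical names chosen for illustration, and the release notes do not say how Replicator implements the check.

```python
def replicate(messages, source, destination):
    """Copy messages from source to destination, skipping round trips.

    Each message is a dict; "provenance" is a hypothetical list of the
    clusters the message has already been replicated from.
    """
    replicated = []
    for message in messages:
        provenance = message.get("provenance", [])
        if destination in provenance:
            # The message originally came from the destination: skip it.
            continue
        copy = dict(message)
        copy["provenance"] = provenance + [source]
        replicated.append(copy)
    return replicated


dc1_topic = [{"value": "order-1"}]                  # produced in DC1
dc2_topic = replicate(dc1_topic, "DC1", "DC2")      # DC1 -> DC2
sent_back = replicate(dc2_topic, "DC2", "DC1")      # DC2 -> DC1
```

Here `sent_back` is empty because the only message in `dc2_topic` originated in DC1, while a message produced directly in DC2 would still be replicated to DC1.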
MQTT Proxy, a new component in this release, is an MQTT 3.1.1 broker that helps streamline Internet of Things (IoT) architectures by providing a scalable proxy, similar to what Confluent REST Proxy provides for HTTP(S) communication with Kafka. MQTT Proxy enables the replacement of third-party MQTT brokers with Kafka-native MQTT proxies while avoiding intermediate storage and lag. MQTT Proxy uses Transport Layer Security (TLS) encryption and basic authentication. It also supports the widely used MQTT 3.1.1 protocol, so MQTT client messages at all three MQTT quality of service levels are written directly into Kafka topics where KSQL and streaming applications can consume and process them. Future releases will add bidirectional communication and more security options.
Apache Kafka 2.0.0 adds support for prefixed ACLs, simplifying access control management in large secure deployments. Prefixed ACLs enable bulk access to topics, consumer groups or transactional ids with a prefix to be granted using a single rule. See KIP-290 for details.
Access control for topic creation has been improved to enable access to be granted to create specific topics or topics with a prefix.
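The difference between a literal rule and a prefixed rule can be sketched in a few lines of Python. This is a simplified model for illustration, not Kafka's authorizer: the `AclRule` class and `is_allowed` function are hypothetical, and the sketch ignores deny rules, wildcards, and hosts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    principal: str
    operation: str
    resource_type: str
    pattern: str
    pattern_type: str  # "LITERAL" or "PREFIXED"


def is_allowed(rules, principal, operation, resource_type, name):
    """Return True if any allow rule matches the requested resource."""
    for rule in rules:
        if (rule.principal, rule.operation, rule.resource_type) != (
            principal,
            operation,
            resource_type,
        ):
            continue
        if rule.pattern_type == "LITERAL" and rule.pattern == name:
            return True
        if rule.pattern_type == "PREFIXED" and name.startswith(rule.pattern):
            return True
    return False


# A single prefixed rule covers every topic whose name starts with "payments."
rules = [AclRule("User:alice", "CREATE", "TOPIC", "payments.", "PREFIXED")]
can_create = is_allowed(rules, "User:alice", "CREATE", "TOPIC", "payments.refunds")
```

With only literal rules, each topic such as `payments.refunds` or `payments.invoices` would need its own entry.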
KIP-255 adds a framework for authenticating to Kafka brokers using OAuth2 access tokens. Authentication is performed using the SASL/OAUTHBEARER mechanism defined in RFC 7628, which enables authentication using OAuth2 bearer tokens in non-HTTP protocols. The implementation in Apache Kafka is customizable using callbacks for token retrieval and validation, providing the flexibility required for integration in different deployments.
You can now dynamically update SSL truststores without broker restart.
You can now configure security for listeners in ZooKeeper before starting brokers, including SSL keystore and truststore passwords, and JAAS configuration for SASL. With this new feature, you can store sensitive password configs in encrypted form in ZooKeeper rather than in cleartext in the broker properties file.
Host name verification is now enabled by default for SSL connections to ensure that the default SSL configuration is not susceptible to man-in-the-middle attacks. Set ssl.endpoint.identification.algorithm to an empty string to restore the previous behaviour.
Availability and Resilience
The replication protocol has been improved to avoid log divergence between leader and follower during fast leader failover.
To avoid OutOfMemory errors in brokers, the memory footprint of message down-conversions has been reduced. By using message chunking, both memory usage and memory reference time have been reduced to a minimum. Down-conversion is required to support consumers that only support an older message format.
Kafka clients are now notified of throttling before any throttling is applied when quotas are enabled. This enables clients to distinguish between network errors and large throttle times when quotas are exceeded. Quota implementations can also now be customized to work in different deployments.
KIP-266 removes indefinite blocking in the Java consumer, adding a new configuration option to control maximum wait time. Consumer group operations have also been added to the AdminClient. New metrics have also been added to monitor the lead between consumer offset and the start of the log to track slow consumers which may lose data when lead is close to zero.
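The "lead" described above is the distance between a consumer's position and the start of the log. As that distance approaches zero, retention may delete records before the consumer reads them. A minimal Python sketch of the calculation follows; the function names and the threshold are hypothetical and are not the names of the new metrics.

```python
def records_lead(consumer_position, log_start_offset):
    """Return how far the consumer is ahead of the start of the log."""
    return consumer_position - log_start_offset


def at_risk_of_data_loss(consumer_position, log_start_offset, threshold):
    """Flag a slow consumer whose lead has dropped to the threshold or below."""
    return records_lead(consumer_position, log_start_offset) <= threshold


# The log start has advanced to 9,950 while the consumer sits at 10,000.
lead = records_lead(10_000, 9_950)
risky = at_risk_of_data_loss(10_000, 9_950, threshold=100)
```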
Legacy Code Cleanup
Apache Kafka 2.0.0 drops support for Java 7 and removes the previously deprecated Scala producer and consumer.
- Improved REST API: We have made several improvements to the KSQL REST API, including returning proper HTTP response status codes on error conditions, a cleaner hierarchy of response objects, and more details about queries in the query descriptions returned by
- Aggregations on tables: KSQL now supports non-windowed usage of COUNT aggregation functions on tables. Previously you could only perform such aggregations on streams.
- Support for timestamps in more formats: Users can now specify a TIMESTAMP_FORMAT in addition to specifying the timestamp column when creating new streams and tables. This allows KSQL to use timestamp data in any format supported by the Java DateTimeFormatter. See the KSQL Syntax Reference for details.
- Added protection when dropping streams and tables: When you execute a DROP STREAM command, KSQL now checks whether there are any currently active queries using that stream or table. If there are, KSQL will prevent you from dropping the stream or table without first terminating the queries which depend on it. This prevents KSQL from getting into an inconsistent internal state and thus boosts stability.
- INSERT INTO: KSQL now supports INSERT INTO, a new type of statement that lets you write the output of a query into an existing stream. See the KSQL Syntax Reference for details.
- CP Docker Images for KSQL: Confluent Platform Docker images are now available on Docker Hub for the 5.0 preview version of both the KSQL server and KSQL CLI. You can use the confluentinc/cp-ksql-server image to deploy the server in interactive (default) or headless mode. You can use the confluentinc/cp-ksql-cli image to start a CLI session inside a Docker container.
- Topic/Schema Cleanup on DELETE: Users can now optionally delete the backing topic and its registered schema (if any) when dropping a stream or table. See the KSQL Syntax Reference for details.
- Support for STRUCT data type: Added support for nested data using the STRUCT data type. You can now declare streams and tables with columns using a STRUCT data type in your CREATE STREAM and CREATE TABLE statements, and then access the internal fields of these columns in your SELECT queries as you do in any other expression. See the KSQL Syntax Reference for details.
- UDF/UDAF: You can bring your own custom User Defined Functions (UDFs) and User Defined Aggregate Functions (UDAFs) to KSQL and use them in your queries. UDFs and UDAFs open the door for many novel use cases where you need to perform custom computations over your data.
- Stream-Stream and Table-Table Joins: Added Stream-Stream and Table-Table joins in CP 5.0. For each of these joins, KSQL supports inner, full outer, and left join types. This means KSQL now covers all of the available join operations in Kafka Streams. See the KSQL Syntax Reference for details.
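The three join types named in the last item differ only in which keys appear in the result. The Python sketch below illustrates this for a table-table join over a snapshot of two keyed tables. It is a conceptual model with hypothetical names, not KSQL syntax, and it ignores the continuous updates and windowing that apply to real streaming joins.

```python
def join_tables(left, right, how="inner"):
    """Join two tables (dicts keyed by primary key) on their keys.

    how is "inner", "left", or "outer"; a missing side is None.
    """
    if how == "inner":
        keys = left.keys() & right.keys()
    elif how == "left":
        keys = set(left.keys())
    elif how == "outer":
        keys = left.keys() | right.keys()
    else:
        raise ValueError(f"unsupported join type: {how}")
    return {key: (left.get(key), right.get(key)) for key in sorted(keys)}


users = {"u1": "Alice", "u2": "Bob"}
regions = {"u2": "EU", "u3": "US"}

inner = join_tables(users, regions, "inner")
left_join = join_tables(users, regions, "left")
full_outer = join_tables(users, regions, "outer")
```

The inner join keeps only `u2`, the left join keeps `u1` and `u2`, and the full outer join keeps all three keys, with `None` standing in for the missing side.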
Kafka Streams now supports message headers and dynamic routing. See the Kafka Streams Upgrade Guide for details.
Kafka Connect includes a number of improvements and features. You can now control how errors in connectors, transformations, and converters are handled by enabling automatic retries and controlling the number of errors that are tolerated before an exception is thrown and the connector is stopped. More contextual information can be included in the logs to help diagnose problems, and problematic messages consumed by sink connectors can be sent to a dead letter queue rather than forcing the connector to stop.
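The error-handling behavior described above (retry, then either tolerate the failure and route the record to a dead letter queue, or stop) can be modeled with a short Python sketch. This is a toy illustration rather than Kafka Connect code: `run_sink`, `max_retries`, and `tolerate_errors` are hypothetical names, and the dead letter queue is an in-memory list instead of a Kafka topic.

```python
def run_sink(records, process, max_retries=2, tolerate_errors=True):
    """Process records, retrying failures and diverting bad records.

    Returns (delivered, dead_letters). With tolerate_errors=False the
    first record that still fails after its retries stops the run.
    """
    delivered, dead_letters = [], []
    for record in records:
        attempts = 0
        while True:
            try:
                delivered.append(process(record))
                break
            except ValueError as error:
                attempts += 1
                if attempts <= max_retries:
                    continue  # retry the same record
                if not tolerate_errors:
                    raise  # stop, as a connector without tolerance would
                dead_letters.append({"record": record, "error": str(error)})
                break
    return delivered, dead_letters


def parse_int(record):
    """Stand-in for a converter or transformation that can fail."""
    return int(record)


delivered, dead_letters = run_sink(["1", "x", "3"], parse_int)
```

The malformed record `"x"` ends up in `dead_letters` together with the error text, while the records on either side of it are still delivered.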
There’s also a new extension point for moving secrets outside of connector configurations. You can integrate any external key management system, which allows you to store variables or placeholders in the connector configurations and resolve them in the external key management system. The variables are only resolved before sending the configuration to the connector, making it possible to ensure that secrets are stored and managed securely in your preferred key management system and not exposed over the REST APIs or in log files.
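The placeholder mechanism can be pictured as a late substitution step: the stored configuration holds only references, and the real values are looked up just before the configuration is handed to the connector. The Python sketch below assumes a `${provider:path:key}` placeholder shape and uses an in-memory dict as a stand-in for a key management system. The `resolve` function, the `vault` provider name, and the sample keys are hypothetical.

```python
import re

# Assumed placeholder shape: ${provider:path:key}
PLACEHOLDER = re.compile(r"\$\{(\w+):([^:}]*):([^}]+)\}")


def resolve(config, providers):
    """Return a copy of config with every placeholder replaced.

    providers maps a provider name to a callable (path, key) -> secret.
    """
    def lookup(match):
        provider, path, key = match.groups()
        return providers[provider](path, key)

    return {name: PLACEHOLDER.sub(lookup, value) for name, value in config.items()}


# Hypothetical secret store standing in for an external system.
secret_store = {("secret/db", "password"): "s3cr3t"}
providers = {"vault": lambda path, key: secret_store[(path, key)]}

stored_config = {
    "connection.user": "app",
    "connection.password": "${vault:secret/db:password}",
}
resolved_config = resolve(stored_config, providers)
```

Only `resolved_config` contains the secret. `stored_config`, which is what would be persisted or returned over a REST API in this model, still holds the placeholder.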
Confluent provides high-quality clients for the most popular non-Java languages. These clients are all based on the underlying librdkafka C/C++ implementation, which provides high performance and stability. The non-Java clients follow their own versioning scheme, slightly trailing the Apache Kafka releases. The current clients release is v0.11.5.
Highlights from v0.11.5:
- Admin API support.
- Java compatible murmur2 partitioner.
- Schema Registry HTTPS support for the Python client.
- Many enhancements and bug fixes.
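A Java-compatible murmur2 partitioner matters because a non-Java producer and a Java producer then send records with the same key to the same partition. The sketch below is a Python port of the 32-bit murmur2 variant as I understand the Java client to use it, followed by the usual "positive hash modulo partition count" step. Treat the constants and the port as assumptions to verify against the Java client before relying on them; the function names are illustrative.

```python
def murmur2(data: bytes) -> int:
    """32-bit murmur2 hash, returned as an unsigned integer."""
    mask = 0xFFFFFFFF
    seed = 0x9747B28C  # assumed seed, to be checked against the Java client
    m = 0x5BD1E995
    r = 24
    length = len(data)
    h = (seed ^ length) & mask

    # Mix the input four bytes at a time, little-endian.
    for i in range(0, length - length % 4, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & mask
        k ^= k >> r
        k = (k * m) & mask
        h = (h * m) & mask
        h ^= k

    # Mix in the remaining one to three bytes, if any.
    tail = length & ~3
    remainder = length % 4
    if remainder == 3:
        h ^= data[tail + 2] << 16
    if remainder >= 2:
        h ^= data[tail + 1] << 8
    if remainder >= 1:
        h ^= data[tail]
        h = (h * m) & mask

    h ^= h >> 13
    h = (h * m) & mask
    h ^= h >> 15
    return h


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition using the non-negative 31-bit hash."""
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions


partition = partition_for(b"customer-42", 6)
```

Whatever value `partition` takes, it is stable for a given key and partition count, which is the property that keeps records with the same key together.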
See the client release notes for more information.
The JDBC source and sink connectors were modified to move DBMS-specific logic into dialects, making it easier to add and support various flavors of JDBC-compliant databases. This should be transparent to users as the connector will automatically select and use the most appropriate dialect for the connection.
The Elasticsearch connector is now able to use basic authentication when connecting to Elasticsearch. It also better supports keyword fields for text types.
The S3 connector extends the existing SSE support to enable customer-provided server-side encryption (SSE-C) keys.
Camus was deprecated in CP 3.0.0 and has been removed in CP 5.0.0.
To upgrade Confluent Platform to a newer version, check the Upgrade documentation.
For the supported versions and interoperability of Confluent Platform and its components, see Supported Versions and Interoperability.