Confluent Platform 3.2.0 Release Notes

This is a minor release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.2.0, the latest stable version of Kafka. In addition, this release includes Confluent’s S3 Connector, REST Proxy security, a JMS client and a .NET client. This release also includes major feature upgrades to Confluent Control Center and critical bug fixes to Auto Data Balancing and Replicator.

Confluent Platform users are encouraged to upgrade to CP 3.2.0. The technical details of this release are summarized below.

Highlights

Enterprise Features

JMS Client

Confluent Enterprise now includes a JMS-compatible client for Apache Kafka. This Kafka client implements the JMS 1.1 standard API, using Apache Kafka brokers as the backend. This is useful if you have legacy applications using JMS, and you would like to replace the existing JMS message broker with Apache Kafka. By replacing the legacy JMS message broker with Apache Kafka, existing applications can integrate with your modern streaming platform without a major rewrite of the application.

Confluent Control Center

This release adds significant new monitoring and alerting features for clusters, brokers and topics to Confluent Control Center. Leveraging the experience Confluent has with dozens of the world’s largest Kafka installations, Control Center encodes operational best practices by distilling over 150 different available metrics from a running cluster into a manageable set of KPIs and supporting details to provide both broker-centric and topic-centric insights into platform health.

Automatic Data Balancing

The Automatic Data Balancing capability was added to Confluent Enterprise in release 3.1.0. Release 3.2.0 improves the feature by making it safer to run unsupervised: a rebalance will no longer cause a broker to run out of disk space if the broker has a single log directory (this safeguard will be expanded to support multiple log directories in a future release). In addition, brokers with auto-generated broker ids are now supported.

Open Source Features

Kafka Streams

Backwards Compatible to Older Kafka Clusters

This version (and future versions) of Kafka’s Streams API is now backwards-compatible with older Kafka clusters. This means you can upgrade your applications to use the latest version of the Streams API without having to first upgrade your Kafka clusters. Also, as has been the case before, the Streams API is still forwards-compatible, i.e. older versions of the Streams API can talk to newer Kafka clusters, too. This gives you full flexibility when planning and performing upgrades of applications and clusters across teams in your organization.

Compatibility Matrix:

                     Kafka Broker (columns)
Streams API (rows)   3.0.x / 0.10.0.x   3.1.x / 0.10.1.x   3.2.0 / 0.10.2.0
3.0.x / 0.10.0.x     compatible         compatible         compatible
3.1.x / 0.10.1.x                        compatible         compatible
3.2.0 / 0.10.2.0                        compatible         compatible

See the Upgrade Guide of Kafka Streams for more information.
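
The compatibility matrix can also be encoded as a small lookup, for example to gate an upgrade script. The following is only an illustrative sketch: `COMPATIBLE_BROKERS` and `is_compatible` are hypothetical names, and the data is transcribed from the matrix, keyed by the Apache Kafka version line.

```python
# Compatibility data transcribed from the matrix in these release notes.
# Keys: Streams API version line; values: broker version lines it can talk to.
COMPATIBLE_BROKERS = {
    "0.10.0": {"0.10.0", "0.10.1", "0.10.2"},
    "0.10.1": {"0.10.1", "0.10.2"},
    "0.10.2": {"0.10.1", "0.10.2"},
}


def is_compatible(streams_version, broker_version):
    """Return True if the matrix lists the (Streams API, broker) pair as compatible."""
    return broker_version in COMPATIBLE_BROKERS.get(streams_version, set())


# A 0.10.2 Streams application against a 0.10.1 broker and a 0.10.0 broker.
print(is_compatible("0.10.2", "0.10.1"))
print(is_compatible("0.10.2", "0.10.0"))
```

Unknown version lines are treated as incompatible here, which is a conservative choice for a pre-upgrade check.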

Session Windows

You can now group data records into sessions, which is an essential feature for user behavior analysis in particular. Sessions represent a period of activity separated by a defined gap of inactivity. Session-based analyses can range from simple metrics (e.g. count of user visits to a website) to more complex metrics (e.g. customer conversion funnel and event flows). See Windowing in the Developer Guide of Kafka Streams for details.
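
To make the notion of an inactivity gap concrete, here is a minimal batch sketch in plain Python. It is not the Kafka Streams API: the `sessionize` function and the sample data are hypothetical, and it assumes all events are available up front, whereas Kafka Streams merges sessions incrementally as records arrive.

```python
from collections import defaultdict


def sessionize(events, gap_ms):
    """Group (key, timestamp_ms) events into per-key sessions.

    A new session starts whenever the time since the key's previous event
    exceeds gap_ms. Returns {key: [(start_ms, end_ms, count), ...]}.
    """
    by_key = defaultdict(list)
    for key, ts in events:
        by_key[key].append(ts)

    sessions = {}
    for key, stamps in by_key.items():
        stamps.sort()  # tolerate out-of-order input in this batch sketch
        result = []
        start = end = stamps[0]
        count = 1
        for ts in stamps[1:]:
            if ts - end > gap_ms:
                # Inactivity gap exceeded: close the current session.
                result.append((start, end, count))
                start, count = ts, 0
            end = ts
            count += 1
        result.append((start, end, count))
        sessions[key] = result
    return sessions


# Hypothetical page-view events: (user, timestamp in milliseconds).
views = [("alice", 0), ("alice", 3000), ("bob", 1000), ("alice", 20000)]
print(sessionize(views, gap_ms=5000))
```

With a 5-second gap, the first two events for `alice` fall into one session and the third starts a new one; counting sessions per key then gives a simple "visits per user" metric.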

Global KTables

In addition to the existing KTable interface, which represents partitioned changelog streams, you can now leverage the GlobalKTable interface, which represents fully replicated changelog streams. Here, when a Kafka topic is read into a global table, each KafkaStreams instance (with the same application.id) will get a full copy of the data, rather than just a subset of partitions. Global tables allow you to perform “star joins”, to join against foreign keys, and they make many use cases feasible where you must work on heavily skewed data and thus suffer from hot partitions. Also, global tables are often more efficient than their KTable counterpart when you need to perform multiple joins in succession. See Creating source streams from Kafka in the Developer Guide of Kafka Streams for details.
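
The difference between a partitioned table and a fully replicated one can be sketched in plain Python. This is a conceptual model only, not the Streams API: the partitioner, the three-instance layout and all names (`build_partitioned`, `build_global`, `join_on_foreign_key`) are assumptions made for illustration.

```python
import zlib

NUM_INSTANCES = 3


def partition_for(key):
    """Deterministic stand-in for a partitioner (not Kafka's actual hash)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_INSTANCES


def build_partitioned(table):
    """Each instance holds only the rows whose key maps to its partition."""
    shards = [dict() for _ in range(NUM_INSTANCES)]
    for key, value in table.items():
        shards[partition_for(key)][key] = value
    return shards


def build_global(table):
    """Each instance holds a full copy of the table."""
    return [dict(table) for _ in range(NUM_INSTANCES)]


def join_on_foreign_key(orders, tables):
    """Join each (order_id, customer_id) pair to a customer name, using only
    the table copy local to the instance that processes the order."""
    joined = []
    for order_id, customer_id in orders:
        local = tables[partition_for(order_id)]
        joined.append((order_id, local.get(customer_id)))
    return joined


customers = {"c1": "Alice", "c2": "Bob", "c3": "Carol"}
orders = [("o1", "c1"), ("o2", "c3"), ("o3", "c2"), ("o4", "c1")]

# With a full copy on every instance, every foreign-key lookup succeeds.
print(join_on_foreign_key(orders, build_global(customers)))
# With a partitioned table, a lookup succeeds only if the customer row
# happens to live on the instance that processes the order.
print(join_on_foreign_key(orders, build_partitioned(customers)))
```

The trade-off is that every instance stores the whole table, so this model suits lookup data that fits comfortably on each instance.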

ZooKeeper Dependency Removed

Applications that use the Streams API no longer need to talk to ZooKeeper, which they previously did for Kafka topic management. This results in a simpler as well as a more secure Kafka infrastructure because you can now lock down network access to your ZooKeeper ensembles. The StreamsConfig.ZOOKEEPER_CONNECT_CONFIG aka zookeeper.connect application setting is now deprecated and ignored in 0.10.2.

Kafka Connect

Single Message Transform

Single Message Transforms are a new feature of Kafka’s Connect API. They allow you to modify events before they make it into Kafka, or on the way out. You can now mask personally identifiable information, perform data-based event routing, and normalize data formats. You can modify events without any coding by adding our built-in transformations to the connector configuration. A flexible API lets you implement your own transformations where needed.
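
As a rough illustration of configuring built-in transformations, the sketch below assembles a connector definition as JSON. The connector name, JDBC URL, column names and topic names are hypothetical; the two transformation classes (`MaskField` and `RegexRouter`) are built-in transformations shipped with Apache Kafka, but check the Connect documentation for the exact options your version supports.

```python
import json

# Hypothetical connector definition. Only the "transforms*" keys relate to
# Single Message Transforms; the rest is ordinary source connector config.
connector = {
    "name": "customers-source",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:postgresql://db.example.com/shop",
        "mode": "incrementing",
        "incrementing.column.name": "id",
        "topic.prefix": "shop-",
        # Aliases of the transformations, applied in the order listed.
        "transforms": "maskPii,route",
        # Replace the listed value fields with a type-appropriate empty value.
        "transforms.maskPii.type": "org.apache.kafka.connect.transforms.MaskField$Value",
        "transforms.maskPii.fields": "ssn,credit_card",
        # Rewrite the destination topic name with a regular expression.
        "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
        "transforms.route.regex": "shop-(.*)",
        "transforms.route.replacement": "masked-$1",
    },
}

payload = json.dumps(connector, indent=2)
print(payload)
```

A payload of this shape would typically be submitted to the Connect REST API when creating a connector; nothing is sent in this sketch.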

S3 Connector

The S3 connector streams events from Kafka to S3 buckets with exactly-once delivery. This connector supports Avro and JSON data formats and multiple partitioners, so your data will be available in S3 and easy to integrate with EMR, Redshift and other AWS services.

.NET Client

We’re introducing a fully supported, up-to-date client for .NET languages, including C#. The client is reliable, high-performance, secure, compatible with all versions of Kafka and supports the full Kafka protocol.

REST Proxy

This release introduces a new version of the REST APIs (v2). The v2 APIs let you use the features of the new Apache Kafka consumer protocol (introduced in 0.9.0.0). This includes:

  • Option to configure a secure connection (SSL, SASL_SSL or SASL_PLAINTEXT) between REST Proxy and Apache Kafka
  • No need to choose between high level consumer and simple consumer - simply create a consumer and subscribe to topics as part of a group or assign specific partitions to the consumer directly.
  • Get the list of topics and partitions a consumer is subscribed to
  • Choose between automatic or manual commit of offsets
  • Seek to specific offsets in partitions
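
The consumer operations listed above map onto a handful of HTTP requests. The sketch below only builds request descriptions and sends nothing; the paths and content types reflect the v2 API as commonly documented, and the group, instance and topic names are hypothetical. Treat it as an approximation and consult the REST Proxy API reference for the authoritative request formats.

```python
import json

V2 = "application/vnd.kafka.v2+json"                    # control requests
V2_JSON_RECORDS = "application/vnd.kafka.json.v2+json"  # JSON-embedded records


def build_request(method, path, body=None, accept=V2):
    """Describe one HTTP request as a plain dict (nothing is sent)."""
    headers = {"Accept": accept}
    if body is not None:
        headers["Content-Type"] = V2
    return {
        "method": method,
        "path": path,
        "headers": headers,
        "body": None if body is None else json.dumps(body),
    }


def consumer_requests(group, instance, topic, partition=0, offset=0):
    """Build the requests for a create/subscribe/fetch/commit/seek cycle."""
    base = "/consumers/{}/instances/{}".format(group, instance)
    position = {"topic": topic, "partition": partition, "offset": offset}
    return [
        # Create a consumer in a group, with manual offset commits.
        build_request("POST", "/consumers/" + group,
                      {"name": instance, "format": "json",
                       "auto.offset.reset": "earliest",
                       "auto.commit.enable": "false"}),
        # Subscribe to topics as part of the group.
        build_request("POST", base + "/subscription", {"topics": [topic]}),
        # List the topics the consumer is subscribed to.
        build_request("GET", base + "/subscription"),
        # Fetch records.
        build_request("GET", base + "/records", accept=V2_JSON_RECORDS),
        # Commit offsets manually.
        build_request("POST", base + "/offsets", {"offsets": [position]}),
        # Seek to a specific offset in a partition.
        build_request("POST", base + "/positions", {"offsets": [position]}),
    ]


for req in consumer_requests("my-group", "consumer-1", "orders", offset=42):
    print(req["method"], req["path"])
```

To assign specific partitions directly instead of subscribing, the subscription request would be replaced by one against the consumer's `/assignments` resource.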

Apache Kafka 0.10.2

For a full list of changes in this release of Apache Kafka, see the 0.10.2 Release Notes.

Previous releases

Confluent Platform 3.1.2 Release Notes

This is a minor release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.1.1, the latest stable version of Kafka.

Confluent Platform users are encouraged to upgrade to CP 3.1.2 as it includes important bug fixes. The technical details of this release are summarized below.

Highlights

Apache Kafka 0.10.1.1

For a full list of changes in this release of Apache Kafka, see the 0.10.1.1 Release Notes.

Fixes
  • KAFKA-1464: Add a throttling option to the Kafka replication tool
  • KAFKA-4313: ISRs may thrash when replication quota is enabled
  • KAFKA-3994: Deadlock between consumer heartbeat expiration and offset commit.
  • KAFKA-4497: log cleaner breaks on timeindex
  • KAFKA-4529: tombstone may be removed earlier than it should
  • KAFKA-4469: Consumer throughput regression caused by inefficient list removal and copy
  • KAFKA-4472: offsetRetentionMs miscalculated in GroupCoordinator
  • KAFKA-4362: Consumer can fail after reassignment of the offsets topic partition
  • KAFKA-4384: ReplicaFetcherThread stopped after ReplicaFetcherThread received a corrupted message
  • KAFKA-4205: NullPointerException in fetchOffsetsBefore

Confluent Platform 3.1.1 Release Notes

This is a minor release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.1.0, the latest stable version of Kafka. In addition, this release includes Confluent’s multi-datacenter replication and automatic data balancing tool.

Confluent Platform users are encouraged to upgrade to CP 3.1.1 as it includes both major new functionality and important bug fixes. The technical details of this release are summarized below.

Highlights

Enterprise Features
Automatic Data Balancing

Confluent Enterprise now includes Auto Data Balancing. As clusters grow, topics and partitions grow at different rates and brokers are added and removed; over time this leads to an unbalanced workload across datacenter resources. Some brokers sit nearly idle, while others are heavily taxed with large or numerous partitions, slowing down message delivery. When executed, this feature examines the number of brokers, the size of partitions, the number of partitions and the number of leaders within the cluster. It allows you to shift data to create an even workload across your cluster, while throttling rebalance traffic to minimize the impact on production workloads.

Multi-Datacenter Replication

Confluent Enterprise now makes it easier than ever to maintain multiple Kafka clusters in multiple data centers. Managing replication of data and topic configuration between data centers enables use-cases such as:

  • Active-active geo-localized deployments: Allow users to access a nearby data center to optimize their architecture for low latency and high performance
  • Centralized analytics: Aggregate data from multiple Kafka clusters into one location for organization-wide analytics
  • Cloud migration: Use Kafka to synchronize data between on-prem applications and cloud deployments

The new Multi-Datacenter Replication feature allows configuring and managing replication for all these scenarios from either Confluent Control Center or command-line tools.

Confluent Control Center

Control Center can now define alerts on the latency and completeness statistics of data streams; alerts can be delivered by email or queried from a centralized alerting system. In this release, the stream monitoring features have also been extended to monitor topics across multiple Kafka clusters, and access to Control Center can be protected via integration with enterprise authentication systems.

Open Source Features
Kafka Streams: Interactive Queries

Interactive queries let you get more from streaming than just the processing of data. This feature allows you to treat the stream processing layer as a lightweight embedded database and, more concretely, to directly query the latest state of your stream processing application, without needing to materialize that state to external databases or external storage first.

As a result, interactive queries simplify the architecture of many use cases and lead to more application-centric architectures. For example, you often no longer need to operate and interface with a separate database cluster – or a separate infrastructure team in your company that runs that cluster – to share data between a Kafka Streams application (say, an event-driven microservice) and downstream applications, regardless of whether these applications use Kafka Streams or not; they may even be applications that do not run on the JVM, e.g. implemented in Python, C/C++, or JavaScript.

Kafka Streams: Application Reset Tool

The Application Reset Tool allows you to quickly reset a Kafka Streams application in order to reprocess its input data from scratch – think: an application “reset” button.

Kafka Streams: Improved memory management

Kafka Streams applications now benefit from record caches. Notably, these caches are used to compact output records (similar to Kafka’s log compaction) so that fewer updates for the same record key are being sent downstream. These new caches are enabled by default and typically result in reduced load on your Kafka Streams application, your Kafka cluster, and/or downstream applications and systems such as external databases. However, these caches can also be disabled, if needed, to restore the CP 3.0.x behavior of your applications.
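
The compaction effect of a record cache can be modeled in a few lines of plain Python. This is a conceptual sketch, not the Kafka Streams implementation: the `RecordCache` class is hypothetical and flushes only on demand, whereas the real caches are bounded in size (see the `cache.max.bytes.buffering` setting) and are also flushed when the application commits.

```python
class RecordCache:
    """Buffers updates per key and forwards only the latest value on flush."""

    def __init__(self, forward):
        self._forward = forward  # callback invoked as forward(key, value)
        self._dirty = {}         # key -> latest value not yet forwarded

    def put(self, key, value):
        # A newer update for the same key overwrites the buffered one.
        self._dirty[key] = value

    def flush(self):
        # Forward one record per key, then start with an empty buffer.
        for key, value in self._dirty.items():
            self._forward(key, value)
        self._dirty.clear()


downstream = []
cache = RecordCache(lambda key, value: downstream.append((key, value)))

# Four updates arrive, three of them for the same key.
for key, count in [("a", 1), ("a", 2), ("b", 1), ("a", 3)]:
    cache.put(key, count)
cache.flush()

print(downstream)  # two records instead of four
```

Without the cache, all four updates would be sent downstream; with it, only the latest value per key is forwarded at flush time.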

Kafka Connect

In this release, we’ve added two new connectors to Confluent Open Source:

  • Elasticsearch Sink - High-performance connector to stream data from Kafka to Elasticsearch. This connector supports automatic generation of Elasticsearch mappings, all Elasticsearch datatypes and exactly-once delivery.
  • JDBC Sink - Stream data from Kafka to any relational database. This connector supports inserts and upserts, schema evolution and exactly-once delivery.
Go Client

We’re introducing a fully supported, up-to-date client for Go. The client is reliable, high-performance, secure, compatible with all versions of Kafka and supports the full Kafka protocol.

Apache Kafka 0.10.1

Here is a quick overview of the notable Kafka-related changes in the release:

New Feature
  • KAFKA-1464: Add a throttling option to the Kafka replication tool
  • KAFKA-3176: Allow console consumer to consume from particular partitions when new consumer is used.
  • KAFKA-3492: support quota based on authenticated user name
  • KAFKA-3776: Unify store and downstream caching in streams
  • KAFKA-3858: Add functions to print stream topologies
  • KAFKA-3909: Queryable state for Kafka Streams
  • KAFKA-4015: Change cleanup.policy config to accept a list of valid policies
  • KAFKA-4093: Cluster id
Improvement
  • KAFKA-724: Allow automatic socket.send.buffer from operating system
  • KAFKA-1981: Make log compaction point configurable
  • KAFKA-2063: Bound fetch response size (KIP-74)
  • KAFKA-2547: Make DynamicConfigManager to use the ZkNodeChangeNotificationListener introduced as part of KAFKA-2211
  • KAFKA-2800: Update outdated dependencies
  • KAFKA-3158: ConsumerGroupCommand should tell whether group is actually dead
  • KAFKA-3281: Improve message of stop scripts when no processes are running
  • KAFKA-3282: Change tools to use new consumer if zookeeper is not specified
  • KAFKA-3283: Remove beta from new consumer documentation
  • KAFKA-3479: Add new consumer metrics documentation
  • KAFKA-3595: Add capability to specify replication compact option for stream store
  • KAFKA-3644: Use Boolean protocol type for StopReplicaRequest delete_partitions
  • KAFKA-3680: Make Java client classloading more flexible
  • KAFKA-3683: Add file descriptor recommendation to ops guide
  • KAFKA-3697: Clean-up website documentation when it comes to clients
  • KAFKA-3699: Update protocol page on website to explain how KIP-35 should be used
  • KAFKA-3711: Allow configuration of MetricsReporter subclasses
  • KAFKA-3732: Add an auto accept option to kafka-acls.sh
  • KAFKA-3748: Add consumer-property to console tools consumer (similar to --producer-property)
  • KAFKA-3753: Add approximateNumEntries() to the StateStore interface for metrics reporting
  • KAFKA-3760: Set broker state as running after publishing to ZooKeeper
  • KAFKA-3762: Log.loadSegments() should log the message in exception
  • KAFKA-3765: Code style issues in Kafka
  • KAFKA-3768: Replace all pattern match on boolean value by if/else block.
  • KAFKA-3771: Improving Kafka code
  • KAFKA-3775: Throttle maximum number of tasks assigned to a single KafkaStreams
  • KAFKA-3824: Docs indicate auto.commit breaks at least once delivery but that is incorrect
  • KAFKA-3842: Add Helper Functions Into TestUtils
  • KAFKA-3844: Sort configuration items in log
  • KAFKA-3845: Support per-connector converters
  • KAFKA-3846: Connect record types should include timestamps
  • KAFKA-3847: Connect tasks should not share a producer
  • KAFKA-3849: Add explanation on why polling every second in MirrorMaker is required
  • KAFKA-3888: Allow consumer to send heartbeats in background thread (KIP-62)
  • KAFKA-3920: Add Schema source connector to Kafka Connect
  • KAFKA-3922: Add a copy-constructor to AbstractStream
  • KAFKA-3936: Validate user parameters as early as possible
  • KAFKA-3942: Change IntegrationTestUtils.purgeLocalStreamsState to use java.io.tmpdir
  • KAFKA-3954: Consumer should use internal topics information returned by the broker
  • KAFKA-3997: Halting because log truncation is not allowed and suspicious logging
  • KAFKA-4012: KerberosShortNamer should implement toString()
  • KAFKA-4013: SaslServerCallbackHandler should include cause for exception
  • KAFKA-4016: Kafka Streams join benchmark
  • KAFKA-4044: log actual socket send/receive buffer size after connecting in Selector
  • KAFKA-4050: Allow configuration of the PRNG used for SSL
  • KAFKA-4052: Allow passing properties file to ProducerPerformance
  • KAFKA-4053: Refactor TopicCommand to remove redundant if/else statements
  • KAFKA-4062: Require --print-data-log if --offsets-decoder is enabled for DumpLogOffsets
  • KAFKA-4063: Add support for infinite endpoints for range queries in Kafka Streams KV stores
  • KAFKA-4070: Implement a useful Struct.toString()
  • KAFKA-4112: Remove alpha quality label from Kafka Streams in docs
  • KAFKA-4151: Update public docs for KIP-78
  • KAFKA-4165: Add 0.10.0.1 as a source for compatibility tests where relevant
  • KAFKA-4177: Replication Throttling: Remove ThrottledReplicationRateLimit from Server Config
  • KAFKA-4244: Update our website look & feel
Bug
  • KAFKA-1196: java.lang.IllegalArgumentException Buffer.limit on FetchResponse.scala + 33
  • KAFKA-2311: Consumer’s ensureNotClosed method not thread safe
  • KAFKA-2684: Add force option to TopicCommand & ConfigCommand to suppress console prompts
  • KAFKA-2894: WorkerSinkTask doesn’t handle rewinding offsets on rebalance
  • KAFKA-2932: Adjust importance level of Kafka Connect configs
  • KAFKA-2935: Remove vestigial CLUSTER_CONFIG in WorkerConfig
  • KAFKA-2941: Docs for key/value converter in Kafka connect are unclear
  • KAFKA-2948: Kafka producer does not cope well with topic deletions
  • KAFKA-2971: KAFKA - Not obeying log4j settings, DailyRollingFileAppender not rolling files
  • KAFKA-3054: Connect Herder fail forever if sent a wrong connector config or task config
  • KAFKA-3111: java.lang.ArithmeticException: / by zero in ConsumerPerformance
  • KAFKA-3218: Kafka-0.9.0.0 does not work as OSGi module
  • KAFKA-3396: Unauthorized topics are returned to the user
  • KAFKA-3400: Topic stop working / can’t describe topic
  • KAFKA-3500: KafkaOffsetBackingStore set method needs to handle null
  • KAFKA-3501: Console consumer process hangs on SIGINT
  • KAFKA-3525: max.reserved.broker.id off-by-one error
  • KAFKA-3561: Auto create through topic for KStream aggregation and join
  • KAFKA-3562: Null Pointer Exception Found when delete topic and Using New Producer
  • KAFKA-3645: NPE in ConsumerGroupCommand and ConsumerOffsetChecker when running in a secure env
  • KAFKA-3650: AWS test script fails to install vagrant
  • KAFKA-3682: ArrayIndexOutOfBoundsException thrown by SkimpyOffsetMap.get() when full
  • KAFKA-3691: Confusing logging during metadata update timeout
  • KAFKA-3710: MemoryOffsetBackingStore creates a non-daemon thread that prevents clean shutdown
  • KAFKA-3716: Check against negative timestamps
  • KAFKA-3719: Pattern regex org.apache.kafka.common.utils.Utils.HOST_PORT_PATTERN is too narrow
  • KAFKA-3723: Cannot change size of schema cache for JSON converter
  • KAFKA-3735: RocksDB objects needs to be disposed after usage
  • KAFKA-3740: Enable configuration of RocksDBStores
  • KAFKA-3742: Can’t run connect-distributed.sh with -daemon flag
  • KAFKA-3761: Controller has RunningAsBroker instead of RunningAsController state
  • KAFKA-3767: Failed Kafka Connect’s unit test with Unknown license.
  • KAFKA-3769: KStream job spending 60% of time writing metrics
  • KAFKA-3781: Errors.exceptionName() can throw NPE
  • KAFKA-3782: Transient failure with kafkatest.tests.connect.connect_distributed_test.ConnectDistributedTest.test_bounce.clean=True
  • KAFKA-3786: Avoid unused property from parent configs causing WARN entries
  • KAFKA-3794: Add Stream / Table prefix in print functions
  • KAFKA-3807: OffsetValidationTest - transient failure on test_broker_rolling_bounce
  • KAFKA-3809: Auto-generate documentation for topic-level configuration
  • KAFKA-3810: replication of internal topics should not be limited by replica.fetch.max.bytes
  • KAFKA-3812: State store locking is incorrect
  • KAFKA-3830: getTGT() debug logging exposes confidential information
  • KAFKA-3840: OS auto tuning for socket buffer size in clients not allowed through configuration
  • KAFKA-3850: WorkerSinkTask should retry commits if woken up during rebalance or shutdown
  • KAFKA-3852: Clarify how to handle message format upgrade without killing performance
  • KAFKA-3854: Subsequent regex subscription calls fail
  • KAFKA-3864: Kafka Connect Struct.get returning defaultValue from Struct not the field schema
  • KAFKA-3894: LogCleaner thread crashes if not even one segment can fit in the offset map
  • KAFKA-3896: Unstable test KStreamRepartitionJoinTest.shouldCorrectlyRepartitionOnJoinOperations
  • KAFKA-3915: LogCleaner IO buffers do not account for potential size difference due to message format change
  • KAFKA-3916: Connection from controller to broker disconnects
  • KAFKA-3929: Add prefix for underlying clients configs in StreamConfig
  • KAFKA-3930: IPv6 address can’t used as ObjectName
  • KAFKA-3934: Start scripts enable GC by default with no way to disable
  • KAFKA-3937: Kafka Clients Leak Native Memory For Longer Than Needed With Compressed Messages
  • KAFKA-3938: Fix consumer session timeout issue in Kafka Streams
  • KAFKA-3945: kafka-console-producer.sh does not accept request-required-acks=all
  • KAFKA-3946: Protocol guide should say that Produce request acks can only be 0, 1, or -1
  • KAFKA-3949: Consumer topic subscription change may be ignored if a rebalance is in progress
  • KAFKA-3952: VerifyConsumerRebalance cannot succeed when checking partition owner
  • KAFKA-3963: Missing messages from the controller to brokers
  • KAFKA-3965: Mirror maker sync send fails will lose messages
  • KAFKA-4002: task.open() should be invoked in case that 0 partitions is assigned to task.
  • KAFKA-4019: LogCleaner should grow read/write buffer to max message size for the topic
  • KAFKA-4023: Add thread id as prefix in Kafka Streams thread logging
  • KAFKA-4031: Check DirectBuffer’s cleaner to be not null before using
  • KAFKA-4032: Uncaught exceptions when autocreating topics
  • KAFKA-4033: KIP-70: Revise Partition Assignment Semantics on New Consumer’s Subscription Change
  • KAFKA-4034: Consumer need not lookup coordinator when using manual assignment
  • KAFKA-4035: AclCommand should allow Describe operation on groups
  • KAFKA-4037: Transient failure in ConnectRestApiTest
  • KAFKA-4042: DistributedHerder thread can die because of connector & task lifecycle exceptions
  • KAFKA-4051: Strange behavior during rebalance when turning the OS clock back
  • KAFKA-4056: Kafka logs values of sensitive configs like passwords
  • KAFKA-4066: NullPointerException in Kafka consumer due to unsafe access to findCoordinatorFuture
  • KAFKA-4073: MirrorMaker should handle mirroring messages w/o timestamp better
  • KAFKA-4077: Backdate validity of certificates in system tests to cope with clock skew
  • KAFKA-4082: Support Gradle 3.0
  • KAFKA-4098: NetworkClient should not intercept all metadata requests on disconnect
  • KAFKA-4099: Change the time based log rolling to only based on the message timestamp.
  • KAFKA-4100: Connect Struct schemas built using SchemaBuilder with no fields cause NPE in Struct constructor
  • KAFKA-4103: DumpLogSegments cannot print data from offsets topic
  • KAFKA-4104: Queryable state metadata is sometimes invalid
  • KAFKA-4105: Queryable state tests for concurrency and rebalancing
  • KAFKA-4118: StreamsSmokeTest.test_streams started failing since 18 August build
  • KAFKA-4123: Queryable State returning null for key before all stores in instance have been initialized
  • KAFKA-4129: Processor throw exception when getting channel remote address after closing the channel
  • KAFKA-4130:
  • KAFKA-4131: Multiple Regex KStream-Consumers cause Null pointer exception in addRawRecords in RecordQueue class
  • KAFKA-4135: Inconsistent javadoc for KafkaConsumer.poll behavior when there is no subscription
  • KAFKA-4153: Incorrect KStream-KStream join behavior with asymmetric time window
  • KAFKA-4158: Reset quota to default value if quota override is deleted for a given clientId
  • KAFKA-4160: Consumer onPartitionsRevoked should not be invoked while holding the coordinator lock
  • KAFKA-4162: Typo in Kafka Connect document
  • KAFKA-4163: NPE in StreamsMetadataState during re-balance operations
  • KAFKA-4172: Fix masked test error in KafkaConsumerTest.testSubscriptionChangesWithAutoCommitEnabled
  • KAFKA-4173: SchemaProjector should successfully project when source schema field is missing and target schema field is optional
  • KAFKA-4175: Can’t have StandbyTasks in KafkaStreams where NUM_STREAM_THREADS_CONFIG > 1
  • KAFKA-4176: Node stopped receiving heartbeat responses once another node started within the same group
  • KAFKA-4183: Logical converters in JsonConverter don’t properly handle null values
  • KAFKA-4193: FetcherTest fails intermittently
  • KAFKA-4197: Make ReassignPartitionsTest System Test move data
  • KAFKA-4200: Minor issue with throttle argument in kafka-reassign-partitions.sh
  • KAFKA-4214: kafka-reassign-partitions fails all the time when brokers are bounced during reassignment
  • KAFKA-4216: Replication Quotas: Control Leader & Follower Throttled Replicas Separately
  • KAFKA-4222: Transient failure in QueryableStateIntegrationTest.queryOnRebalance
  • KAFKA-4223: RocksDBStore should close all open iterators on close
  • KAFKA-4225: Replication Quotas: Control Leader & Follower Limit Separately
  • KAFKA-4227: AdminManager is not shutdown when KafkaServer is shutdown
  • KAFKA-4233: StateDirectory fails to create directory if any parent directory does not exist
  • KAFKA-4234: Consumer should not commit offsets in unsubscribe()
  • KAFKA-4235: Fix the closing order in Sender.initiateClose().
  • KAFKA-4241: StreamsConfig doesn’t pass through custom consumer and producer properties to ConsumerConfig and ProducerConfig
  • KAFKA-4248: Consumer can return data from old regex subscription in poll()
  • KAFKA-4251: Test driver not launching in Vagrant 1.8.6
  • KAFKA-4252: Missing ProducerRequestPurgatory
  • KAFKA-4253: Fix Kafka Stream thread shutting down process ordering
  • KAFKA-4257: Inconsistencies in 0.10.1 upgrade docs
  • KAFKA-4262: Intermittent unit test failure ReassignPartitionsClusterTest.shouldExecuteThrottledReassignment
  • KAFKA-4265: Intermittent test failure ReplicationQuotasTest.shouldBootstrapTwoBrokersWithFollowerThrottle
  • KAFKA-4267: Quota initialization for <user, clientId> uses incorrect ZK path
  • KAFKA-4274: KafkaConsumer.offsetsForTimes() hangs and times out on an empty map
  • KAFKA-4283: records deleted from CachingKeyValueStore still appear in range and all queries
  • KAFKA-4290: High CPU caused by timeout overflow in WorkerCoordinator
  • KAFKA-3590: KafkaConsumer fails with “Messages are rejected since there are fewer in-sync replicas than required.” when polling
  • KAFKA-3774: GetOffsetShell tool reports ‘Missing required argument “[time]”’ despite the default
  • KAFKA-4008: Module “tools” should not be dependent on “core”
Task
  • KAFKA-3163: KIP-33 - Add a time based log index to Kafka
  • KAFKA-3838: Bump zkclient and Zookeeper versions
  • KAFKA-4079: Document quota configuration changes from KIP-55
  • KAFKA-4148: KIP-79 add ListOffsetRequest v1 and search by timestamp interface to consumer.
  • KAFKA-4192: Update upgrade documentation for 0.10.1.0 to mention inter.broker.protocol.version
Wish
  • KAFKA-3905: Check for null in KafkaConsumer#{subscribe, assign}
Test
  • KAFKA-3374: Failure in security rolling upgrade phase 2 system test
  • KAFKA-3799: Turn on endpoint validation in SSL system tests
  • KAFKA-3863: Add system test for connector failure/restart
  • KAFKA-3985: Transient system test failure ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol
  • KAFKA-4055: Add system tests for secure quotas
  • KAFKA-4145: Avoid redundant integration testing in ProducerSendTests
  • KAFKA-4213: Add system tests for replication throttling (KIP-73)
Sub-task
  • KAFKA-2720: Periodic purging groups in the coordinator
  • KAFKA-2945: CreateTopic - protocol and server side implementation
  • KAFKA-2946: DeleteTopic - protocol and server side implementation
  • KAFKA-3290: WorkerSourceTask testCommit transient failure
  • KAFKA-3443: Support regex topics in addSource() and stream()
  • KAFKA-3576: Unify KStream and KTable API
  • KAFKA-3660: Log exception message in ControllerBrokerRequestBatch
  • KAFKA-3678: Fix stream integration test timeouts
  • KAFKA-3708: Rethink exception handling in KafkaStreams
  • KAFKA-3777: Extract the existing LRU cache out of RocksDBStore
  • KAFKA-3778: Avoiding using range queries of RocksDBWindowStore on KStream windowed aggregations
  • KAFKA-3780: Add new config cache.max.bytes.buffering to the streams configuration
  • KAFKA-3865: Transient failures in org.apache.kafka.connect.runtime.WorkerSourceTaskTest.testSlowTaskStart
  • KAFKA-3870: Expose state store names to DSL
  • KAFKA-3872: OOM while running Kafka Streams integration tests
  • KAFKA-3874: Transient test failure: org.apache.kafka.streams.integration.JoinIntegrationTest.shouldCountClicksPerRegion
  • KAFKA-3911: Enforce KTable materialization
  • KAFKA-3912: Query local state stores
  • KAFKA-3914: Global discovery of state stores
  • KAFKA-3926: Transient failures in org.apache.kafka.streams.integration.RegexSourceIntegrationTest
  • KAFKA-3973: Investigate feasibility of caching bytes vs. records
  • KAFKA-3974: LRU cache should store bytes/object and not records
  • KAFKA-4038: Transient failure in DeleteTopicsRequestTest.testErrorDeleteTopicRequests
  • KAFKA-4045: Investigate feasibility of hooking into RocksDb’s cache
  • KAFKA-4049: Transient failure in RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted
  • KAFKA-4058: Failure in org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterReset
  • KAFKA-4069: Forward records in context of cache flushing/eviction
  • KAFKA-4147: Transient failure in ConsumerCoordinatorTest.testAutoCommitDynamicAssignment
  • KAFKA-4167: Add cache metrics
  • KAFKA-4194: Add more tests for KIP-79

Confluent Platform 3.0.1 Release Notes

Confluent Platform 3.0.1 contains a number of bug fixes included in the Kafka 0.10.0.1 release. Details of the changes to Kafka in this patch release are found in the Kafka Release Notes. Details of the changes to other components of the Confluent Platform are listed in the respective changelogs such as Kafka REST Proxy changelog.

Highlights

Confluent Control Center

Control Center added performance improvements to reduce running overhead, including reducing the number of Kafka topics necessary and optimizing webclient fetches from the server. We also added the ability to delete a running connector and support for connecting to SSL/SASL secured Kafka clusters.

Kafka Streams
New Feature: Application Reset Tool

A common need when running stream processing applications in practice is to have an application reprocess its data from scratch. This may be required for a number of reasons, for example during development and testing, when addressing bugs in production, when doing A/B testing of algorithms and campaigns, or when giving demos to customers.

Previously, resetting an application was a manual task that was cumbersome and error-prone. Kafka now includes an Application Reset Tool for Kafka Streams through which you can quickly reset an application so that it will reprocess its data from scratch – think: an application “reset button”.

Apache Kafka 0.10.0.1

Here is a quick overview of the notable Kafka-related changes in the release:

New Feature
  • KAFKA-3538: Abstract the creation/retrieval of Producer for stream sinks for unit testing
Improvement
  • KAFKA-3479: Add new consumer metrics documentation
  • KAFKA-3667: Improve Section 7.2 Encryption and Authentication using SSL to include proper hostname verification configuration
  • KAFKA-3683: Add file descriptor recommendation to ops guide
  • KAFKA-3699: Update protocol page on website to explain how KIP-35 should be used
  • KAFKA-3725: Update documentation with regards to XFS
  • KAFKA-3747: Close RecordBatch.records when append to batch fails
  • KAFKA-3785: Fetcher spending unnecessary time during metrics recording
  • KAFKA-3836: RocksDBStore.get() should not pass nulls to Deserializers
  • KAFKA-3880: Disallow Join Windows with size zero
  • KAFKA-3922: Add a copy-constructor to AbstractStream
  • KAFKA-4034: Consumer need not lookup coordinator when using manual assignment
Bug
  • KAFKA-3185: Allow users to cleanup internal data
  • KAFKA-3258: BrokerTopicMetrics of deleted topics are never deleted
  • KAFKA-3393: Update site docs and javadoc based on max.block.ms changes
  • KAFKA-3500: KafkaOffsetBackingStore set method needs to handle null
  • KAFKA-3718: propagate all KafkaConfig __consumer_offsets configs to OffsetConfig instantiation
  • KAFKA-3728: EndToEndAuthorizationTest offsets_topic misconfigured
  • KAFKA-3781: Errors.exceptionName() can throw NPE
  • KAFKA-3782: Transient failure with kafkatest.tests.connect.connect_distributed_test.ConnectDistributedTest.test_bounce.clean=True
  • KAFKA-3783: Race condition on last ACL removal for a resource fails with a ZkBadVersionException
  • KAFKA-3784: TimeWindows#windowsFor misidentifies some windows if TimeWindows#advanceBy is used
  • KAFKA-3787: Preserve message timestamp in mirror maker
  • KAFKA-3789: Upgrade Snappy to fix snappy decompression errors
  • KAFKA-3802: log mtimes reset on broker restart
  • KAFKA-3805: Running multiple instances of a Streams app on the same machine results in Java_org_rocksdb_RocksDB_write error
  • KAFKA-3817: KTableRepartitionMap should handle null inputs
  • KAFKA-3830: getTGT() debug logging exposes confidential information
  • KAFKA-3850: WorkerSinkTask should retry commits if woken up during rebalance or shutdown
  • KAFKA-3851: Add references to important installation/upgrade notes to release notes
  • KAFKA-3852: Clarify how to handle message format upgrade without killing performance
  • KAFKA-3854: Subsequent regex subscription calls fail
  • KAFKA-3855: Guard race conditions in TopologyBuilder
  • KAFKA-3864: Kafka Connect Struct.get returning defaultValue from Struct not the field schema
  • KAFKA-3879: KafkaConsumer with auto commit enabled gets stuck when killed after broker is dead
  • KAFKA-3887: StreamBounceTest.test_bounce and StreamSmokeTest.test_streams failing
  • KAFKA-3890: Kafka Streams: task assignment is not maintained on cluster restart or rolling restart
  • KAFKA-3898: KStream.leftJoin(...) is missing a Serde for thisVal. This can cause failures after mapValues etc
  • KAFKA-3902: Optimize KTable.filter() to reduce unnecessary traffic
  • KAFKA-3915: LogCleaner IO buffers do not account for potential size difference due to message format change
  • KAFKA-3924: Data loss due to halting when LEO is larger than leader's LEO
  • KAFKA-3933: Kafka OOM During Log Recovery Due to Leaked Native Memory
  • KAFKA-3935: ConnectDistributedTest.test_restart_failed_task.connector_type=sink system test failing
  • KAFKA-3941: Avoid applying eviction listener in InMemoryKeyValueLoggedStore
  • KAFKA-3950: kafka mirror maker tool is not respecting whitelist option
  • KAFKA-3952: VerifyConsumerRebalance cannot succeed when checking partition owner
  • KAFKA-3960: Committed offset not set after first assign
  • KAFKA-3977: KafkaConsumer swallows exceptions raised from message deserializers
  • KAFKA-3983: It would be helpful if SocketServer's Acceptors logged both the SocketChannel port and the processor ID upon registra
  • KAFKA-3996: ByteBufferMessageSet.writeTo() should be non-blocking
  • KAFKA-4008: Module "tools" should not be dependent on "core"
  • KAFKA-4018: Streams causing older slf4j-log4j library to be packaged along with newer version
  • KAFKA-4073: MirrorMaker should handle mirroring messages w/o timestamp better
  • PR-1735: Add application id prefix for copartitionGroups in TopologyBuilder
Test
  • KAFKA-3863: Add system test for connector failure/restart
Sub-task
  • KAFKA-3660: Log exception message in ControllerBrokerRequestBatch
  • KAFKA-3865: Transient failures in org.apache.kafka.connect.runtime.WorkerSourceTaskTest.testSlowTaskStart
  • KAFKA-3931: kafka.api.PlaintextConsumerTest.testPatternUnsubscription transient failure

Confluent Platform 3.0.0 Release Notes

This is a major release of the Confluent Platform that provides Confluent users with Apache Kafka 0.10.0.0, the latest stable version of Kafka. In addition, this release includes the new Confluent Control Center application as well as the new Kafka Streams library that ships with Apache Kafka 0.10.0.0.

Confluent Platform users are encouraged to upgrade to CP 3.0.0 as it includes both new major functionality as well as important bug fixes. The technical details of this release are summarized below.

Highlights

We have added several new features to the Confluent Platform in CP 3.0, to provide a more complete, easier-to-use, and higher-performance Stream Processing Platform:

Kafka Streams

We’re very excited to introduce Kafka Streams. Kafka Streams is included in Apache Kafka 0.10.0.0. Kafka Streams is a library that turns Apache Kafka into a full-featured, modern stream processing system. Kafka Streams includes a high-level language for describing common stream operations (such as joining, filtering, and aggregating records), allowing developers to quickly develop powerful streaming applications. Kafka Streams applications can easily be deployed on many different systems: they can run on YARN, be deployed on Mesos, run in Docker containers, or simply be embedded into existing Java applications.

Further information is available in the Kafka Streams documentation. If you want to give it a quick spin, head straight to the Kafka Streams Quickstart.

Confluent Control Center

Control Center is a web-based management and monitoring tool for Apache Kafka. In version 3.0.0, Control Center allows you to configure, edit, and manage connectors in Kafka Connect. It also includes Stream Monitoring: a system for measuring and monitoring your data streams end to end, from producer to consumer. To get started with Control Center, see Installation.

A term license for Confluent Control Center is available for Confluent Platform Enterprise Subscribers, but any user may download and try Confluent Control Center for free for 30 days.

Apache Kafka 0.10.0.0

Apache Kafka 0.10.0.0 is a major release of Apache Kafka and includes a number of new features and enhancements. Highlights include:

  • Kafka Streams. As described above, Kafka Streams adds a simple but powerful streaming library to Apache Kafka.
  • Relative Offsets in Compressed Messages. In older versions of Kafka, recompression occurred when a broker received a batch of messages from the producer. In 0.10.0.0, we have changed from using absolute offsets to relative offsets to avoid the recompression, reducing latency and reducing load on Kafka brokers. KAFKA-2511
  • Rack Awareness. Kafka can now run with a rack awareness feature that isolates replicas so they are guaranteed to span multiple racks or availability zones. This allows all of Kafka’s durability guarantees to be applied to these larger architectural units, significantly increasing availability. KAFKA-1215
  • Timestamps in Messages. Messages are now tagged with timestamps at the time they are produced, enabling a number of future features, including looking up messages by time and measuring timing. KAFKA-2511
  • Kafka Consumer Max Records. In 0.9.0.0, developers had little control over the number of messages returned when calling poll() for the new consumer. This feature introduces a new parameter max.poll.records that allows developers to limit the number of messages returned. KAFKA-3007
  • Client-Side Interceptors. We have introduced a new plugin architecture that allows developers to easily add “plugins” to Kafka clients. This allows developers to easily deploy additional code to inspect or modify Kafka messages. KAFKA-3162
  • Standardize Client Sequences. This feature changed the arguments to some methods in the new consumer to work more consistently with Java Collections. KAFKA-3006
  • List Connectors REST API. You can now query a distributed Kafka Connect cluster to discover the available connector classes. KAFKA-3316
  • Admin API changes. Some changes were made in the metadata request/response, improving performance in some situations. KAFKA-1694
  • Protocol Version Improvements. Kafka brokers now support a request that returns all supported protocol API versions. (This will make it easier for future Kafka clients to support multiple broker versions with a single client.) KAFKA-3307
  • SASL Improvements. Kafka 0.9.0.0 introduced new security features to Kafka, including support for Kerberos through SASL. In 0.10.0.0, Kafka now includes support for more SASL features, including external authentication servers, supporting multiple types of SASL authentication on one server, and other improvements. KAFKA-3149
  • Connect Status/Control APIs. In Kafka 0.10.0.0, we have continued to improve Kafka Connect. Previously, users had to monitor logs to view the status of connectors and their tasks, but we now support a status API for easier monitoring. We’ve also added control APIs, which allow you to pause a connector’s message processing in order to perform maintenance, and to manually restart tasks which have failed. KAFKA-3093, KAFKA-2370, KAFKA-3506
  • Allow cross origin HTTP requests on all HTTP methods. In Kafka 0.9.0.0, Kafka Connect only supported requests from the same domain; this enhancement removes that restriction. KAFKA-3578
  • Kafka LZ4 framing. Kafka’s implementation of LZ4 did not follow the standard LZ4 specification, creating problems for third party clients that wanted to leverage existing libraries. Kafka now conforms to the standard. KAFKA-3160
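
To make the relative-offsets item above more concrete, here is a simplified model of the idea, not Kafka's actual code. It assumes that the producer numbers the messages inside a compressed batch 0 to n-1, and that the broker stamps only the outer wrapper with the absolute offset of the last inner message, so the compressed payload never has to be rewritten. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CompressedBatch:
    wrapper_offset: int          # set by the broker: absolute offset of the LAST inner message
    relative_offsets: List[int]  # set by the producer: 0, 1, ..., n-1


def producer_build_batch(message_count):
    """Producer side: number the inner messages 0..n-1 before compressing."""
    return CompressedBatch(wrapper_offset=-1,
                           relative_offsets=list(range(message_count)))


def broker_append(batch, log_end_offset):
    """Broker side: stamp only the wrapper; the inner messages stay untouched.

    Returns the new log end offset.
    """
    batch.wrapper_offset = log_end_offset + len(batch.relative_offsets) - 1
    return batch.wrapper_offset + 1


def consumer_absolute_offsets(batch):
    """Consumer side: recover absolute offsets from the wrapper offset."""
    base = batch.wrapper_offset - batch.relative_offsets[-1]
    return [base + relative for relative in batch.relative_offsets]


if __name__ == "__main__":
    batch = producer_build_batch(3)
    new_log_end = broker_append(batch, log_end_offset=100)
    print(new_log_end, consumer_absolute_offsets(batch))
```

In this model the broker performs a constant amount of work per batch regardless of how many messages it contains, which is the source of the latency and load reduction described in the list.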

Note

Upgrading a Kafka Connect cluster running in distributed mode from 0.9 versions of Kafka to 0.10 versions requires making a configuration change before the upgrade. See the Kafka Connect Upgrade Notes for more details.

For a complete list of features added and bugs fixed, see the Apache Kafka Release Notes.

Deprecating Camus

Camus in Confluent Platform is deprecated in Confluent Platform 3.0 and may be removed in a release after Confluent Platform 3.1. To export data from Kafka to HDFS and Hive, we recommend Kafka Connect with the Confluent HDFS connector as an alternative.

Other Notable Changes

We have also added some additional features to the Confluent Platform in CP 3.0:

  • Preview release of Python Client. We’re introducing a fully supported, up-to-date client for Python. Over time, we will keep this client up to date with the latest Java clients, including support for new broker versions and Kafka features. Try it out and send us feedback through the Confluent Platform Mailing List.
  • Security for Schema Registry. The Schema Registry now supports SSL both at its REST layer (via HTTPS) and in its communication with Kafka. The REST layer is the public, user-facing component, and the “communication with Kafka” is the backend communication with Kafka where schemas are stored.
  • Security for Kafka REST Proxy. The REST Proxy now supports REST calls over HTTPS. The REST Proxy does not currently support Kafka security.
  • We’ve removed the “beta” designation from the new Java consumer and encourage users to begin migration away from the old consumers (note that it is required to make use of Kafka security extensions).
  • The old Scala producer has been deprecated. Users should migrate to the Java producer as soon as possible.

Confluent Platform 2.0.1 Release Notes

Confluent Platform 2.0.1 contains a number of bug fixes included in the Kafka 0.9.0.1 release. Details of the changes to Kafka in this patch release are found in the Kafka Release Notes. Details of the changes to other components of the Confluent Platform are listed in the respective changelogs, such as the Kafka REST Proxy changelog.

Here is a quick overview of the notable Kafka-related bug fixes in the release, grouped by the affected functionality:

New Java consumer

  • KAFKA-2978: Topic partition is not sometimes consumed after rebalancing of consumer group
  • KAFKA-3179: Kafka consumer delivers message whose offset is earlier than sought offset.
  • KAFKA-3157: Mirror maker doesn’t commit offset with new consumer when there is no more messages
  • KAFKA-3170: Default value of fetch_min_bytes in new consumer is 1024 while doc says it is 1

Compatibility

  • KAFKA-2695: Handle null string/bytes protocol primitives
  • KAFKA-3100: Broker.createBroker should work if json is version > 2, but still compatible
  • KAFKA-3012: Avoid reserved.broker.max.id collisions on upgrade

Security

  • KAFKA-3198: Ticket Renewal Thread exits prematurely due to inverted comparison
  • KAFKA-3152: kafka-acl doesn’t allow space in principal name
  • KAFKA-3169: Kafka broker throws OutOfMemory error with invalid SASL packet
  • KAFKA-2878: Kafka broker throws OutOfMemory exception with invalid join group request
  • KAFKA-3166: Disable SSL client authentication for SASL_SSL security protocol

Performance/memory usage

  • KAFKA-3003: The fetch.wait.max.ms is not honored when new log segment rolled for low volume topics
  • KAFKA-3159: Kafka consumer 0.9.0.0 client poll is very CPU intensive under certain conditions
  • KAFKA-2988: Change default configuration of the log cleaner
  • KAFKA-2973: Fix leak of child sensors on remove

Topic deletion

  • KAFKA-2937: Topics marked for delete in Zookeeper may become undeletable

Confluent Platform 2.0.0 Release Notes

The CP 2.0.0 release includes a range of new features over the previous release CP 1.0.x.

Security

This release includes three key security features built directly within Kafka itself. First, we now authenticate users using either Kerberos or TLS client certificates, so we now know who is making each request to Kafka. Second, we have added a Unix-like permissions system (ACLs) to control which users can access which data. Third, we support encryption on the wire using TLS to protect sensitive data on an untrusted network.

For more information on security features and how to enable them, see Kafka Security.

Kafka Connect

Kafka Connect facilitates large-scale, real-time data import and export for Kafka. It abstracts away common problems that each such data integration tool needs to solve to be viable for 24x7 production environments: fault tolerance, partitioning, offset management and delivery semantics, operations, and monitoring. It offers the capability to run a pool of processes that host a large number of Kafka connectors while handling load balancing and fault tolerance.

Confluent Platform includes a file connector for importing data from text files or exporting to text files, a JDBC connector for importing data from relational databases, and an HDFS connector for exporting data to HDFS / Hive in Avro and Parquet formats.

To learn more about Kafka Connect and the available connectors, see Kafka Connect.

User Defined Quotas

Confluent Platform 2.0 and Kafka 0.9 now support user-defined quotas. Users have the ability to enforce quotas on a per-client basis. Producer-side quotas are defined in terms of bytes written per second per client id while consumer quotas are defined in terms of bytes read per second per client id.

Learn more about user defined quotas in the Enforcing Client Quotas section of the post-deployment documentation.
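
The arithmetic behind a bytes-per-second quota can be sketched as follows. This is a simplified illustration and not the broker's actual throttling logic: it assumes a single measurement window per client id and computes how long a client would have to be delayed for its average rate to fall back to the quota. The function name and the sample numbers are hypothetical.

```python
def throttle_delay_seconds(bytes_in_window, window_seconds, quota_bytes_per_second):
    """Return the delay needed so that bytes / (window + delay) equals the quota.

    Returns 0.0 when the client is at or under its quota.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if quota_bytes_per_second <= 0:
        raise ValueError("quota_bytes_per_second must be positive")
    observed_rate = bytes_in_window / window_seconds
    if observed_rate <= quota_bytes_per_second:
        return 0.0
    return bytes_in_window / quota_bytes_per_second - window_seconds


if __name__ == "__main__":
    # Bytes written in the last second, keyed by client id (sample data).
    written = {"billing-producer": 2_000_000, "audit-producer": 500_000}
    for client_id, byte_count in written.items():
        delay = throttle_delay_seconds(byte_count, 1.0, 1_000_000)
        print(client_id, delay)
```

With a quota of 1,000,000 bytes per second, a client that wrote 2,000,000 bytes in one second would be delayed by one second in this model, while a client that wrote 500,000 bytes would not be delayed at all.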

New Consumer

This release introduces beta support for the newly redesigned consumer client. At a high level, the primary difference in the new consumer is that it removes the distinction between the “high-level” ZooKeeper-based consumer and the “low-level” SimpleConsumer APIs, and instead offers a unified consumer API.

The new consumer allows the use of the group management facility (like the older high-level consumer) while still offering better control over offset commits at the partition level (like the older low-level consumer). It offers pluggable partition assignment amongst the members of a consumer group and ships with several assignment strategies. This completes a series of projects done in the last few years to fully decouple Kafka clients from ZooKeeper, thus entirely removing the consumer client’s dependency on ZooKeeper.

To learn how to use the new consumer, refer to the Kafka Consumers documentation or the API docs.

New Client Library - librdkafka

In this release of the Confluent Platform we are packaging librdkafka. librdkafka is a C/C++ library implementation of the Apache Kafka protocol, containing both Producer and Consumer support. It was designed with message delivery reliability and high performance in mind; current figures exceed 800,000 msgs/second for the producer and 3 million msgs/second for the consumer.

You can learn how to use librdkafka side-by-side with the Java clients in our Kafka Clients documentation.

Proactive Support

Proactive Support is a component of the Confluent Platform that improves Confluent’s support for the platform by collecting and reporting support metrics (“Metrics”) to Confluent. Proactive Support is enabled by default in the Confluent Platform. We do this to provide proactive support to our customers, to help us build better products, to help customers comply with support contracts, and to help guide our marketing efforts. With Metrics enabled, a Kafka broker is configured to collect and report certain broker and cluster metadata (“Metadata”) every 24 hours about your use of the Confluent Platform (including without limitation, your remote internet protocol address) to Confluent, Inc. (“Confluent”) or its parent, subsidiaries, affiliates or service providers.

Proactive Support is enabled by default in the Confluent Platform, but you can disable it by following the instructions in the Proactive Support documentation. Please refer to the Confluent Privacy Policy for an in-depth description of how Confluent processes such information.

Compatibility Notes

  • Kafka 0.9 no longer supports Java 6 or Scala 2.9. If you are still on Java 6, consider upgrading to a supported version.
  • Configuration parameter replica.lag.max.messages was removed. Partition leaders will no longer consider the number of lagging messages when deciding which replicas are in sync.
  • Configuration parameter replica.lag.time.max.ms now refers not just to the time passed since last fetch request from replica, but also to time since the replica last caught up. Replicas that are still fetching messages from leaders but did not catch up to the latest messages in replica.lag.time.max.ms will be considered out of sync.
  • MirrorMaker no longer supports multiple target clusters. As a result it will only accept a single --consumer.config parameter. To mirror multiple source clusters, you will need at least one MirrorMaker instance per source cluster, each with its own consumer configuration.
  • Broker IDs above 1000 are now reserved by default for automatically assigned broker IDs. If your cluster has existing broker IDs above that threshold, make sure to increase the reserved.broker.max.id broker configuration property accordingly.
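
The changed meaning of replica.lag.time.max.ms described in the list above can be illustrated with a small sketch. It is a simplified model rather than the broker's implementation: it assumes the leader tracks, for each replica, the last time that replica was fully caught up, and treats the replica as in sync only if that time is recent enough. The function name, the replica ids, the timestamps, and the choice of an inclusive boundary are all assumptions made for this example.

```python
def in_sync_replicas(now_ms, last_caught_up_ms, replica_lag_time_max_ms):
    """Return the sorted ids of replicas that caught up recently enough.

    last_caught_up_ms maps replica id -> time (ms) the replica last reached
    the leader's log end. A replica that keeps fetching but never catches up
    does not refresh this timestamp, so it eventually drops out.
    """
    return sorted(
        replica_id
        for replica_id, caught_up_ms in last_caught_up_ms.items()
        if now_ms - caught_up_ms <= replica_lag_time_max_ms
    )


if __name__ == "__main__":
    # Sample data: replica 2 is still fetching but last caught up 15 s ago.
    last_caught_up = {1: 19_000, 2: 5_000, 3: 12_000}
    print(in_sync_replicas(20_000, last_caught_up, 10_000))
```

With a maximum lag time of 10,000 ms in this sample, replicas 1 and 3 remain in sync while replica 2 is considered out of sync, even though it has not stopped fetching.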

How to Download

CP 3.2.0 is available for download at http://www.confluent.io/developer. See the Installation section for detailed information.

To upgrade Confluent Platform to 3.2.0, check the Upgrade documentation.

Questions?

If you have questions regarding this release, feel free to reach out via the Confluent Platform mailing list. Confluent customers are encouraged to contact our support directly.