In this post, I will cover what happened in the Kafka community in July 2018.
As hinted in last month’s digest, 5 Kafka versions were released in July! First, let’s cover the new major release:
– Kafka 2.0.0:
It was released on the 30th of July. It required 4 release candidates, so it landed a bit later than the release plan anticipated. Nevertheless, it is packed with exciting new features:
– Improved ACL support: ACLs can now contain wildcards, simplifying authorization management in large clusters. ACLs for topic creation can now be granted per topic or for topics with a prefix.
– OAuth2 bearer token support: Kafka now supports authentication using the OAUTHBEARER SASL mechanism.
– Extended dynamic broker configuration: SSL Truststores and Keystores can now be updated at runtime.
– Updated replication protocol: The new replication protocol prevents log divergence between leader and follower during fast leader failover. Also, message down-conversion (which happens when an old client connects to a newer broker) now uses less memory.
– Updated Quota handling: Clients are now notified before throttling is applied. This allows clients to distinguish between network errors and long throttle times.
– Improved Consumer APIs: All Consumer APIs can now accept a timeout, preventing the Consumer from blocking forever.
– Removed Java 7 support and deprecated the old Scala clients.
– Streams improvements: A ton of new Streams content, including support for headers, better windowed aggregation performance and a Scala wrapper API for the Kafka Streams DSL.
– Connect improvements: Secrets like passwords and keys can now be stored outside of the (plaintext) Connector configuration files. The error handling in Connectors, Transformations and Converters has also been improved to allow automatic retries and better logging in case of fatal issues.
– over 120 bugs fixed!
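To make the ACL change above more concrete, here is a minimal sketch of how a prefixed or wildcard resource pattern could be matched against a topic name. It is plain Java with no Kafka dependency, and every name in it (AclPatternSketch, ResourcePattern, PatternType, isAllowed) is hypothetical and chosen for illustration only; it is not the broker's authorizer code.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; these are not Kafka's own classes. */
final class AclPatternSketch {

    /** How the name of a resource pattern is interpreted. */
    enum PatternType { LITERAL, PREFIXED }

    /** A resource pattern: a name plus the way it is compared to a topic. */
    static final class ResourcePattern {
        final String name;
        final PatternType type;

        ResourcePattern(String name, PatternType type) {
            this.name = name;
            this.type = type;
        }

        /** Returns true if this pattern covers the given topic name. */
        boolean matches(String topic) {
            if (type == PatternType.PREFIXED) {
                return topic.startsWith(name);
            }
            // Assumption: a literal "*" means "every topic"; otherwise exact match.
            return name.equals("*") || name.equals(topic);
        }
    }

    /** A topic is allowed if at least one pattern covers it. */
    static boolean isAllowed(List<ResourcePattern> patterns, String topic) {
        for (ResourcePattern pattern : patterns) {
            if (pattern.matches(topic)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<ResourcePattern> patterns = Arrays.asList(
                new ResourcePattern("orders-", PatternType.PREFIXED),
                new ResourcePattern("audit", PatternType.LITERAL));
        for (String topic : Arrays.asList("orders-eu", "audit", "payments")) {
            System.out.println(topic + " allowed: " + isAllowed(patterns, topic));
        }
    }
}
```

The point of the sketch is that one prefixed rule can stand in for what previously needed one rule per topic, which is why this helps in large clusters.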
Many thanks to Rajini Sivaram for managing this release. As always, the full release notes are available on the Apache website.
– Bugfix releases:
There were also 4 bugfix versions released. You should upgrade to one of these, as each contains over 30 fixes, including a handful of blockers.
Thanks to Dong Lin for managing the 1.1.1 release and to Matthias J. Sax for managing the other 3 releases!
Once again, the community has been very active. 17 KIPs (KIP-331 to KIP-348, with 337 skipped) have been submitted since last month. These are the ones that caught my eye:
KIP-332: Update AclCommand to use AdminClient API
Now that ACLs can be managed using the Admin APIs, this KIP proposes updating the AclCommand tool to use the AdminClient instead of requiring direct access to ZooKeeper.
KIP-341: Update Sticky Assignor’s User Data Protocol
The goal of this KIP is to update the Sticky Assignor data protocol to fix assignment issues that can arise if a consumer leaves and re-joins a group in a very short interval.
KIP-345: Reduce multiple consumer rebalances by specifying member id
Group rebalances happen when the state of a consumer group changes, whether it’s a new consumer joining, one leaving or a subscription changing. In large consumer groups, rebalances can take a long time to complete, during which the group is fully stalled and not consuming anything. The goal of this KIP is to optimize the rebalancing process in order to reduce both the number of rebalances and their duration.
KIP-346: Improve LogCleaner behavior on error
Currently, if the LogCleaner hits an exception while processing logs, it stops and is not restarted automatically. Because logs are then no longer being cleaned, in some environments this can quickly lead to disk exhaustion, rendering brokers unusable. The proposal is to mark partitions that fail cleaning as “uncleanable” and keep the cleaner working on the other partitions. In addition, this KIP introduces metrics to track “uncleanable” partitions.
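The idea behind this proposal can be sketched in a few lines of plain Java: catch the failure per partition, remember the partition as uncleanable, and carry on with the rest. This is only an illustration of the behavior described above, not the broker's LogCleaner; the class CleanerSketch, its methods and the partition names are all hypothetical, and the counter merely stands in for the metrics the KIP introduces.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical illustration only; this is not the broker's LogCleaner. */
final class CleanerSketch {
    // Partitions whose cleaning failed at least once; skipped on later passes.
    private final Set<String> uncleanable = new LinkedHashSet<>();
    // Stand-in for the real cleaning work; it may throw for a bad partition.
    private final Consumer<String> cleanFn;

    CleanerSketch(Consumer<String> cleanFn) {
        this.cleanFn = cleanFn;
    }

    /** Runs one cleaning pass and returns how many partitions were cleaned. */
    int cleanOnce(List<String> partitions) {
        int cleaned = 0;
        for (String partition : partitions) {
            if (uncleanable.contains(partition)) {
                continue; // already marked, do not retry
            }
            try {
                cleanFn.accept(partition);
                cleaned++;
            } catch (RuntimeException e) {
                // Mark the partition instead of letting the whole cleaner die.
                uncleanable.add(partition);
            }
        }
        return cleaned;
    }

    /** Stand-in for an "uncleanable partitions" metric. */
    int uncleanableCount() {
        return uncleanable.size();
    }

    Set<String> uncleanablePartitions() {
        return Collections.unmodifiableSet(uncleanable);
    }

    public static void main(String[] args) {
        CleanerSketch cleaner = new CleanerSketch(partition -> {
            if (partition.equals("topic-1")) {
                throw new IllegalStateException("simulated corrupt segment");
            }
        });
        int cleaned = cleaner.cleanOnce(Arrays.asList("topic-0", "topic-1", "topic-2"));
        System.out.println("cleaned=" + cleaned
                + " uncleanable=" + cleaner.uncleanablePartitions());
    }
}
```

In this sketch a single bad partition no longer stops cleaning for the healthy ones, and the uncleanable count gives operators something to alert on.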
IBM Message Hub is Apache Kafka as a service for IBM Cloud. You can get started at https://console.bluemix.net/docs/services/MessageHub/index.html#messagehub.