Kafka Monthly Digest – December 2020 and 2020 in review
See what's going on in the Kafka community this month
In this 35th edition of the Kafka Monthly Digest, I’ll cover what happened in the Apache Kafka community in December 2020 as well as some of the milestones that the project and community reached in 2020.
For last month’s digest, see Kafka Monthly Digest: November 2020.
After seven release candidates, Bill Bejeck released Apache Kafka 2.7.0 on December 21. A post was published on the Apache blog, and as always you can find the full list of changes in the release notes or in the release plan on the wiki. This new minor version brings a number of interesting features.
- PEM files can now be used to configure TLS certificates and private keys. (KIP-651)
- Quotas can be applied on topic operations such as create, delete and alter. (KIP-599)
- Quotas can be applied to the connection rate on brokers. (KIP-612)
- Progress on the migration off ZooKeeper (KIP-500):
- A new ConfigProvider, DirectoryConfigProvider, can be used to retrieve secrets in Kubernetes environments. (KIP-632)
- Consumer offsets can now be automatically synchronized across clusters with MirrorMaker 2. (KIP-545)
- New metrics to track end-to-end latency and RocksDB status. (KIP-613, KIP-607)
- Added support for sliding windows in the DSL. (KIP-450)
- State stores can now be iterated backwards. (KIP-617)
Last month, the community submitted 13 KIPs (KIP-689 to KIP-701), and these are the ones that caught my eye.
KIP-690: Add additional configuration to control MirrorMaker 2 internal topics naming convention. MirrorMaker 2 uses a few internal topics for its operations but at the moment their names can’t be customized. This KIP proposes introducing an interface, InternalTopicPolicy, that would allow administrators to configure the names of these internal topics. This is interesting in environments with strict naming policies or when isolating multiple MirrorMaker 2 instances is required.
KIP-694: Support Reducing Partitions for Topics. While Kafka supports adding partitions to topics, it is currently not possible to remove partitions. This KIP proposes a new protocol message, DeletePartitions, to support this new feature. For simplicity, it assumes topics have short retention periods and it does not transparently handle topics with keyed messages.
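To see why keyed messages are the hard case for KIP-694, here is a minimal sketch of hash-modulo partitioning. It assumes a stand-in hash (`zlib.crc32`); Kafka's default partitioner uses murmur2 for keyed records, and the function names below are hypothetical, chosen only for this illustration. The point is that changing the partition count changes the modulo, so many keys start landing on a different partition than their older records.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition with a hash-modulo scheme.

    zlib.crc32 is a stand-in for this sketch; Kafka's default partitioner
    uses a murmur2 hash, but the modulo step is what matters here.
    """
    return zlib.crc32(key) % num_partitions


def moved_keys(keys, before: int, after: int):
    """Return the keys whose partition changes when the partition count changes."""
    return [k for k in keys if partition_for(k, before) != partition_for(k, after)]


if __name__ == "__main__":
    keys = [f"user-{i}".encode() for i in range(1000)]
    moved = moved_keys(keys, before=4, after=3)
    print(f"{len(moved)} of {len(keys)} keys map to a different partition")
```

Any consumer that relies on per-key ordering would see records for the same key split across two partitions around the change, which is presumably why the KIP leaves that case out of scope.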
KIP-700: Add Describe Cluster API. Kafka has APIs to retrieve many details about clusters. Internally, most of this data has been added to the Metadata protocol message. However, Metadata is designed for producers and consumers to periodically refresh their metadata, so it’s not desirable to keep adding fields into this message. For that reason, this KIP introduces a new protocol message, DescribeCluster, to contain all cluster details. Fields not relevant for producers and consumers will also be deprecated and removed from Metadata.
In this section, I will cover releases of some community projects. This only includes projects that are Open Source.
- node-rdkafka 2.10.0. This new release of node-rdkafka is now based on librdkafka 1.5.2. It also added support for librdkafka’s EOF events to identify when consumers reach the last offset of partitions.
Milestones that the project and community achieved in 2020
Releases in 2020
KIPs in 2020
In the past 12 months, the community raised 142 KIPs. This is in line with previous years.
Code and contributors
Over 200 unique contributors made more than 1400 commits in 2020.
These stats were generated using this command:
git diff --shortstat $(git hash-object -t tree /dev/null)
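The command above diffs the empty tree against the working tree, so it reports the number of files and inserted lines rather than commits or contributors. Commit and contributor counts like the ones quoted in this section could be obtained along the following lines. This is only a sketch, not necessarily how the figures above were produced: the helper names are hypothetical, merges are excluded by assumption, and contributors are grouped by author name as `git shortlog -sn` does.

```python
import subprocess


def parse_shortlog(text: str):
    """Parse `git shortlog -sn` output into (contributors, commits).

    Each non-empty line looks like "<count>\t<author name>".
    """
    counts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        count, _, _name = line.partition("\t")
        counts.append(int(count))
    return len(counts), sum(counts)


def year_stats(repo: str, year: int):
    """Run git in `repo` and return (contributors, commits) for one calendar year."""
    result = subprocess.run(
        [
            "git", "-C", repo, "shortlog", "-sn", "--no-merges",
            f"--since={year}-01-01 00:00:00",
            f"--until={year + 1}-01-01 00:00:00",
            "HEAD",  # explicit revision so shortlog does not read from stdin
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_shortlog(result.stdout)
```

Calling `year_stats("path/to/kafka", 2020)` on a local clone would return the two counts. The exact numbers depend on choices such as whether merges are included and whether the same person commits under several names.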
Committers and PMC
In 2020, six contributors were invited to become Committers:
- Konstantine Karantasis
- Boyang Chen
- Xi Hu
- David Jacot
- Chia-Ping Tsai
- Sophie Blee-Goldman
Likewise, five Committers joined the Apache Kafka PMC:
- Colin McCabe
- Vahid Hashemian
- Manikumar Reddy
- Mickael Maison (I still cannot quite believe that I’ve joined the PMC but that happened!)
- John Roesler
Get started with Kafka
IBM Event Streams for Cloud is Apache Kafka-as-a-Service for IBM Cloud. Get started with IBM Event Streams today.