Thursday, April 20, 2023

Zio-kafka hacking day

Not long ago I contacted Steven (committer of the zio-kafka library) to get a better understanding of how the library works. On April 12, not more than 2 months later, I had become a committer myself and was sitting in a room together with Steven, Jules Ivanic (another committer) and wildcard Pierangelo Cecchetto (contributor), hacking on zio-kafka.

The meeting was the idea of Jules, who was ‘in the neighborhood’: he was traveling from Australia for his company (Conduktor). We were able to get a nice room in the Amsterdam office of my employer (Adevinta). Amsterdam turned out to be a nice middle ground for Steven, me and Pierangelo. (Special thanks to Go Data Driven, who also had space for us.)

In the morning we spoke about current and new ideas on how to improve the library. We also shared detailed knowledge of ZIO and of what Kafka expects from its users. After lunch we started hacking. Having someone at hand to start an ad hoc discussion with turned out to be very productive; we were able to move some tough issues forward.

Here are some highlights.

PR #788 (Wait for stream end in rebalance listener) is important to prevent duplicates during a rebalance. This PR had been mostly finished for quite some time, but many details made the extensive test suite fail. We were able to solve many of these issues.

In the area of performance we implemented an idea to replace buffering (pre-fetching a fixed number of polls) with pre-fetching based on the stream’s queue size. This resulted in PR #803 (Alternative backpressure mechanism).
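
To make the queue-size idea a bit more concrete, here is a minimal sketch in plain Java. It is not the code from the PR and not zio-kafka's API: the class name, the method name and the threshold of 1024 records are all made up for illustration. The only thing it shows is the decision taken before each poll: a partition is fetched from only while its stream's queue holds fewer records than the threshold.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, NOT zio-kafka code: pick the partitions to fetch from
// based on how many records are still queued for each partition's stream.
public class QueueSizeBackpressure {

    // Assumed threshold (illustrative value): stop pre-fetching for a
    // partition once this many records are waiting in its queue.
    static final int MAX_QUEUED_RECORDS = 1024;

    /** Returns the partitions whose queue still has room and should be resumed. */
    static Set<String> partitionsToResume(Map<String, Integer> queuedRecordsPerPartition) {
        Set<String> resume = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : queuedRecordsPerPartition.entrySet()) {
            if (entry.getValue() < MAX_QUEUED_RECORDS) {
                resume.add(entry.getKey());
            }
        }
        return resume;
    }

    public static void main(String[] args) {
        // Made-up queue sizes for three partitions of a made-up topic.
        Map<String, Integer> queued = new LinkedHashMap<>();
        queued.put("orders-0", 0);     // consumer keeps up: keep fetching
        queued.put("orders-1", 5000);  // consumer is behind: hold off
        queued.put("orders-2", 300);   // some room left: keep fetching
        System.out.println(partitionsToResume(queued)); // prints [orders-0, orders-2]
    }
}
```

With the Java Kafka client, the result of such a decision would typically be applied through the consumer's `pause` and `resume` methods before the next `poll`, so that a slow stream only holds back its own partition.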

We also sowed the seeds for another performance improvement: PR #908 (Optimistically resume partitions early).

These last two PRs showed great performance improvements, bringing us much closer to the performance of using the Java Kafka client directly. All 3 PRs are now in review.

All in all it was a lot of fun to meet fellow enthusiasts and hack on the complex machinery that is inside zio-kafka.