Book Review: Introduction to Event Streaming with Apache Kafka


Anatoly Zelenin, Alexander Kropp
Apache Kafka – From the basics to productive use
Hanser, November 2021
192 pages, € 44.99 (hardcover including eBook)
ISBN: 978-3-446-46187-1

“Here I am, poor fool, and am as smart as before”: Doctor Faustus’ frustrated exclamation about the inadequacies of university education has always applied particularly well to Apache Kafka. The distributed event streaming system originally developed by LinkedIn is designed to store and process data streams for big data applications. On its own, however, it is of limited use; it reveals its real value only when it is combined and integrated with other systems.

Anatoly Zelenin and Alexander Kropp are regarded as Apache Kafka experts and even enjoy the status of celebrity consultants. With their book “Apache Kafka – From the basics to productive use”, Hanser Verlag presents a compact 192-page volume that sets out to cover both the fundamentals and practical application.

To keep the barrier to entry low, the two start with a ten-page chapter that simulates the launch of a rocket and describes how the resulting telemetry data is processed by a Kafka-based system. Here the authors tacitly assume a working Kafka installation. In the fifth chapter, however, interested readers will find detailed instructions on how to deploy Kafka with ZooKeeper-based cluster management and thus build a rocket monitoring cluster of their own.

It is somewhat unfortunate that the book was completed before the release of Kafka 3.0 in September 2021. So that readers can still learn how to install the event streaming system without ZooKeeper, which Kafka is in the process of replacing, the two authors promise all buyers a free update of the relevant chapter on their website that explains how to deploy Kafka without ZooKeeper.

Kropp and Zelenin return to the martial-sounding rocket example in the second chapter, where they examine events and Kafka message payloads and demonstrate the various concepts on a concrete project. This helps in particular those who are already familiar with messaging protocols such as MQTT.

Apache Kafka is also often used to manage logging information in performance-critical applications. In this context, the authors discuss distributed Kafka clusters as well as methods for increasing performance and managing the reliability of the overall system.

As in many areas of IT, the same holds for Apache Kafka: nothing runs optimally by itself. The guidance in the book, however, helps readers to master the tasks at hand and to weigh up different approaches.

In the chapter “Kafka Deep Dive”, the authors follow the earlier, higher-level discussion with a more in-depth analysis. Kropp and Zelenin present, albeit from some distance, the network protocol used in Kafka and explain, for example, the important but sometimes problematic role of acknowledgements (ACKs), which confirm the receipt or processing of data or commands in the communication within a Kafka cluster.
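
To make the role of acknowledgements a little more concrete, here is a minimal sketch in Python; it is not taken from the book. It models the producer setting `acks` (`0`, `1` or `all`) together with the setting `min.insync.replicas`, both of which are real Kafka configuration names. The function `write_outcome` and its return strings are invented for illustration, and the model deliberately ignores timeouts, retries and leader changes.

```python
def write_outcome(acks: str, in_sync_replicas: int, min_insync_replicas: int = 1) -> str:
    """Simplified model of how a Kafka produce request is acknowledged.

    `acks` and `min.insync.replicas` are real Kafka settings; this function
    and its return values are hypothetical and ignore timeouts and retries.
    """
    if acks not in {"0", "1", "all"}:
        raise ValueError("acks must be '0', '1' or 'all'")
    if in_sync_replicas < 1:
        # The partition leader itself counts as an in-sync replica.
        raise ValueError("at least the leader must be in sync")

    if acks == "0":
        # The producer does not wait for any response from the broker.
        return "sent without confirmation"
    if acks == "1":
        # Only the partition leader has to write the record.
        return "confirmed by leader only"

    # acks == "all": every in-sync replica must have the record, and the
    # broker rejects the write if too few replicas are currently in sync.
    if in_sync_replicas < min_insync_replicas:
        return "rejected: not enough in-sync replicas"
    return f"confirmed by all {in_sync_replicas} in-sync replicas"


for acks in ("0", "1", "all"):
    print(acks, "->", write_outcome(acks, in_sync_replicas=2, min_insync_replicas=3))
```

The sketch hints at why acknowledgements can be "problematic": stricter settings improve durability, but they also make writes fail or slow down as soon as replicas fall behind.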

In the later section on programming tools, there are shell scripts as well as snippets that demonstrate the use of Apache Kafka from Python.
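
A Python snippet of this kind might look roughly like the following sketch; it is an illustration, not code from the book. It assumes a producer object that offers a `send(topic, key=..., value=...)` method, as `KafkaProducer` from the third-party kafka-python package does. The topic name `rocket-telemetry`, the event fields and all helper names are invented, and a small stand-in producer is included so that the sketch runs without a broker.

```python
import json
from dataclasses import dataclass, asdict
from typing import Tuple


@dataclass
class TelemetryEvent:
    """Hypothetical telemetry reading; the field names are illustrative."""
    rocket_id: str
    altitude_m: float
    velocity_ms: float


def encode_event(event: TelemetryEvent) -> Tuple[bytes, bytes]:
    """Return (key, value) as bytes, the form in which Kafka stores records."""
    key = event.rocket_id.encode("utf-8")
    value = json.dumps(asdict(event), sort_keys=True).encode("utf-8")
    return key, value


def publish(producer, topic: str, event: TelemetryEvent) -> None:
    """Send one event, keyed by rocket so that the default partitioner
    routes all readings of one rocket to the same partition."""
    key, value = encode_event(event)
    producer.send(topic, key=key, value=value)


class InMemoryProducer:
    """Stand-in for a real producer so the example runs offline."""

    def __init__(self):
        self.records = []

    def send(self, topic, key=None, value=None):
        self.records.append((topic, key, value))


producer = InMemoryProducer()
publish(producer, "rocket-telemetry", TelemetryEvent("rocket-1", 1250.0, 310.5))
print(producer.records)
```

With kafka-python installed and a broker reachable, `InMemoryProducer()` could be swapped for `KafkaProducer(bootstrap_servers="localhost:9092")`; the address is an assumption.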

The MQTT protocol mentioned earlier opens the fourth chapter of the book, in which the authors show how Kafka can be integrated sensibly into corporate IT so that it also contributes value to the business.

In addition to a detailed and, in the reviewer’s opinion, fair comparison between Kafka and “competing” message brokers, Kropp and Zelenin also turn to other components such as Kafka Streams, which can be combined usefully with the main system in corporate use. Security is not neglected either, and considerations on reference architectures for Kubernetes, for dedicated hardware and for virtual machines round off the chapter.

Apache Kafka is one of those more demanding systems that can only be understood with time and a willingness to engage with its concepts. Those who bring both will find the introduction by Anatoly Zelenin and Alexander Kropp a helpful guide through the jungle of the event streaming system. Whether the result achieved in the end is satisfactory is another matter, because Apache Kafka is and will remain a system that cannot be used equally profitably everywhere.

Annette Bosbach
looks after the legacy systems of Tamoggemon Holding k.s. and has for years been concerned with the influence that technology and people have on one another.


