With Early Release ebooks, you get books in their earliest form—the author's raw and unedited content as he or she writes—so you can take advantage of these technologies long before the official release of these titles. You'll also receive updates when significant changes are made, new chapters are available, and the final ebook bundle is released.
Learn how to take full advantage of Apache Kafka, the distributed, publish-subscribe queue for handling real-time data feeds. With this comprehensive book, you’ll understand how Kafka works and how it’s designed. Authors Neha Narkhede, Gwen Shapira, and Todd Palino show you how to deploy production Kafka clusters; secure, tune, and monitor them; write rock-solid applications that use Kafka; and build scalable stream-processing applications.
Learn how Kafka compares to other queues, and where it fits in the big data ecosystem
Dive into Kafka’s internal design
Pick up best practices for developing applications that use Kafka
Understand the best way to deploy Kafka in production, including monitoring, tuning, and maintenance tasks
Learn how to secure a Kafka cluster
Get detailed use cases
Chapter 1: Meet Kafka
Chapter 2: Installing Kafka
Chapter 3: Kafka Producers - Writing Messages to Kafka
Chapter 4: Kafka Consumers - Reading Data from Kafka
Chapter 5: Kafka Internals
Chapter 6: Reliable Data Delivery
Chapter 7: Building Data Pipelines
Chapter 8: Cross-Cluster Data Mirroring
Chapter 9: Administering Kafka
Chapter 10: Stream Processing
Chapter 11: Case Studies
Appendix A: Installing Kafka on Other Operating Systems
Neha Narkhede is co-founder and CTO at Confluent, a company backing the popular Apache Kafka messaging system. Prior to founding Confluent, Neha led streams infrastructure at LinkedIn, where she was responsible for LinkedIn’s streaming infrastructure built on top of Apache Kafka and Apache Samza. She is one of the initial authors of Apache Kafka and a committer and PMC member on the project.
Gwen Shapira is a system architect at Confluent helping customers achieve success with their Apache Kafka implementation. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. She currently specializes in building real-time reliable data processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, an author of "Hadoop Application Architectures", and a frequent presenter at data-driven conferences. Gwen is also a committer on the Apache Kafka and Apache Sqoop projects.
Todd Palino is a Staff Site Reliability Engineer at LinkedIn, tasked with keeping the largest deployment of Apache Kafka, Zookeeper, and Samza fed and watered. He is responsible for architecture, day-to-day operations, and tools development, including the creation of an advanced monitoring and notification system. Todd is the developer of the open source project Burrow, a Kafka consumer monitoring tool, and can be found sharing his experience on Apache Kafka at industry conferences and tech talks. Todd has spent over 20 years in the technology industry running infrastructure services, most recently as a Systems Engineer at Verisign, developing service management automation for DNS, networking, and hardware management, as well as managing hardware and software standards across the company.