Streaming Systems
The What, Where, When, and How of Large-Scale Data Processing

By Tyler Akidau, Slava Chernyak, Reuven Lax
Publisher: O'Reilly Media
Final Release Date: May 2017
Pages: 100

With Early Release ebooks, you get books in their earliest form (the author's raw and unedited content as he or she writes) so you can take advantage of these technologies long before the official release of these titles. You'll also receive updates when significant changes are made, new chapters are available, and the final ebook bundle is released.

Streaming data is a big deal in big data these days, and for good reason. Businesses crave ever more timely data, and streaming is a good way to achieve lower latency. Plus, streaming is a much easier way to tame the massive, unbounded data sets that are increasingly common today.

Expanded from co-author Tyler Akidau's popular series of blog posts "Streaming 101" and "Streaming 102", this practical book shows data engineers, data scientists, and developers how to work with streaming or event-time data in a conceptual and platform-agnostic way. You'll go from "101"-level understanding of stream processing to a nuanced grasp of the what, where, when, and how of processing real-time data streams. Dive deep into topics including watermarks and windowing, as well as state and timers in the context of stream processing.

Although the book uses Apache Beam code snippets to make examples concrete, it presents a general and broad explanation of streaming that's not tied to a specific framework.

Chapter 1 Why Stream Processing?
Chapter 2 Data Processing Patterns
Chapter 3 The What, Where, When, and How of Data Processing
Chapter 4 Watermarks
Chapter 5 Advanced Where & When
Chapter 6 Composition
Chapter 7 Exactly Once & Side Effects
Chapter 8 Streams & Tables
Chapter 9 The Practicalities of Persistent State
Chapter 10 Towards Robust Streaming SQL
Chapter 11 The Evolution of Large-Scale Data Processing

Title: Streaming Systems
By: Tyler Akidau, Slava Chernyak, Reuven Lax
Publisher: O'Reilly Media
Formats: Print

Early Release Ebook
Pages: 100 (est.)
Print ISBN: 978-1-4919-8387-4 | ISBN 10: 1-4919-8387-6
Early Release Ebook ISBN: 978-1-4919-8380-5 | ISBN 10: 1-4919-8380-9

Tyler Akidau
Tyler Akidau is a staff software engineer at Google Seattle. He leads technical infrastructure's internal data processing teams (MillWheel & Flume), is a founding member of the Apache Beam PMC, and has spent the last seven years working on massive-scale data processing systems. Though deeply passionate and vocal about the capabilities and importance of stream processing, he is also a firm believer in batch and streaming as two sides of the same coin, with the real endgame for data processing systems being the seamless merging of the two. He is the author of the 2015 Dataflow Model paper and the Streaming 101 and Streaming 102 articles on the O'Reilly website. His preferred mode of transportation is by cargo bike, with his two young daughters in tow.

Slava Chernyak
Slava Chernyak is a senior software engineer at Google Seattle. Slava spent over five years working on Google's internal massive-scale streaming data processing systems and has since become involved with designing and building Windmill, Google Cloud Dataflow's next-generation streaming backend, from the ground up. Slava is passionate about making massive-scale stream processing available and useful to a broader audience. When he is not working on streaming systems, Slava is out enjoying the natural beauty of the Pacific Northwest.

Reuven Lax
Reuven Lax is a senior staff software engineer at Google Seattle, and has spent the past nine years helping to shape Google's data processing and analysis strategy. For much of that time he has focused on Google's low-latency, streaming data processing efforts, first as a long-time member and lead of the MillWheel team, and more recently founding and leading the team responsible for Windmill, the next-generation stream processing engine powering Google Cloud Dataflow. He's very excited to bring Google's data-processing experience to the world at large, and proud to have been a part of publishing both the MillWheel paper in 2013 and the Dataflow Model paper in 2015. When not at work, Reuven enjoys swing dancing, rock climbing, and exploring new parts of the world.