Mastering Spark for Structured Streaming
A Practical Guide to Building End-to-End Structured Streaming Applications Using Spark 2.0

By Tianhui Michael Li
Publisher: O'Reilly Media
Final Release Date: November 2016
Run time: 1 hour 47 minutes

Spark is one of today’s most popular distributed computation engines for processing and analyzing big data. This course gives data engineers, data scientists, and data analysts who want to explore data streaming practical experience in using Spark. You’ll learn about the Spark Structured Streaming API, the powerful Catalyst query optimizer, the Tungsten execution engine, and more in this hands-on course, where you’ll build several small applications that leverage all the aspects of Spark 2.0. While not a requirement, the course works best for those with some Scala experience.

Understand the main features of Spark and its advantages over existing systems

Learn the basics of parallelism, streaming computation, and Spark streaming

Explore the distinctions between Spark Structured Streaming and legacy DStream APIs

Understand how to write programs that use the Spark Structured Streaming API

Learn about the new Catalyst query optimizer and the Tungsten execution engine

Discover how Scala and Spark Structured Streaming simplify distributed streaming tasks

Gain hands-on experience building applications using Spark 2.0

Michael Li is the founder of The Data Incubator, which provides big data corporate training and a selective eight-week fellowship for PhDs transitioning into industry. Previously, he worked as a data scientist, software engineer, and researcher at Foursquare, Google, Andreessen Horowitz, J.P. Morgan, and NASA. He is a regular contributor to VentureBeat, The Next Web, and Harvard Business Review. Michael earned his Ph.D. at Princeton and was a Marshall Scholar in Cambridge.

Overview
  Overview 02m 06s

Spark Datasets and Structured Streaming
  Spark Overview 02m 11s
  Spark Wordcount Using RDD Example 05m 00s
  Spark Wordcount Using Scala Example 02m 37s
  Spark and Datasets 01m 56s
  Spark Wordcount Using Datasets Example 03m 06s
  Joining Data Using Spark Datasets 03m 32s
  Structured Streaming Overview 03m 17s
  Spark Structured Streaming Wordcount Example 03m 20s

Spark Structured Streaming
  Spark Structured Streaming 00m 46s
  Netcat Socket Structured Streaming Example 02m 27s
  Socket Structured Streaming Example 02m 54s
  Spark Structured Streaming Parsing Data 02m 56s
  Constructing Columns in Structured Streaming 02m 47s
  Selecting and Filtering Columns Using Structured Streaming 02m 06s
  GroupBy and Aggregation in Structured Streaming 03m 32s
  Joining Structured Stream with Datasets 03m 38s
  SQL Queries in Spark Structured Streaming 02m 19s

DStream Comparison
  Comparing Structured Streaming with DStream 03m 39s
  Custom Receivers in Spark DStream 02m 17s
  Iterative Wordcount Using Spark DStream 03m 29s
  Cumulative Wordcount Using Spark DStream 06m 31s
  Benefits of Spark Tungsten 04m 42s
  Tungsten Performance Benefit Demonstration 02m 58s
  Benefits of Spark Catalyst 03m 17s
  Viewing Query Plans in Spark Shell 01m 36s
  Visualizing Query Stages in Spark UI Viewer 00m 51s
  Viewing Spark Catalyst-Optimized Physical Plans 02m 56s

Standalone Spark Streaming Applications
  Writing Standalone Spark Streaming Applications 01m 03s
  Two Environments for Running Spark 01m 56s
  Spark Streaming Standalone Code - Meetup Events Example 07m 37s
  Scala Build Tool (SBT) and Spark 06m 01s
  Compiling and Building a Standalone Spark Application 04m 29s
  Spark Twitter Streaming Example 03m 53s

Title: Mastering Spark for Structured Streaming
By: Tianhui Michael Li
Publisher: O'Reilly Media
Formats: Safari Videos Online

Tianhui Michael Li

Michael Li is the founder of The Data Incubator, a big data education and placement company that runs customized, vendor-neutral big data corporate training and a selective eight-week fellowship for PhDs transitioning into industry. Previously, he worked as a data scientist, software engineer, and researcher at Foursquare, Google, Andreessen Horowitz, J.P. Morgan, NASA, and D.E. Shaw. He is a regular contributor to VentureBeat, The Next Web, and Harvard Business Review. Michael earned his Ph.D. at Princeton and was a Marshall Scholar in Cambridge.