Table of Contents

Chapter: Getting Started with Spark


02m 16s

How to Use This Course

01m 41s

Getting Set Up – Installing Python, a JDK, Spark, and its Dependencies

14m 52s

Installing the MovieLens Movie Rating Dataset

03m 35s

Run Your First Spark Program – Ratings Histogram Example

04m 52s

Chapter: Spark Basics and Simple Examples

Introduction to Spark

10m 11s

The Resilient Distributed Dataset (RDD)

12m 17s

Ratings Histogram Walkthrough

13m 33s

Key/Value RDDs and the Average Friends by Age Example

16m 13s

Running the Average Friends by Age Example

05m 39s

Filtering RDDs and the Minimum Temperature by Location Example

08m 10s

Running the Minimum Temperature Example and Modifying It for Maximums

05m 08s

Running the Maximum Temperature by Location Example

03m 21s

Counting Word Occurrences Using flatMap()

07m 28s

Improving the Word Count Script with Regular Expressions

04m 44s

Sorting the Word Count Results

07m 44s

Find the Total Amount Spent by Customer

04m 01s

Check Your Results and Sort Them by Total Amount Spent

05m 08s

Check Your Sorted Implementation and Results Against Mine

03m 18s

Chapter: Advanced Examples of Spark Programs

Find the Most Popular Movie

05m 52s

Use Broadcast Variables to Display Movie Names Instead of ID Numbers

08m 23s

Find the Most Popular Superhero in a Social Graph

04m 29s

Run the Script – Discover Who the Most Popular Superhero Is!

06m 00s

Superhero Degrees of Separation – Introducing Breadth-First Search

07m 54s

Superhero Degrees of Separation – Accumulators and Implementing BFS in Spark

06m 44s

Superhero Degrees of Separation – Review the Code and Run It

09m 14s

Item-Based Collaborative Filtering in Spark, cache(), and persist()

10m 12s

Running the Similar Movies Script Using Spark's Cluster Manager

10m 54s

Improve the Quality of Similar Movies

02m 58s

Chapter: Running Spark on a Cluster

Introducing Elastic MapReduce

05m 08s

Setting Up Your AWS / Elastic MapReduce Account and PuTTY

09m 55s


04m 21s

Create Similar Movies from One Million Ratings – Part 1

05m 12s

Create Similar Movies from One Million Ratings – Part 2

11m 27s

Create Similar Movies from One Million Ratings – Part 3

03m 28s

Troubleshooting Spark on a Cluster

03m 43s

More Troubleshooting and Managing Dependencies

05m 47s

Chapter: SparkSQL, DataFrames, and Datasets

Introducing SparkSQL

06m 08s

Executing SQL Commands and SQL-Style Functions on a DataFrame

08m 16s

Using DataFrames Instead of RDDs

05m 52s

Chapter: Other Spark Technologies and Libraries

Introducing MLlib

08m 10s

Using MLlib to Produce Movie Recommendations

02m 56s

Analyzing the ALS Recommendations Results

04m 53s

Using DataFrames with MLlib

07m 31s

Spark Streaming and GraphX

07m 36s

Chapter: You Made It! Where to Go from Here

Learning More about Spark and Data Science

04m 09s