Table of Contents

Chapter: Introduction

Introduction and Course Overview

04m 10s

About the Author

00m 34s

Spark’s concepts and approach

06m 3s

Resilient Distributed Datasets (RDD)

05m 3s

Creating a Project in IDEA

02m 54s

How To Access Your Working Files

01m 15s

Chapter: Spark Core API & Best practices

Base RDD

06m 46s


05m 34s

Actions - Part 1

01m 40s

Actions - Part 2

02m 41s

Hadoop Combiners In Spark

04m 51s

Directed Acyclic Graph And Lazy Evaluation

07m 20s


06m 15s

Chapter: Closure serialization

How the magic of Spark works

07m 29s

Serializers and how to change them

04m 10s

Chapter: Shared variables and performance


04m 7s


05m 5s

Caching & Persistence

09m 21s

Chapter: Spark SQL

Spark SQL

12m 32s

Inferring A Schema

07m 38s

Applying A Schema

06m 27s

Loading And Writing

06m 7s

SQL Caching And UDF

08m 47s

Chapter: Spark MLLib

Spark MLLib And Supervised Example - SVM

10m 2s

Unsupervised With Iris Dataset - KMeans

08m 54s

Chapter: Spark GraphX

Graph Construction

07m 5s

Graph Algorithms

06m 51s

Chapter: Spark Streaming

Streaming And The Microbatch

13m 57s

Mutable Transformations And Checkpointing

09m 6s

Windows And RDD Transformations

08m 43s

Streaming With Spark SQL, MLLib And Core

12m 28s

Chapter: Deployment and Infrastructure

Cluster Managers And Submission - Standalone, Mesos And Yarn

13m 20s

Chapter: Conclusion

Resources And Where To Go From Here

04m 6s