Table of Contents

Overview of the Video Course

08m 24s

Chapter: A Distributed Computing Environment

The Motivation for Hadoop

09m 22s

A Brief History of Hadoop

05m 34s

Understanding the Hadoop Architecture

12m 24s

Setting Up A Pseudo-Distributed Environment

03m 47s

The Distributed File System (HDFS)

11m 15s

Distributed Computing with MapReduce

07m 44s

Word Count - the "Hello, World" of Hadoop!

08m 01s

Chapter: Computing with Hadoop

How a MapReduce Job Works

10m 26s

Mappers and Reducers in Detail

19m 16s

Working with Hadoop via the Command Line: Starting HDFS and YARN

07m 54s

Working with Hadoop via the Command Line: Loading Data into HDFS

07m 05s

Working with Hadoop via the Command Line: Running a MapReduce Job

07m 55s

How To Use Our GitHub Goodies

00m 38s

Working in Python with Hadoop Streaming

21m 54s

Common MapReduce Tasks

13m 54s

Spark on Hadoop 2

18m 26s

Creating a Spark Application with Python

22m 30s

Chapter: The Hadoop Ecosystem

The Hadoop Ecosystem

03m 01s

Data Warehousing with Hadoop

17m 15s

Higher Order Data Flows

11m 21s

Other Notable Projects

08m 31s

Chapter: Working with Data on Hive

Introduction to Hive

04m 28s

Interacting with Data via the Hive Console

10m 39s

Creating Databases, Tables, and Schemas for Hive

08m 19s

Loading Data into Hive from HDFS

09m 26s

Querying Data and Performing Aggregations With Hive

12m 06s

Chapter: Towards Last Mile Computing

Decomposing Large Data Sets to a Computational Space

07m 56s

Linear Regressions

20m 10s

Summarizing Documents with TF-IDF

14m 11s

Classification of Text

15m 45s

Parallel Canopy Clustering

11m 03s

Computing Recommendations via Linear Log-Likelihoods

14m 50s