
Table of Contents

Chapter: Introduction

Introduction to Working with Big Data LiveLessons

03m 16s

What is Big Data?

05m 25s

Chapter: Lesson 1: Unstructured Storage and Hadoop

Learning objectives

00m 49s

1.1 Set up a basic Hadoop installation

16m 14s

1.2 Write data into the Hadoop file system

07m 41s

1.3 Write a Hadoop streaming job to process text files

17m 55s

Chapter: Lesson 2: Structured Storage and Cassandra

Learning objectives

01m 00s

2.1 Set up a basic Cassandra installation

10m 16s

2.2 Create a Cassandra schema for storing data

17m 03s

2.3 Store and retrieve data from Cassandra using the Ruby library

07m 38s

2.4 Write data into Cassandra from a Hadoop streaming job

20m 13s

2.5 Use the Hadoop reduce phase to parallelize writes

15m 09s

Chapter: Lesson 3: Real Time Processing and Messaging

Learning objectives

01m 07s

3.1 Set up the Kafka messaging system

08m 02s

3.2 Publish and consume data from Kafka in Ruby

11m 05s

3.3 Aggregate log files into Hadoop using Kafka and a Ruby consumer

13m 55s

3.4 Create horizontally scalable message consumers

11m 35s

3.5 Sample messages using Kafka’s partitioning

10m 46s

3.6 Create redundant message consumers for high availability

27m 49s

Chapter: Lesson 4: Working with Machine Learning Algorithms

Learning objectives

00m 57s

4.1 Grasp the concepts of machine learning and implement the k-nearest neighbors algorithm

25m 47s

4.2 Understand the basics of distance metrics and implement Euclidean distance and cosine similarity

26m 43s

4.3 Transform raw data into a matrix and convert a text document into the vector space model

22m 41s

4.4 Use k-nearest neighbors to make predictions

18m 41s

4.5 Improve execution time by reducing the search space

11m 07s

Chapter: Lesson 5: Experimentation and Running Algorithms in Production

Learning objectives

00m 58s

5.1 Use cross validation to test a predictive model

17m 37s

5.2 Integrate a trained model into production

09m 05s

5.3 Version a model and track feedback data

03m 35s

5.4 Write a test harness to compare versioned models

09m 22s

5.5 Test new predicted models in production

02m 41s

Chapter: Lesson 6: Basic Visualizations

Learning objectives

00m 53s

6.1 Prepare raw data for use in visualizations

13m 10s

6.2 Use core functions of the D3 JavaScript visualization toolkit

13m 17s

6.3 Use D3 to create a bar chart

07m 56s

6.4 Use D3 to create a time series

15m 28s