Table of Contents

Chapter: Your Spark and Visualization Toolkit

The Course Overview

03m 54s

Spark: Origins and Ecosystem for Big Data Scientists, the Scala, Python, and R flavors

04m 41s

Install Spark on Your Laptop with Docker, or Scale Fast in the Cloud

04m 40s

Apache Zeppelin, a Web-Based Notebook for Spark with matplotlib and ggplot2

03m 07s

Chapter: First Steps with Spark Visualization

Manipulating Data with the Core RDD API

08m 16s

Using DataFrame, Dataset, and SQL – Natural and Easy!

06m 35s

Manipulating Rows and Columns

04m 49s

Dealing with File Format

02m 17s

Visualizing More – ggplot2, matplotlib, and Angular.js at the Rescue

03m 32s

Chapter: The Spark Machine Learning Algorithms

Discovering spark.ml and spark.mllib – and Other Libraries

08m 01s

Wrapping Up Basic Statistics and Linear Algebra

09m 58s

Cleansing Data and Engineering the Features

05m 03s

Reducing the Dimensionality

04m 09s

Pipeline for a Life

03m 58s

Chapter: Collecting and Cleansing the Dirty Tweets

Streaming Tweets to Disk

05m 37s

Streaming Tweets on a Map

04m 05s

Cleansing and Building Your Reference Dataset

05m 12s

Querying and Visualizing Tweets with SQL

04m 16s

Chapter: Statistical Analysis on Tweets

Indicators, Correlations, and Sampling

07m 16s

Validating Statistical Relevance

03m 31s

Running SVD and PCA

04m 04s

Extending the Basic Statistics for Your Needs

04m 19s

Chapter: Extracting Features from the Tweets

Analyzing Free Text from the Tweets

07m 23s

Dealing with Stemming, Syntax, Idioms and Hashtags

05m 23s

Detecting Tweet Sentiment

03m 28s

Identifying Topics with LDA

03m 06s

Chapter: Mine Data and Share Results

Word Cloudify Your Dataset

05m 30s

Locating Users and Displaying Heatmaps with GeoHash

04m 15s

Collaborating on the Same Note with Peers

04m 56s

Create Visual Dashboards for Your Business Stakeholders

03m 56s

Chapter: Classifying the Tweets

Building the Training and Test Datasets

07m 25s

Training a Logistic Regression Model

03m 55s

Evaluating Your Classifier

05m 31s

Selecting Your Model

05m 18s

Chapter: Clustering Users

Clustering Users by Followers and Friends

05m 12s

Clustering Users by Location

02m 47s

Running KMeans on a Stream

02m 30s

Chapter: Your Next Data Challenges

Recommending Similar Users

05m 11s

Analyzing Mentions with GraphX

06m 21s

Where to Go from Here

06m 20s