Table of Contents

  1. Chapter 1 Introduction

    1. The Ascendance of Data

    2. What Is Data Science?

    3. Motivating Hypothetical: DataSciencester

  2. Chapter 2 A Crash Course in Python

    1. The Basics

    2. The Not-So-Basics

    3. For Further Exploration

  3. Chapter 3 Visualizing Data

    1. matplotlib

    2. Bar Charts

    3. Line Charts

    4. Scatterplots

    5. For Further Exploration

  4. Chapter 4 Linear Algebra

    1. Vectors

    2. Matrices

    3. For Further Exploration

  5. Chapter 5 Statistics

    1. Describing a Single Set of Data

    2. Correlation

    3. Simpson’s Paradox

    4. Some Other Correlational Caveats

    5. Correlation and Causation

    6. For Further Exploration

  6. Chapter 6 Probability

    1. Dependence and Independence

    2. Conditional Probability

    3. Bayes’s Theorem

    4. Random Variables

    5. Continuous Distributions

    6. The Normal Distribution

    7. The Central Limit Theorem

    8. For Further Exploration

  7. Chapter 7 Hypothesis and Inference

    1. Statistical Hypothesis Testing

    2. Example: Flipping a Coin

    3. Confidence Intervals

    4. P-hacking

    5. Example: Running an A/B Test

    6. Bayesian Inference

    7. For Further Exploration

  8. Chapter 8 Gradient Descent

    1. The Idea Behind Gradient Descent

    2. Estimating the Gradient

    3. Using the Gradient

    4. Choosing the Right Step Size

    5. Putting It All Together

    6. Stochastic Gradient Descent

    7. For Further Exploration

  9. Chapter 9 Getting Data

    1. stdin and stdout

    2. Reading Files

    3. Scraping the Web

    4. Using APIs

    5. Example: Using the Twitter APIs

    6. For Further Exploration

  10. Chapter 10 Working with Data

    1. Exploring Your Data

    2. Cleaning and Munging

    3. Manipulating Data

    4. Rescaling

    5. Dimensionality Reduction

    6. For Further Exploration

  11. Chapter 11 Machine Learning

    1. Modeling

    2. What Is Machine Learning?

    3. Overfitting and Underfitting

    4. Correctness

    5. The Bias-Variance Trade-off

    6. Feature Extraction and Selection

    7. For Further Exploration

  12. Chapter 12 k-Nearest Neighbors

    1. The Model

    2. Example: Favorite Languages

    3. The Curse of Dimensionality

    4. For Further Exploration

  13. Chapter 13 Naive Bayes

    1. A Really Dumb Spam Filter

    2. A More Sophisticated Spam Filter

    3. Implementation

    4. Testing Our Model

    5. For Further Exploration

  14. Chapter 14 Simple Linear Regression

    1. The Model

    2. Using Gradient Descent

    3. Maximum Likelihood Estimation

    4. For Further Exploration

  15. Chapter 15 Multiple Regression

    1. The Model

    2. Further Assumptions of the Least Squares Model

    3. Fitting the Model

    4. Interpreting the Model

    5. Goodness of Fit

    6. Digression: The Bootstrap

    7. Standard Errors of Regression Coefficients

    8. Regularization

    9. For Further Exploration

  16. Chapter 16 Logistic Regression

    1. The Problem

    2. The Logistic Function

    3. Applying the Model

    4. Goodness of Fit

    5. Support Vector Machines

    6. For Further Investigation

  17. Chapter 17 Decision Trees

    1. What Is a Decision Tree?

    2. Entropy

    3. The Entropy of a Partition

    4. Creating a Decision Tree

    5. Putting It All Together

    6. Random Forests

    7. For Further Exploration

  18. Chapter 18 Neural Networks

    1. Perceptrons

    2. Feed-Forward Neural Networks

    3. Backpropagation

    4. Example: Defeating a CAPTCHA

    5. For Further Exploration

  19. Chapter 19 Clustering

    1. The Idea

    2. The Model

    3. Example: Meetups

    4. Choosing k

    5. Example: Clustering Colors

    6. Bottom-up Hierarchical Clustering

    7. For Further Exploration

  20. Chapter 20 Natural Language Processing

    1. Word Clouds

    2. n-gram Models

    3. Grammars

    4. An Aside: Gibbs Sampling

    5. Topic Modeling

    6. For Further Exploration

  21. Chapter 21 Network Analysis

    1. Betweenness Centrality

    2. Eigenvector Centrality

    3. Directed Graphs and PageRank

    4. For Further Exploration

  22. Chapter 22 Recommender Systems

    1. Manual Curation

    2. Recommending What’s Popular

    3. User-Based Collaborative Filtering

    4. Item-Based Collaborative Filtering

    5. For Further Exploration

  23. Chapter 23 Databases and SQL

    1. CREATE TABLE and INSERT

    2. UPDATE

    3. DELETE

    4. SELECT

    5. GROUP BY

    6. ORDER BY

    7. JOIN

    8. Subqueries

    9. Indexes

    10. Query Optimization

    11. NoSQL

    12. For Further Exploration

  24. Chapter 24 MapReduce

    1. Example: Word Count

    2. Why MapReduce?

    3. MapReduce More Generally

    4. Example: Analyzing Status Updates

    5. Example: Matrix Multiplication

    6. An Aside: Combiners

    7. For Further Exploration

  25. Chapter 25 Go Forth and Do Data Science

    1. IPython

    2. Mathematics

    3. Not from Scratch

    4. Find Data

    5. Do Data Science