Table of Contents

  1. Chapter 1 Introduction

    1. Why Machine Learning?

    2. Why Python?

    3. scikit-learn

    4. Essential Libraries and Tools

    5. Python 2 Versus Python 3

    6. Versions Used in this Book

    7. A First Application: Classifying Iris Species

    8. Summary and Outlook

  2. Chapter 2 Supervised Learning

    1. Classification and Regression

    2. Generalization, Overfitting, and Underfitting

    3. Supervised Machine Learning Algorithms

    4. Uncertainty Estimates from Classifiers

    5. Summary and Outlook

  3. Chapter 3 Unsupervised Learning and Preprocessing

    1. Types of Unsupervised Learning

    2. Challenges in Unsupervised Learning

    3. Preprocessing and Scaling

    4. Dimensionality Reduction, Feature Extraction, and Manifold Learning

    5. Clustering

    6. Summary and Outlook

  4. Chapter 4 Representing Data and Engineering Features

    1. Categorical Variables

    2. Binning, Discretization, Linear Models, and Trees

    3. Interactions and Polynomials

    4. Univariate Nonlinear Transformations

    5. Automatic Feature Selection

    6. Utilizing Expert Knowledge

    7. Summary and Outlook

  5. Chapter 5 Model Evaluation and Improvement

    1. Cross-Validation

    2. Grid Search

    3. Evaluation Metrics and Scoring

    4. Summary and Outlook

  6. Chapter 6 Algorithm Chains and Pipelines

    1. Parameter Selection with Preprocessing

    2. Building Pipelines

    3. Using Pipelines in Grid Searches

    4. The General Pipeline Interface

    5. Grid-Searching Preprocessing Steps and Model Parameters

    6. Grid-Searching Which Model To Use

    7. Summary and Outlook

  7. Chapter 7 Working with Text Data

    1. Types of Data Represented as Strings

    2. Example Application: Sentiment Analysis of Movie Reviews

    3. Representing Text Data as a Bag of Words

    4. Stopwords

    5. Rescaling the Data with tf–idf

    6. Investigating Model Coefficients

    7. Bag-of-Words with More Than One Word (n-Grams)

    8. Advanced Tokenization, Stemming, and Lemmatization

    9. Topic Modeling and Document Clustering

    10. Summary and Outlook

  8. Chapter 8 Wrapping Up

    1. Approaching a Machine Learning Problem

    2. From Prototype to Production

    3. Testing Production Systems

    4. Building Your Own Estimator

    5. Where to Go from Here

    6. Conclusion