Table of Contents

  1. Chapter 1 Introduction to Python

    1. Why Python

    2. Getting Started with Python

    3. Summary

  2. Chapter 2 Python Basics

    1. Basic Data Types

    2. Data Containers

    3. What Can the Various Data Types Do?

    4. Helpful Tools: type, dir, and help

    5. Putting It All Together

    6. What Does It All Mean?

    7. Summary

  3. Chapter 3 Data Meant to Be Read by Machines

    1. CSV Data

    2. JSON Data

    3. XML Data

    4. Summary

  4. Chapter 4 Working with Excel Files

    1. Installing Python Packages

    2. Parsing Excel Files

    3. Getting Started with Parsing

    4. Summary

  5. Chapter 5 PDFs and Problem Solving in Python

    1. Avoid Using PDFs!

    2. Programmatic Approaches to PDF Parsing

    3. Parsing PDFs Using pdfminer

    4. Learning How to Solve Problems

    5. Uncommon File Types

    6. Summary

  6. Chapter 6 Acquiring and Storing Data

    1. Not All Data Is Created Equal

    2. Fact Checking

    3. Readability, Cleanliness, and Longevity

    4. Where to Find Data

    5. Case Studies: Example Data Investigation

    6. Storing Your Data: When, Why, and How?

    7. Databases: A Brief Introduction

    8. When to Use a Simple File

    9. Alternative Data Storage

    10. Summary

  7. Chapter 7 Data Cleanup: Investigation, Matching, and Formatting

    1. Why Clean Data?

    2. Data Cleanup Basics

    3. Summary

  8. Chapter 8 Data Cleanup: Standardizing and Scripting

    1. Normalizing and Standardizing Your Data

    2. Saving Your Data

    3. Determining What Data Cleanup Is Right for Your Project

    4. Scripting Your Cleanup

    5. Testing with New Data

    6. Summary

  9. Chapter 9 Data Exploration and Analysis

    1. Exploring Your Data

    2. Analyzing Your Data

    3. Summary

  10. Chapter 10 Presenting Your Data

    1. Avoiding Storytelling Pitfalls

    2. Visualizing Your Data

    3. Presentation Tools

    4. Publishing Your Data

    5. Summary

  11. Chapter 11 Web Scraping: Acquiring and Storing Data from the Web

    1. What to Scrape and How

    2. Analyzing a Web Page

    3. Getting Pages: How to Request on the Internet

    4. Reading a Web Page with Beautiful Soup

    5. Reading a Web Page with LXML

    6. Summary

  12. Chapter 12 Advanced Web Scraping: Screen Scrapers and Spiders

    1. Browser-Based Parsing

    2. Spidering the Web

    3. Networks: How the Internet Works and Why It’s Breaking Your Script

    4. The Changing Web (or Why Your Script Broke)

    5. A (Few) Word(s) of Caution

    6. Summary

  13. Chapter 13 APIs

    1. API Features

    2. A Simple Data Pull from Twitter’s REST API

    3. Advanced Data Collection from Twitter’s REST API

    4. Advanced Data Collection from Twitter’s Streaming API

    5. Summary

  14. Chapter 14 Automation and Scaling

    1. Why Automate?

    2. Steps to Automate

    3. What Could Go Wrong?

    4. Where to Automate

    5. Special Tools for Automation

    6. Simple Automation

    7. Large-Scale Automation

    8. Monitoring Your Automation

    9. No System Is Foolproof

    10. Summary

  15. Chapter 15 Conclusion

    1. Duties of a Data Wrangler

    2. Beyond Data Wrangling

    3. Where Do You Go from Here?

  16. Appendix Comparison of Languages Mentioned

    1. C, C++, and Java Versus Python

    2. R or MATLAB Versus Python

    3. HTML Versus Python

    4. JavaScript Versus Python

    5. Node.js Versus Python

    6. Ruby and Ruby on Rails Versus Python

  17. Appendix Python Resources for Beginners

    1. Online Resources

    2. In-Person Groups

  18. Appendix Learning the Command Line

    1. Bash

    2. Windows CMD/PowerShell

  19. Appendix Advanced Python Setup

    1. Step 1: Install GCC

    2. Step 2: (Mac Only) Install Homebrew

    3. Step 3: (Mac Only) Tell Your System Where to Find Homebrew

    4. Step 4: Install Python 2.7

    5. Step 5: Install virtualenv (Windows, Mac, Linux)

    6. Step 6: Set Up a New Directory

    7. Step 7: Install virtualenvwrapper

    8. Learning About Our New Environment (Windows, Mac, Linux)

    9. Advanced Setup Review

  20. Appendix Python Gotchas

    1. Hail the Whitespace

    2. The Dreaded GIL

    3. = Versus == Versus is, and When to Just Copy

    4. Default Function Arguments

    5. Python Scope and Built-Ins: The Importance of Variable Names

    6. Defining Objects Versus Modifying Objects

    7. Changing Immutable Objects

    8. Type Checking

    9. Catching Multiple Exceptions

    10. The Power of Debugging

  21. Appendix IPython Hints

    1. Why Use IPython?

    2. Getting Started with IPython

    3. Magic Functions

    4. Final Thoughts: A Simpler Terminal

  22. Appendix Using Amazon Web Services

    1. Spinning Up an AWS Server

    2. Logging into an AWS Server