Table of Contents

  1. Chapter 1 Introduction to Data Analysis: Break it down

    1. Acme Cosmetics needs your help

    2. The CEO wants data analysis to help increase sales

    3. Data analysis is careful thinking about evidence

    4. Define the problem

    5. Your client will help you define your problem

    6. Acme’s CEO has some feedback for you

    7. Break the problem and data into smaller pieces

    8. Now take another look at what you know

    9. Evaluate the pieces

    10. Analysis begins when you insert yourself

    11. Make a recommendation

    12. Your report is ready

    13. The CEO likes your work

    14. An article just came across the wire

    15. You let the CEO’s beliefs take you down the wrong path

    16. Your assumptions and beliefs about the world are your mental model

    17. Your statistical model depends on your mental model

    18. Mental models should always include what you don’t know

    19. The CEO tells you what he doesn’t know

    20. Acme just sent you a huge list of raw data

    21. Time to drill further into the data

    22. General American Wholesalers confirms your impression

    23. Here’s what you did

    24. Your analysis led your client to a brilliant decision

  2. Chapter 2 Experiments: Test your theories

    1. It’s a coffee recession!

    2. The Starbuzz board meeting is in three months

    3. The Starbuzz Survey

    4. Always use the method of comparison

    5. Comparisons are key for observational data

    6. Could value perception be causing the revenue decline?

    7. A typical customer’s thinking

    8. Observational studies are full of confounders

    9. How location might be confounding your results

    10. Manage confounders by breaking the data into chunks

    11. It’s worse than we thought!

    12. You need an experiment to say which strategy will work best

    13. The Starbuzz CEO is in a big hurry

    14. Starbuzz drops its prices

    15. One month later...

    16. Control groups give you a baseline

    17. Not getting fired 101

    18. Let’s experiment for real!

    19. One month later...

    20. Confounders also plague experiments

    21. Avoid confounders by selecting groups carefully

    22. Randomization selects similar groups

    23. Your experiment is ready to go

    24. The results are in

    25. Starbuzz has an empirically tested sales strategy

  3. Chapter 3 Optimization: Take it to the max

    1. You’re now in the bath toy game

    2. Constraints limit the variables you control

    3. Decision variables are things you can control

    4. You have an optimization problem

    5. Find your objective with the objective function

    6. Your objective function

    7. Show product mixes with your other constraints

    8. Plot multiple constraints on the same chart

    9. Your good options are all in the feasible region

    10. Your new constraint changed the feasible region

    11. Your spreadsheet does optimization

    12. Solver crunched your optimization problem in a snap

    13. Profits fell through the floor

    14. Your model only describes what you put into it

    15. Calibrate your assumptions to your analytical objectives

    16. Watch out for negatively linked variables

    17. Your new plan is working like a charm

    18. Your assumptions are based on an ever-changing reality

  4. Chapter 4 Data Visualization: Pictures make you smarter

    1. New Army needs to optimize their website

    2. The results are in, but the information designer is out

    3. The last information designer submitted these three infographics

    4. What data is behind the visualizations?

    5. Show the data!

    6. Here’s some unsolicited advice from the last designer

    7. Too much data is never your problem

    8. Making the data pretty isn’t your problem either

    9. Data visualization is all about making the right comparisons

    10. Your visualization is already more useful than the rejected ones

    11. Use scatterplots to explore causes

    12. The best visualizations are highly multivariate

    13. Show more variables by looking at charts together

    14. The visualization is great, but the web guru’s not satisfied yet

    15. Good visual designs help you think about causes

    16. The experiment designers weigh in

    17. The experiment designers have some hypotheses of their own

    18. The client is pleased with your work

    19. Orders are coming in from everywhere!

  5. Chapter 5 Hypothesis Testing: Say it ain’t so

    1. Gimme some skin...

    2. When do we start making new phone skins?

    3. PodPhone doesn’t want you to predict their next move

    4. Here’s everything we know

    5. ElectroSkinny’s analysis does fit the data

    6. ElectroSkinny obtained this confidential strategy memo

    7. Variables can be negatively or positively linked

    8. Causes in the real world are networked, not linear

    9. Hypothesize PodPhone’s options

    10. You have what you need to run a hypothesis test

    11. Falsification is the heart of hypothesis testing

    12. Diagnosticity helps you find the hypothesis with the least disconfirmation

    13. You can’t rule out all the hypotheses, but you can say which is strongest

    14. You just got a picture message...

    15. It’s a launch!

  6. Chapter 6 Bayesian Statistics: Get past first base

    1. The doctor has disturbing news

    2. Let’s take the accuracy analysis one claim at a time

    3. How common is lizard flu really?

    4. You’ve been counting false positives

    5. All these terms describe conditional probabilities

    6. You need to count

    7. 1 percent of people have lizard flu

    8. Your chances of having lizard flu are still pretty low

    9. Do complex probabilistic thinking with simple whole numbers

    10. Bayes’ rule manages your base rates when you get new data

    11. You can use Bayes’ rule over and over

    12. Your second test result is negative

    13. The new test has different accuracy statistics

    14. New information can change your base rate

    15. What a relief!

  7. Chapter 7 Subjective Probabilities: Numerical belief

    1. Backwater Investments needs your help

    2. Their analysts are at each other’s throats

    3. Subjective probabilities describe expert beliefs

    4. Subjective probabilities might show no real disagreement after all

    5. The analysts responded with their subjective probabilities

    6. The CEO doesn’t see what you’re up to

    7. The CEO loves your work

    8. The standard deviation measures how far points are from the average

    9. You were totally blindsided by this news

    10. Bayes’ rule is great for revising subjective probabilities

    11. The CEO knows exactly what to do with this new information

    12. Russian stock owners rejoice!

  8. Chapter 8 Heuristics: Analyze like a human

    1. LitterGitters submitted their report to the city council

    2. The LitterGitters have really cleaned up this town

    3. The LitterGitters have been measuring their campaign’s effectiveness

    4. The mandate is to reduce the tonnage of litter

    5. Tonnage is unfeasible to measure

    6. Give people a hard question, and they’ll answer an easier one instead

    7. Littering in Dataville is a complex system

    8. You can’t build and implement a unified litter-measuring model

    9. Heuristics are a middle ground between going with your gut and optimization

    10. Use a fast and frugal tree

    11. Is there a simpler way to assess LitterGitters’ success?

    12. Stereotypes are heuristics

    13. Your analysis is ready to present

    14. Looks like your analysis impressed the city council members

  9. Chapter 9 Histograms: The shape of numbers

    1. Your annual review is coming up

    2. Going for more cash could play out in a bunch of different ways

    3. Here’s some data on raises

    4. Histograms show frequencies of groups of numbers

    5. Gaps between bars in a histogram mean gaps among the data points

    6. Install and run R

    7. Load data into R

    8. R creates beautiful histograms

    9. Make histograms from subsets of your data

    10. Negotiation pays

    11. What will negotiation mean for you?

  10. Chapter 10 Regression: Prediction

    1. What are you going to do with all this money?

    2. An analysis that tells people what to ask for could be huge

    3. Behold... the Raise Reckoner!

    4. Inside the algorithm will be a method to predict raises

    5. Scatterplots compare two variables

    6. A line could tell your clients where to aim

    7. Predict values in each strip with the graph of averages

    8. The regression line predicts what raises people will receive

    9. The line is useful if your data shows a linear correlation

    10. You need an equation to make your predictions precise

    11. Tell R to create a regression object

    12. The regression equation goes hand in hand with your scatterplot

    13. The regression equation is the Raise Reckoner algorithm

    14. Your raise predictor didn’t work out as planned...

  11. Chapter 11 Error: Err well

    1. Your clients are pretty ticked off

    2. What did your raise prediction algorithm do?

    3. The segments of customers

    4. The guy who asked for 25% went outside the model

    5. How to handle the client who wants a prediction outside the data range

    6. The guy who got fired because of extrapolation has cooled off

    7. You’ve only solved part of the problem

    8. What does the data for the screwy outcomes look like?

    9. Chance errors are deviations from what your model predicts

    10. Error is good for you and your client

    11. Specify error quantitatively

    12. Quantify your residual distribution with root mean squared error

    13. Your model in R already knows the R.M.S. error

    14. R’s summary of your linear model shows your R.M.S. error

    15. Segmentation is all about managing error

    16. Good regressions balance explanation and prediction

    17. Your segmented models manage error better than the original model

    18. Your clients are returning in droves

  12. Chapter 12 Relational Databases: Can you relate?

    1. The Dataville Dispatch wants to analyze sales

    2. Here’s the data they keep to track their operations

    3. You need to know how the data tables relate to each other

    4. A database is a collection of data with well-specified relations to each other

    5. Trace a path through the relations to make the comparison you need

    6. Create a spreadsheet that goes across that path

    7. Your summary ties article count and sales together

    8. Looks like your scatterplot is going over really well

    9. Copying and pasting all that data was a pain

    10. Relational databases manage relations for you

    11. Dataville Dispatch built an RDBMS with your relationship diagram

    12. Dataville Dispatch extracted your data using the SQL language

    13. Comparison possibilities are endless if your data is in an RDBMS

    14. You’re on the cover

  13. Chapter 13 Cleaning Data: Impose order

    1. Just got a client list from a defunct competitor

    2. The dirty secret of data analysis

    3. Head First Head Hunters wants the list for their sales team

    4. Cleaning messy data is all about preparation

    5. Once you’re organized, you can fix the data itself

    6. Use the # sign as a delimiter

    7. Excel split your data into columns using the delimiter

    8. Use SUBSTITUTE to replace the caret character

    9. You cleaned up all the first names

    10. The last name pattern is too complex for SUBSTITUTE

    11. Handle complex patterns with nested text formulas

    12. R can use regular expressions to crunch complex data patterns

    13. The sub command fixed your last names

    14. Now you can ship the data to your client

    15. Maybe you’re not quite done yet...

    16. Sort your data to show duplicate values together

    17. The data is probably from a relational database

    18. Remove duplicate names

    19. You created nice, clean, unique records

    20. Head First Head Hunters is recruiting like gangbusters!

    21. Leaving town...

    22. It’s been great having you here in Dataville!

  1. Appendix Leftovers: The Top Ten Things (we didn’t cover)

    1. #1: Everything else in statistics

    2. #2: Excel skills

    3. #3: Edward Tufte and his principles of visualization

    4. #4: PivotTables

    5. #5: The R community

    6. #6: Nonlinear and multiple regression

    7. #7: Null-alternative hypothesis testing

    8. #8: Randomness

    9. #9: Google Docs

    10. #10: Your expertise

  2. Appendix Install R: Start R up!

    1. Get started with R

  3. Appendix Install Excel Analysis Tools: The ToolPak

    1. Install the data analysis tools in Excel