Table of Contents

  1. Chapter 1 Two Characters: Exploration and Exploitation

    1. The Scientist and the Businessman

    2. The Explore-Exploit Dilemma

  2. Chapter 2 Why Use Multiarmed Bandit Algorithms?

    1. What Are We Trying to Do?

    2. The Business Scientist: Web-Scale A/B Testing

  3. Chapter 3 The epsilon-Greedy Algorithm

    1. Introducing the epsilon-Greedy Algorithm

    2. Describing Our Logo-Choosing Problem Abstractly

    3. Implementing the epsilon-Greedy Algorithm

    4. Thinking Critically about the epsilon-Greedy Algorithm

  4. Chapter 4 Debugging Bandit Algorithms

    1. Monte Carlo Simulations Are Like Unit Tests for Bandit Algorithms

    2. Simulating the Arms of a Bandit Problem

    3. Analyzing Results from a Monte Carlo Study

    4. Exercises

  5. Chapter 5 The Softmax Algorithm

    1. Introducing the Softmax Algorithm

    2. Implementing the Softmax Algorithm

    3. Measuring the Performance of the Softmax Algorithm

    4. The Annealing Softmax Algorithm

    5. Exercises

  6. Chapter 6 UCB – The Upper Confidence Bound Algorithm

    1. Introducing the UCB Algorithm

    2. Implementing UCB

    3. Comparing Bandit Algorithms Side-by-Side

    4. Exercises

  7. Chapter 7 Bandits in the Real World: Complexity and Complications

    1. A/A Testing

    2. Running Concurrent Experiments

    3. Continuous Experimentation vs. Periodic Testing

    4. Bad Metrics of Success

    5. Scaling Problems with Good Metrics of Success

    6. Intelligent Initialization of Values

    7. Running Better Simulations

    8. Moving Worlds

    9. Correlated Bandits

    10. Contextual Bandits

    11. Implementing Bandit Algorithms at Scale

  8. Chapter 8 Conclusion

    1. Learning Life Lessons from Bandit Algorithms

    2. A Taxonomy of Bandit Algorithms

    3. Learning More and Other Topics