Table of Contents

  1. Chapter 1 Design Patterns and MapReduce

    1. Design Patterns

    2. MapReduce History

    3. MapReduce and Hadoop Refresher

    4. Hadoop Example: Word Count

    5. Pig and Hive

  2. Chapter 2 Summarization Patterns

    1. Numerical Summarizations

    2. Inverted Index Summarizations

    3. Counting with Counters

  3. Chapter 3 Filtering Patterns

    1. Filtering

    2. Bloom Filtering

    3. Top Ten

    4. Distinct

  4. Chapter 4 Data Organization Patterns

    1. Structured to Hierarchical

    2. Partitioning

    3. Binning

    4. Total Order Sorting

    5. Shuffling

  5. Chapter 5 Join Patterns

    1. A Refresher on Joins

    2. Reduce Side Join

    3. Replicated Join

    4. Composite Join

    5. Cartesian Product

  6. Chapter 6 Metapatterns

    1. Job Chaining

    2. Chain Folding

    3. Job Merging

  7. Chapter 7 Input and Output Patterns

    1. Customizing Input and Output in Hadoop

    2. Generating Data

    3. External Source Output

    4. External Source Input

    5. Partition Pruning

  8. Chapter 8 Final Thoughts and the Future of Design Patterns

    1. Trends in the Nature of Data

    2. The Effects of YARN

    3. Patterns as a Library or Component

    4. How You Can Help

  1. Appendix Bloom Filters

    1. Overview

    2. Use Cases

    3. Downsides

    4. Tweaking Your Bloom Filter

  2. Colophon