An Introduction to MapReduce with Pete Warden
Publisher: O'Reilly Media
Final Release Date: June 2011
Run time: 1 hour 49 minutes

The MapReduce algorithmic pattern may be Google's secret weapon for dealing with enormous quantities of data, but many programmers see it only as intimidating and obscure. In this video master class, data expert Pete Warden shows you how to build simple MapReduce jobs, using concrete use cases and descriptive examples to demystify the approach. All you need to get started is basic knowledge of Python and the Unix shell.

Warden demonstrates what happens when 500 million records are loaded into a database the traditional way: performance falls off dramatically once the working set is larger than memory. Discover how to solve the problem by introducing a sorting step, the method that lies at the heart of MapReduce.

In this video, you learn how to:

  • Tackle a "Hello World" example for MapReduce. Count word frequencies in a large body of text, then split the script into separate map and reduce stages and run it on the command line.
  • Run a job in Hadoop using Amazon’s Elastic MapReduce service. Set up a streaming job—upload scripts and data, debug run-time problems, and grab the results.
  • Prepare for very large data sets. Redesign scripts to find the most frequent words in 17GB of Wikipedia data.
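
The word-count exercise described above can be sketched roughly as follows. This is a minimal illustration written for this page, not the code used in the video: the function names (map_words, reduce_counts, word_count, most_frequent) and the sample text are assumptions, and an in-memory sort stands in for the sorting step that sits between the map and reduce stages.

```python
import heapq
import itertools
import re


def map_words(lines):
    """Map stage: emit a (word, 1) pair for every word in the input lines."""
    for line in lines:
        for word in re.findall(r"[a-z]+", line.lower()):
            yield word, 1


def reduce_counts(sorted_pairs):
    """Reduce stage: sum the counts per word. Input must be sorted by word."""
    for word, group in itertools.groupby(sorted_pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run map, sort, and reduce in sequence and return a word -> count dict."""
    # Sorting brings identical words together so the reducer can sum each
    # run of pairs without holding every word in memory at once.
    pairs = sorted(map_words(lines))
    return dict(reduce_counts(pairs))


def most_frequent(counts, n):
    """Return the n (word, count) pairs with the highest counts."""
    return heapq.nlargest(n, counts.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical sample input, used only for illustration.
    sample = ["the quick brown fox", "the lazy dog and the fox"]
    for word, count in most_frequent(word_count(sample), 2):
        print(f"{word}\t{count}")
```

When the script is split into separate map and reduce programs, the in-memory sorted() call would be replaced by an external sort between the two stages, which is what lets the same logic scale past the size of memory.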

Review Snapshot

4.3 (based on 3 reviews)

Ratings Distribution

  • 5 Stars (1)
  • 4 Stars (2)
  • 3 Stars (0)
  • 2 Stars (0)
  • 1 Star (0)

100% of respondents would recommend this to a friend.

Pros

  • Easy to understand (3)
  • Helpful examples (3)

Best Uses

  • Novice (3)

Reviewer Profile

  • Developer (3)

    (2 of 2 customers found this review helpful)

     
    4.0

    Takes the Mystery out of MapReduce

    By hoop33

    from Jacksonville, FL

    About Me Developer

    Verified Reviewer

    Pros

    • Easy to understand
    • Helpful examples

    Cons

    • Moves slowly

    Best Uses

    • Novice

    Comments about O'Reilly Media An Introduction to MapReduce with Pete Warden:

    Founder of OpenHeatMap Pete Warden provides a simple introduction to MapReduce, the Google-created framework for examining large datasets. This video tutorial includes four videos: an intro to the topic, writing your first MapReduce job, and then two videos on running MapReduce jobs on the Amazon Elastic MapReduce cloud service. By the time you're done watching the videos, you'll understand what MapReduce is and how to write simple MapReduce jobs, and you'll be ready to move on to more advanced MapReduce topics.

    The videos use Python as the language for writing your mapper and reducer, but the code stays simple and even programmers unfamiliar with Python should follow along just fine. The tutorial has you writing real code, working on real data, and producing real results. You learn the principles necessary to apply to more advanced mapping and reducing scenarios. You can reasonably launch from the course material to write complex MapReduce jobs, depending on your programming skills and imagination.

    This is definitely an introduction; the material moves a little slowly, and some of the questions from the audience are pretty basic. Part of the Amazon segment, for example, is spent troubleshooting some of the audience members' setups, though Warden's lead is straightforward throughout. For novice programmers, I'm sure the troubleshooting scenes are valuable; for professional programmers, they can be a bit tedious. It's a small price to pay, however, for the information you'll extract.

    If you're looking to take the mystery out of MapReduce and understand how to use the Amazon Elastic MapReduce service to run your own MapReduce jobs, you'll find what you need here.

    (4 of 4 customers found this review helpful)

     
    4.0

    MapReduce simplified

    By John Brady

    from Exeter, RI

    About Me Designer, Developer, Maker, Sys Admin

    Verified Reviewer

    Pros

    • Concise
    • Easy to understand
    • Helpful examples

    Cons

      Best Uses

      • Novice
      • Student

      An Introduction to MapReduce is a video offering from O'Reilly which provides a simple introduction to the use of map-reduce without a great deal of overhead. The product contains four video segments, starting with a description of the difficulties encountered when using simple scripting against large data sets and swiftly moving into the use of Python scripts to implement a map-reduce job.

      From there, the scripts are migrated to the Amazon map-reduce offerings, to demonstrate that the same algorithm can be used in a more sophisticated (Hadoop) environment. The use of Amazon tools consumes two segments, or approximately half the content of this product. An Amazon account will therefore be necessary to fully participate in the exercises.

      The provided example case (word count from a novel) is easily understood and does not interfere with the concept presentation. Python is used, but at a novice level, so a deep understanding of that language is not required; obviously access to a machine with Python installed would be helpful in order to run the jobs locally.

      One minor problem with the product is that the related content links mentioned within the video do not appear to have been provided; these resources are not essential to use of the videos.

      Overall, if you've had difficulty with the concept of map-reduce, this product would be worth a look; the best audience is those who have not (successfully) run a map-reduce job on their own.

      (1 of 1 customers found this review helpful)

       
      5.0

      Highly recommended.

      By LP

      from San Francisco, CA

      About Me Developer

      Verified Reviewer

      Pros

      • Accurate
      • Concise
      • Easy to understand
      • Helpful examples

      Cons

        Best Uses

        • Intermediate
        • Novice
        • Student

        As one of the students in the session, I can say this session is just as great on video as it was live. Pete is a great teacher; he makes MapReduce accessible in this session and demonstrates through practical examples how easy it is to understand and use.

        I thought MapReduce was a lot more complicated before this session and that I would need years of coding experience before I could use it. Turns out I just needed 30 minutes to understand it and apply it right away.

        Highly recommended.

        Buying Options
        Video: $69.99
        (Streaming, Downloadable)