Parallel R
Data Analysis in the Distributed World
Publisher: O'Reilly Media
Released: October 2011
Pages: 126

It’s tough to argue with R as a high-quality, cross-platform, open source statistical software product—unless you’re in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets, including three chapters on using R and Hadoop together. You’ll learn the basics of Snow, Multicore, Parallel, Segue, RHIPE, and Hadoop Streaming, including how to find them, how to use them, when they work well, and when they don’t.

With these packages, you can overcome R’s single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R’s memory barrier.

  • Snow: works well in a traditional cluster environment
  • Multicore: popular for multiprocessor and multicore computers
  • Parallel: part of the upcoming R 2.14.0 release
  • R+Hadoop: provides low-level access to a popular form of cluster computing
  • RHIPE: uses Hadoop’s power with R’s language and interactive shell
  • Segue: lets you use Elastic MapReduce as a backend for lapply-style operations
Customer Reviews

Average rating: 3.0 (based on 2 reviews)

Ratings Distribution

  • 5 Stars: 0
  • 4 Stars: 1
  • 3 Stars: 0
  • 2 Stars: 1
  • 1 Star: 0

(0 of 2 customers found this review helpful)

 
2.0

It's about parallel R

By Swalker

from Seattle, WA

Comments about oreilly Parallel R:

It's OK, but it's pretty obvious early on that R is not as robust when it comes to something you might want to run in a multi-process application. Yes, you can program your own use cases, but if you're going to do that, use a more complete language like Python or Java, which have the same or better multiprocessing capabilities.

 
4.0

Good, small, fast.

By Encode__

from Stockholm, Sweden

About Me: Developer

Pros

  • Concise
  • Easy to understand

Best Uses

  • Expert
  • Intermediate

Comments about oreilly Parallel R:

This is a short guide to a handful of parallel R libraries. There's also a lot of emphasis on how to connect R to the world of big data: three chapters are dedicated to running R and Hadoop (R+Hadoop, RHIPE, and Segue). I think it's a good, short read for the more experienced R programmer, and I also like the small size.

Buying Options

  • Ebook: $19.99 (formats: DAISY, ePub, Mobi, PDF)
  • Print & Ebook: $24.19
  • Print: $21.99