Data Algorithms
Recipes for Scaling up with Hadoop and Spark
Publisher: O'Reilly Media
Final Release Date: August 2014
Pages: 500

With Early Release ebooks, you get books in their earliest form—the author's raw and unedited content as he or she writes—so you can take advantage of these technologies long before the official release of these titles. You'll also receive updates when significant changes are made, new chapters as they're written, and the final ebook bundle.

Learn the algorithms and tools you need to build MapReduce applications with Hadoop for processing gigabyte, terabyte, or petabyte-sized datasets on clusters of commodity hardware. With this practical book, author Mahmoud Parsian, head of the big data team at Illumina, takes you step-by-step through the design of machine-learning algorithms, such as Naive Bayes and Markov Chain, and shows you how to apply them to clinical and biological datasets, using MapReduce design patterns.

  • Apply MapReduce algorithms to clinical and biological data, such as DNA-Seq and RNA-Seq
  • Use the regression and analytical algorithms most relevant to different biological data types
Customer Reviews

REVIEW SNAPSHOT® by PowerReviews
O'Reilly Data Algorithms

4.8 (based on 4 reviews)

Ratings Distribution

  • 5 Stars (3)
  • 4 Stars (1)
  • 3 Stars (0)
  • 2 Stars (0)
  • 1 Star (0)

100% of respondents would recommend this to a friend.

Pros

  • Accurate (4)
  • Concise (4)
  • Easy to understand (4)
  • Helpful examples (4)
  • Well-written (4)

Cons

Best Uses

  • Intermediate (4)
  • Expert (3)
  • Student (3)

Reviewer Profile

  • Developer (4)
  • Designer (3)

Reviewed by 4 customers

     
5.0

Covering a wide variety of MapReduce

By Susan Z

from Cupertino, CA

About Me: Designer, Developer

Verified Reviewer

Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

Cons

Best Uses

  • Expert
  • Intermediate

Comments about O'Reilly Data Algorithms:

Enjoyed reading this book (can use for my work!): covers a wide variety of MapReduce and Spark programs. This is the first MR book which covers DNA-Seq and other statistical algorithms. Well done!

       
5.0

Great MapReduce Book

By Sprintmoun100

from Fargo, ND

About Me: Designer, Developer

Verified Reviewer

Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

Cons

Best Uses

  • Expert
  • Intermediate
  • Student

Comments about O'Reilly Data Algorithms:

Great MapReduce book on a variety of topics. Detailed examples on using Spark and Hadoop for MapReduce algorithms. The best part is that all solutions have source code on GitHub: https://github.com/mahmoudparsian/data-algorithms-book

         
5.0

MapReduce is nicely explained!

By Mike Hanif

from Falls Church, VA

About Me: Designer, Developer, Educator

Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

Cons

Best Uses

  • Expert
  • Intermediate
  • Student

Comments about O'Reilly Data Algorithms:

The author has given solid and working examples using MapReduce, Hadoop, and Spark. The range of algorithms spans from basic to sophisticated (such as Markov chains, DNA-Sequencing, Naive Bayes, kNN, ...). I have already applied some of the MapReduce algorithms in my work. Spark examples show step-by-step how to apply data algorithms to solve real problems.
Some of the shell scripts need to be polished (I am sure they will be, since it is an early release!).

          (1 of 1 customers found this review helpful)

           
4.0

Great Book, BUT....

By Don E

from Phoenix, AZ

About Me: Developer

Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

Cons

Best Uses

  • Intermediate
  • Novice
  • Student

Comments about O'Reilly Data Algorithms:

I am happy to be putting this out before the book is out...

I have read the first six chapters and I really like it. But one thing I have a problem with is the use of the old API instead of the new one.

For instance, using the JobConf class (which I thought was deprecated) instead of the Job class in the new API.

I tried to get this across to Mahmoud Parsian, but I was unable to find an email address. So could someone please pass the message along?

Buying Options
Pre-Order Print: $69.99
March 2015 (est.)