Learning Spark
Lightning-Fast Big Data Analysis
Publisher: O'Reilly Media
Final Release Date: January 2015
Pages: 274

Data in all domains is getting bigger. How can you work with it efficiently? This book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala.

Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You’ll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.

  • Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell
  • Leverage Spark’s powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib
  • Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm
  • Learn how to deploy interactive, batch, and streaming applications
  • Connect to data sources including HDFS, Hive, JSON, and S3
  • Master advanced topics like data partitioning and shared variables
Customer Reviews

REVIEW SNAPSHOT®

by PowerReviews
oreilly Learning Spark

4.1 out of 5 (based on 16 reviews)

Ratings Distribution

  • 5 Stars (6)
  • 4 Stars (7)
  • 3 Stars (2)
  • 2 Stars (1)
  • 1 Star (0)

86% of respondents would recommend this to a friend.

Pros

  • Easy to understand (14)
  • Well-written (11)
  • Accurate (10)
  • Helpful examples (10)
  • Concise (9)

Cons

  • Not comprehensive enough (4)

Best Uses

  • Intermediate (12)
  • Novice (9)
  • Student (5)

Reviewer Profile

  • Developer (11)
  • Designer (3)

Reviewed by 16 customers

Displaying reviews 1-10

 
5.0

Great for Beginners!

By sbalajis

from Hackettstown, NJ

About Me Sys Admin

Verified Buyer

Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

Cons

    Best Uses

    • Intermediate
    • Novice
    • Student

    Comments about oreilly Learning Spark:

    Excellent guide for quick and precise learning.

     
    4.0

    Good intro, but update is needed

    By renodino

    from Reno, NV

    About Me Developer

    Verified Reviewer

    Pros

    • Easy to understand
    • Well-written

    Cons

    • Not comprehensive enough
    • Outdated After 1 Month

    Best Uses

    • Intermediate
    • Novice

    Comments about oreilly Learning Spark:

    Provides a good surface-level introduction, but could use more robust examples and maybe a deeper dive in some subject areas. Also, less than a month after the final release of the book, the new Spark 1.3 has invalidated many of the examples (esp. Spark SQL). Under those circumstances, I think updates to the ebook should be made available.

    (0 of 4 customers found this review helpful)

     
    2.0

    Major mistakes concerning Windows support

    By Al

    from Philadelphia, USA

    Pros

    • Concise
    • Easy to understand
    • Helpful examples
    • Well-written

    Cons

      Best Uses

      • Intermediate

      Comments about oreilly Learning Spark:

      I just browsed the book, but right at the start the authors claim that Spark can be installed on any system with Java and Python installed. I am sure they never tried to install a pre-built package for Windows (there is none), and none of the pre-built packages works on Windows (because of the Hadoop dependency).

      (2 of 4 customers found this review helpful)

       
      3.0

      Maybe Spark isn't for real data

      By iceback

      from SLC

      Verified Reviewer

      Pros

      • Concise
      • Easy to understand
      • Well-written

      Cons

      • Not comprehensive enough

      Best Uses

        Comments about oreilly Learning Spark:

        Definitely more thorough than most of the readily available examples out there, but it really doesn't go much beyond them. Maybe it's just me, but certainly people are using Spark for things other than word count? Is "Big Data" really little more than bloated collections of independent strings?

         
        4.0

        Met my expectations

        By Emma

        from Spain

        About Me Developer

        Verified Buyer

        Pros

        • Accurate
        • Easy to understand

        Cons

        • Not comprehensive enough

        Best Uses

        • Intermediate

        Comments about oreilly Learning Spark:

        The perfect book to learn Apache Spark and get prepared for Spark Developer Certification.

        (1 of 1 customers found this review helpful)

         
        5.0

        If you are learning Spark, read this book

        By just learning

        from Seattle, WA

        About Me Analyst, Developer

        Verified Reviewer

        Pros

        • Accurate
        • Concise
        • Easy to understand
        • Helpful examples
        • Well-written

        Cons

          Best Uses

          • Novice
          • Student

          Comments about oreilly Learning Spark:

          There are many types of resources out there for learning Spark, but Learning Spark pulls together what you really need to keep in mind as you develop. I had taken a Spark class and watched many videos, and I still needed this book to fill in some of the gaps.

          I think it works as bridging material for both the data scientist persona and the software engineer persona. The book manages to answer the relevant practical questions that both will have while getting started with Spark. It does this in an extremely accessible and clear explanatory style.

          First you will learn the main abstractions of Spark, and its particulars. There are useful code examples in the 3 main API languages. Then you will begin to learn some of the more advanced features, as well as starting to develop a basic understanding about how Spark and Spark applications are administered and tuned for performance. The book is helpful in developing an appreciation for how a Spark cluster could be a unifying mixed-use platform, engaging various different personnel within an organization.

          In the final chapters you will get a small flavor of the parts of the stack which sit on top of Spark: MLlib, Spark SQL, Spark Streaming, and GraphX. After reading this book, I feel prepared to continue practicing hands-on with Spark, and particularly to deeply understand many of the other materials which I have come across.

          Grateful for Learning Spark.

           
          5.0

          Using Spark? Buy this book.

          By Tony Duarte

          from Silicon Valley, CA

          About Me Developer, Educator

          Verified Buyer

          Pros

          • Accurate
          • Good for beginners
          • Helpful examples

          Cons

            Best Uses

            • Novice
            • Student

            Comments about oreilly Learning Spark:

            Being charitable, the official Spark documentation might be described as "sparse".

            So having a book such as this, which covers the basics, really helps. Of course, I wish there were more details - but I'm mostly just grateful that the book exists.

            (6 of 8 customers found this review helpful)

             
            4.0

            Review of Learning Spark

            By Arun

            from San Jose, CA

            About Me Developer

            Verified Buyer

            Pros

            • Accurate
            • Concise
            • Easy to understand

            Cons

            • Not comprehensive enough

            Best Uses

            • Intermediate

            Comments about oreilly Learning Spark:

            I am still reading the book, so these are preliminary comments. I'll continue to add comments as I read more chapters and more chapters become available.

            The biggest shortcoming is the lack of Java 8 examples. Java 8 is gaining rapid adoption and when the book comes out in Feb 2015, it will be the preferred way of computing with Spark in Java. Here are the suggestions in preferred order:

            1. Include Java 8 examples along with Java 7 examples in the book. They will not take much space since they will be as compact as the Python examples.
            2. If the Java 8 examples are not in the book, all the examples with their Java 8 equivalents should be made available on GitHub *on the day the book is released*.

            (1 of 1 customers found this review helpful)

             
            3.0

            Good book for beginners

            By Tarun

            from San Francisco, CA

            About Me Designer, Developer

            Verified Buyer

            Pros

            • Accurate
            • Easy to understand
            • Helpful examples

            Cons

            • Too basic

            Best Uses

            • Intermediate

            Comments about oreilly Learning Spark:

            The book is good, but I was expecting more, maybe because this is the only book available on the market.

            I am looking for:
            1. More examples.
            2. API-level descriptions.
            3. Best practices (if any)

             
            5.0

            Thank You, Thank You, Thank You

            By 2bz4SQL

            from SOMA & Davis

            About Me Big Data Architect

            Verified Buyer

            Pros

            • Easy to understand
            • Helpful examples
            • Well-written

            Cons

              Best Uses

              • Intermediate
              • Novice

              Comments about oreilly Learning Spark:

              Spark is such a fast moving target that finding relevant, non-obsolete advice & examples is a difficult task. This book has really made this task much simpler. I have a much better understanding of the concepts now and this has really helped me to add Spark to an existing Cassandra project. I look forward to the additional chapters.

              I have downloaded just about every piece of documentation from the Databricks site, and watched just about every webinar or PowerPoint slide that I could find, and this book has really helped to fill in the gaps and to help me understand the finer points of the excellent Databricks presentations from Paco & the rest.

              Buying Options
              Ebook: $33.99
              Formats:  DAISY, ePub, Mobi, PDF
              Print & Ebook: $43.99
              Print: $39.99