High Performance Spark
Best practices for scaling and optimizing Apache Spark
Publisher: O'Reilly Media
Final Release Date: March 2016
Pages: 175

With Early Release ebooks, you get books in their earliest form—the author's raw and unedited content as he or she writes—so you can take advantage of these technologies long before the official release of these titles. You'll also receive updates when significant changes are made, new chapters are available, and the final ebook bundle is released.

If you’ve successfully used Apache Spark to solve medium-sized problems, but still struggle to realize the "Spark promise" of unparalleled performance on big data, this book is for you. High Performance Spark shows you how to take advantage of Spark at scale, so you can grow beyond the novice level. It’s ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications.

  • Learn how to make Spark jobs run faster
  • Productionize exploratory data science with Spark
  • Handle even larger data sets with Spark
  • Reduce pipeline running times for faster insights
Customer Reviews

REVIEW SNAPSHOT® by PowerReviews

High Performance Spark: 4.5 (based on 2 reviews)

Ratings Distribution

  • 5 Stars (1)
  • 4 Stars (1)
  • 3 Stars (0)
  • 2 Stars (0)
  • 1 Star (0)


(1 of 1 customers found this review helpful)

 
5.0

The best book on writing production-ready Spark code

By Ewan

from Manchester, UK

About Me: Developer

Verified Reviewer

Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

Cons

    Best Uses

    • Expert
    • Intermediate
    • Novice

    Comments about High Performance Spark:

    There are quite a few good books on getting started with Spark, launching the interactive shell, running a few queries, and so on, but this book is fairly unique in showing you how to get the best out of the Spark programming APIs.

    The chapter on "Joins", covering the RDD, DataFrame, and Dataset APIs, will alone save you hours, if not days, of research.

    (0 of 8 customers found this review helpful)

     
    4.0

    I would like to purchase this book

    By Srini

    from India

    Comments about High Performance Spark:

    I would like to purchase this book, but it is still in the Early Release category. May I know when it will be ready with all topics?

    Does it cover equivalent Java examples as well?

    What knowledge do we need to have to understand the book?


     
    Buying Options
    Pre-Order Print: $39.99, March 2017 (est.)