Spark: Big Data Cluster Computing in Production
By Ilya Ganelin, Ema Orhian, Kai Sasaki, Brennon York
Publisher: Wiley
Final Release Date: March 2016
Pages: 216


Spark's popularity means the field is expanding in both use and capability. Faster than Hadoop MapReduce, and compatible with Java®, Scala, Python®, and R, this open-source cluster computing framework is becoming a must-have skill. Spark: Big Data Cluster Computing in Production goes beyond the basics to show you how to bring Spark to real-world production environments. With expert instruction, real-life use cases, and frank discussion, this guide helps you move past the challenges and take proof-of-concept Spark applications live.

  • Fine-tune your Spark app to run on production data
  • Manage resources, organize storage, and master monitoring
  • Learn about potential problems from real-world use cases, and see where Spark fits best
  • Estimate cluster size and nail down hardware requirements
  • Tune up performance with memory management, partitioning, shuffling, and more
  • Ensure data security with Kerberos
  • Head off Spark Streaming problems in production
  • Integrate Spark with YARN, Mesos, Tachyon, and more
Ebook: $50.00
Formats: ePub, Mobi, PDF