Spark
Big Data Cluster Computing in Production
By Ilya Ganelin, Ema Orhian, Kai Sasaki, Brennon York
Publisher: Wiley
Final Release Date: March 2016
Pages: 216

TIPS, TRICKS, AND SOLUTIONS FOR USING SPARK IN PRODUCTION

Spark's popularity means the field is expanding in both use and capability. Faster than Hadoop and MapReduce, but compatible with Java®, Scala, Python®, and R, this open source cluster computing framework is becoming a must-have skill. Spark: Big Data Cluster Computing in Production goes beyond the basics to show you how to bring Spark to real-world production environments. With expert instruction, real-life use cases, and frank discussion, this guide helps you move past the challenges and bring proof-of-concept Spark applications live.

  • Fine-tune your Spark app to run on production data
  • Manage resources, organize storage, and master monitoring
  • Learn about potential problems from real-world use cases, and see where Spark fits best
  • Estimate cluster size and nail down hardware requirements
  • Tune up performance with memory management, partitioning, shuffling, and more
  • Ensure data security with Kerberos
  • Head off Spark Streaming problems in production
  • Integrate Spark with YARN, Mesos, Tachyon, and more
 
Buying Options
Ebook: $50.00
Formats: ePub, Mobi, PDF