Large Scale Machine Learning with Spark
By Md. Rezaul Karim, Md. Mahedi Kaysar
Publisher: Packt Publishing
Final Release Date: October 2016
Pages: 476

Discover everything you need to build robust machine learning applications with Spark 2.0

About This Book

  • Get the most up-to-date book on the market that focuses on design, engineering, and scalable solutions in machine learning with Spark 2.0.0
  • Use Spark's machine learning library in a big data environment
  • Learn how to develop high-value applications at scale with ease and develop a personalized design

Who This Book Is For

This book is for data science engineers and scientists who work with large and complex data sets. You should be familiar with the basics of machine learning concepts, statistics, and computational mathematics. Knowledge of Scala and Java is advisable.

What You Will Learn

  • Get a solid theoretical understanding of ML algorithms
  • Configure Spark on cluster and cloud infrastructure to develop applications using Scala, Java, Python, and R
  • Scale up ML applications on large cluster or cloud infrastructures
  • Use Spark ML and MLlib to develop ML pipelines for recommendation systems, classification, regression, clustering, sentiment analysis, and dimensionality reduction
  • Handle large texts for developing ML applications with a strong focus on feature engineering
  • Use Spark Streaming to develop ML applications for real-time streaming
  • Tune ML models with cross-validation, hyperparameter tuning, and train-validation split
  • Enhance ML models to make them adaptable for new data in dynamic and incremental environments

In Detail

Data processing, implementing related algorithms, tuning, scaling up and finally deploying are some crucial steps in the process of optimising any application.

Spark can handle large-scale batch and streaming data, figure out when to cache data in memory, and process it up to 100 times faster than Hadoop-based MapReduce. This means predictive analytics can be applied to both streaming and batch data to develop complete machine learning (ML) applications much more quickly, making Spark an ideal candidate for large data-intensive applications.

This book focuses on design, engineering, and scalable solutions using ML with Spark. First, you will learn how to install Spark with all the new features from the Spark 2.0 release. Moving on, you'll explore important concepts such as advanced feature engineering with RDDs and Datasets. After studying how to develop and deploy applications, you will see how to use external libraries with Spark.

In summary, you will be able to develop complete and personalised ML applications, from data collection, model building, tuning, and scaling up to deployment on a cluster or the cloud.

Style and approach

This book takes a practical approach where all the topics explained are demonstrated with the help of real-world use cases.

Customer Reviews

Review Snapshot by PowerReviews

4.3 (based on 6 reviews)

Ratings Distribution

  • 5 Stars (2)
  • 4 Stars (4)
  • 3 Stars (0)
  • 2 Stars (0)
  • 1 Star (0)

100% of respondents would recommend this to a friend.

Pros

  • Helpful examples (6)
  • Concise (5)
  • Easy to understand (5)
  • Well-written (5)
  • Accurate (4)

Cons

  • Not comprehensive enough (4)

Best Uses

  • Intermediate (6)
  • Expert (3)

Reviewer Profile

  • Developer (6)

Reviewed by 6 customers


 
5.0

Very helpful and I would buy it again

By Menshawy

from Dublin, Ireland

About Me Developer

Verified Reviewer

Pros

  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

Cons

    Best Uses

    • Intermediate

    Comments about oreilly Large Scale Machine Learning with Spark:

    This book is very helpful and the examples are well explained. You will get to do large-scale machine learning with Spark in an incremental approach.

     
    4.0

    A good book with lots of practical examples

    By Jason Roy

    from Aachen, Germany

    About Me Developer

    Verified Reviewer

    Pros

    • Accurate
    • Concise
    • Helpful examples
    • Well-written

    Cons

    • Difficult to understand
    • Not comprehensive enough

    Best Uses

    • Intermediate
    • Student

    Comments about oreilly Large Scale Machine Learning with Spark:

    Very useful book with lots of practical examples. The style of the book is fantastic. The book takes a bottom-up approach, i.e., from feature engineering to model deployment.

    However, the examples should have been more elaborate and comprehensive. Moreover, it would have been even better if Scala were used instead of Java.

     
    4.0

    A useful book at a reasonable price

    By Bianca

    from Sydney, Australia

    About Me Data Scientist, Developer

    Pros

    • Concise
    • Easy to understand
    • Helpful examples
    • Well-written

    Cons

    • Not comprehensive enough

    Best Uses

    • Expert
    • Intermediate

    Comments about oreilly Large Scale Machine Learning with Spark:

    The book is well written and concisely presented, with lots of real-life examples. In addition, the amount of theory makes the book perfect for those who are planning to solve machine learning problems with Spark's APIs.

    Another thing is that the applications are developed using Java, which is better for data scientists who come from a MapReduce background.

    However, the issue I found is that there is no comprehensive discussion of the ML algorithms used to solve the problems.

     
    5.0

    Very good understanding of machine learning with Spark

    By James

    from San Jose, CA, USA

    About Me Developer, Researcher

    Pros

    • Accurate
    • Concise
    • Easy to understand
    • Helpful examples
    • Well-written

    Cons

    • Should Have More Theory

    Best Uses

    • Expert
    • Intermediate

    Comments about oreilly Large Scale Machine Learning with Spark:

    This is an excellent book to help big data engineers and data scientists get started with Spark. It doesn't require much prior knowledge of machine learning or Spark.

    The book goes through different examples of business needs and explains the process of constructing very basic machine learning systems from end to end. It's well written, with a clear and logical structure and 20 real-life examples.

    The code is written mainly in Java; therefore, people from a MapReduce background will enjoy it a lot. I was familiar with machine learning concepts but not with Spark; this book helped me understand how to build large-scale machine learning applications in a very efficient way, so I would definitely recommend it.

    (0 of 1 customers found this review helpful)

     
    4.0

    Well written with lots of machine learning examples.

    By Donal

    from Cork, Ireland

    About Me Developer

    Pros

    • Accurate
    • Easy to understand
    • Helpful examples
    • Well-written

    Cons

    • Not comprehensive enough

    Best Uses

    • Intermediate
    • Student

    Comments about oreilly Large Scale Machine Learning with Spark:

    This book is useful for both beginner and experienced developers who want to develop large-scale machine learning applications with Apache Spark. The great thing I found about this book is that lots of practical examples are given.

     
    4.0

    An excellent book for intermediate & expert users!

    By David

    from Dublin, Ireland

    About Me Developer

    Pros

    • Accurate
    • Concise
    • Easy to understand
    • Helpful examples

    Cons

    • Not comprehensive enough

    Best Uses

    • Expert
    • Intermediate

    Comments about oreilly Large Scale Machine Learning with Spark:

    An excellent book for new as well as experienced data scientists and big data engineers who want to build robust machine learning applications with Spark 2.0. Recommended!!

    Buying Options
    Immediate Access - Go Digital
    Ebook:  $39.99
    Formats:  ePub, Mobi, PDF