Introduction to Apache Spark

Video description

Get up to speed on Apache Spark for building big data applications in Python, Java, or Scala. Recently updated with nearly an hour of new footage on DataFrames in Spark 1.3, this video workshop shows you how to explore data and apply algorithms with MLlib, GraphX, and Spark SQL. You’ll learn Spark and its core APIs by doing hands-on technical exercises with presenter Paco Nathan, host of the popular Just Enough Math video workshop.

With this workshop, you will:

  • Get going with the newest features of Spark 1.3
  • Open a Spark shell
  • Develop Spark apps for typical use cases
  • Use some machine-learning algorithms
  • Explore data sets loaded from HDFS or another filesystem
  • Work with Spark SQL, Spark Streaming, and Spark’s machine-learning library, MLlib
  • Use Maven, SBT, IPython Notebook, and other tooling
  • Learn about Spark follow-up courses and certification

Paco Nathan has led innovative data teams building large-scale apps for several years. He’s an expert in distributed systems, machine learning, cloud computing, and functional programming.

Product information

  • Title: Introduction to Apache Spark
  • Author(s): Paco Nathan
  • Release date: March 2015
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781491919729