Below are the video training courses included in this Learning Path.
Introduction to Big Data
Presented by Vladimir Bacvanski (2 hours 58 minutes)
Start your exploration of big data, Hadoop, NoSQL, and related technologies here. You’ll learn what big data is and how to process it with MapReduce and Hadoop, including several ways to program big data applications. You’ll also cover NoSQL stores and their best uses and then conclude with NoSQL in the enterprise.
Learning Apache Cassandra
Presented by Ruth Stryker (8 hours 6 minutes)
Apache Cassandra is a distributed database management system for handling large amounts of data across many commodity servers. Get a solid understanding of Cassandra as you learn to use it for your own development projects. Begin with the basics of installing and communicating with Cassandra, then learn to create an application, work with clusters, and more.
Introduction to Apache Kafka
Presented by Gwen Shapira (2 hours 55 minutes)
Currently one of the hottest projects across the Hadoop ecosystem, Apache Kafka is a distributed, real-time data system that functions in a manner similar to a pub/sub messaging service, but with better throughput, built-in partitioning, replication, and fault tolerance. In this course, you’ll learn how to integrate Kafka into a data processing pipeline and become familiar with the entire Kafka ecosystem.
Introduction to Apache Spark
Presented by Paco Nathan (4 hours 46 minutes)
Get up to speed on Apache Spark for building big data applications in Python, Java, or Scala. This course teaches you how to explore data and apply algorithms with MLlib, GraphX, and Spark SQL. You’ll learn Spark and its core APIs by doing hands-on technical exercises with presenter Paco Nathan.
Building Big Data Platforms
Presented by O'Reilly Media, Inc. (5 hours 45 minutes)
What kinds of platforms have Netflix, LinkedIn, CERN, and PayPal constructed to handle big data operations unique to their businesses? And how can you apply some of these solutions to your own business? This course presents case studies from a variety of organizations that will be helpful as you build your own big data infrastructure.
Architectural Considerations for Hadoop Applications
Presented by Mark Grover, Gwen Shapira, Jonathan Seidman, and Ted Malaska (2 hours 31 minutes)
Implementing solutions with Apache Hadoop requires understanding not just Hadoop, but a broad range of related projects in the Hadoop ecosystem such as Hive, Pig, Oozie, Sqoop, and Flume. Using clickstream analytics as an end-to-end example, you’ll see how to architect and implement a complete solution with Hadoop.
An Introduction to Time Series with Team Apache
Presented by Patrick McFadin (3 hours 50 minutes)
As it becomes easier to create data, we’re faced with the need to collect and analyze it at scales never seen before. Learn how to solve time-series data problems with technologies from Team Apache: Kafka, Spark, and Cassandra. Using these technologies, you’ll work with an example weather collection network and the challenges it can produce.