Video description
Whether you’re a data engineer who needs to plan and implement a big data pipeline or a manager interested in learning how tools in the Hadoop technology stack address business goals, these videos will walk you through how to plan your big data solution. You’ll receive an introduction to the concepts of Apache Hadoop, and training on key components including Apache HBase, YARN, Cassandra, Kafka, and Spark.
Table of contents
- Introduction
- What Is Hadoop?
- Options For Data Input
- Hadoop Tools
- Conclusion
- Introduction
- Core Hadoop Components
- YARN: Components And Architecture
- Scheduling, Running And Monitoring Applications In YARN
- Conclusion
- Introduction
- Administration Basics
- Troubleshooting
- Tuning
- Operations Continuity
- Ecosystem
- Conclusion
- Introduction To Cassandra
- Getting Started With The Architecture
- Installing Cassandra
- Communicating With Cassandra
- Creating A Database
- Creating A Table
- Inserting Data
- Modeling Data
- Creating An Application
- Understanding Cassandra Drivers
- Exploring The DataStax Java Driver
- Setting Up A Development Environment
- Creating An Application Page
- Acquiring The DataStax Java Driver Files
- Getting The DataStax Java Driver Files Through Maven
- Providing The DataStax Java Driver Files Manually
- Connecting To A Cassandra Cluster
- Executing A Query
- Displaying Query Results - Part 1
- Displaying Query Results - Part 2
- Using An MVC Pattern
- Pop Quiz - Creating an Application
- Lab: Create A Second Application - Part 1
- Lab: Create A Second Application - Part 2
- Lab: Create A Second Application - Part 3
- Updating And Deleting Data
- Selecting Hardware
- Adding Nodes To A Cluster
- Understanding Cassandra Nodes
- Having A Network Connection - Part 1
- Having A Network Connection - Part 2
- Having A Network Connection - Part 3
- Specifying The IP Address Of A Node In Cassandra
- Specifying Seed Nodes
- Bootstrapping A Node
- Cleaning Up A Node
- Using cassandra-stress
- Pop Quiz - Adding Nodes to a Cluster
- Lab: Add A Third Node
- Monitoring A Cluster
- Repairing Nodes
- Removing A Node
- Redefining A Cluster For Multiple Data Centers
- Resources For Further Learning
- Accessing Documentation
- Reading Blogs And Books
- Watching Video Recordings
- Posting Questions
- Attending Events
- Wrap Up
- The Case for Kafka
- The Basics
- Setting up a Kafka Cluster
- Writing a Kafka Producer
- Writing a Kafka Consumer
- Using Kafka from Python
- Troubleshooting Kafka
- Integrating Kafka and Hadoop with Flafka
- Kafka Availability and Consistency
- Kafka Ecosystem
- Future of Kafka
- Pre-Flight Check
- Spark Deconstructed
- A Brief History
- Simple Spark Apps
- Spark Essentials
- Spark Examples
- Unifying the Pieces - Spark SQL
- Unifying the Pieces - Spark Streaming
- Unifying the Pieces - MLlib and GraphX
- Unified Workflows Demo
- The Full SDLC
- Developer Certification
- Resources
- Introduction - Why DataFrames?
- ETL to Prepare the Data from Capital Bikeshare
- Create a DataFrame, Explore using SQL
- Data Preparation for Machine Learning Models
- Build a Classifier Using Naive Bayes
- Build a Classifier Using Decision Trees
- Build a Classifier Using Random Forests
- Use a DataFrame to Compare Models
- Parquet as a Best Practice with DataFrames
- How to Store a DataFrame with Parquet
- How to Read a DataFrame Back in From Parquet
- Use SQL to Estimate Route Durations
- Data Preparation for GraphX - Model Route Costs
- Use PageRank to Rank Popular Stations
- Optimize Routes to Columbus Circle
- Compare Results with Google Maps
- Analyze a Popular Tourist Route
- Examples of How to Use DataFrames in Python
- Summary - The New DataFrames Features in Spark
- Introduction
- Using Alluxio Locally
- Examples With Alluxio
- Deploying Alluxio On A Cluster
- Conclusion
Product information
- Title: A Beginner's Guide to Architecting Big Data Applications
- Author(s):
- Release date: December 2016
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491978610