Intermediate Data Science with R

Video Training

If you already know some R and want to extend your skills to big data, machine learning, and distributed computing, this Learning Path will walk you through the techniques you need to know, including: advanced data wrangling; working with R packages like dplyr, tidyr, and ggplot2; data modeling; and using tools like Spark, AWS, and AzureML.

Below are the video training courses included in this Learning Path.

1

Expert Data Wrangling With R

Presented by Garrett Grolemund, 3 hours 50 minutes

Data scientists often spend 50–80% of their time preparing and transforming data sets before they begin more formal analysis work. This video tutorial shows you how to streamline your code—and your thinking—by introducing a set of principles and R packages like tidyr, dplyr, and ggvis that make this work much faster and easier.

2

Data Science with Microsoft Azure and R

Presented by Stephen Elston, 6 hours 48 minutes

In this video, you’ll learn how to develop and deploy effective machine learning models in the Microsoft Azure Machine Learning (ML) environment. You’ll learn feature selection and dimensionality reduction, functional programming with R, and R object communications, as well as Azure ML web services, including how to create and update an Azure ML web service.

3

Using R for Big Data with Spark

Presented by Manuel Amunategui, 2 hours 19 minutes

This video will show you how to leverage the power of Spark, distributed computing, and cloud storage to work with data sets too large to process on a single computer. You’ll set up your own extremely low-cost, easily terminated AWS account and work hands-on to create Spark clusters on the Amazon Web Services (AWS) platform; perform cluster-based data modeling using Gaussian generalized linear models, binomial generalized linear models, Naive Bayes, and K-means modeling; access data from S3, Spark DataFrames, and other formats like CSV, JSON, and HDFS; and perform cluster-based data manipulation with tools like SparkR and SparkSQL.