Mastering Machine Learning with R - Third Edition

Book description

Stay updated with expert techniques for solving data analytics and machine learning challenges, gain insights from complex projects, and power up your applications.

Key Features

  • Build independent machine learning (ML) systems leveraging the best features of R 3.5
  • Understand and apply different machine learning techniques using real-world examples
  • Use methods such as multi-class classification, regression, and clustering

Book Description

Given the growing popularity of R, the zero-cost statistical programming environment, there has never been a better time to start applying ML to your data. This book will teach you advanced techniques in ML using the latest code in R 3.5. You will delve into various complex features of supervised learning, unsupervised learning, and reinforcement learning algorithms to design efficient and powerful ML models.

This newly updated edition is packed with fresh examples covering a range of tasks from different domains. Mastering Machine Learning with R starts by showing you how to quickly manipulate data and prepare it for analysis. You will explore simple and complex models and understand how to compare them. You'll also learn to use the latest library support, such as TensorFlow and Keras-R, for performing advanced computations. Additionally, you'll explore complex topics, such as natural language processing (NLP), time series analysis, and clustering, which will further refine your skills in developing applications. Each chapter will help you implement advanced ML algorithms using real-world examples. You'll even be introduced to reinforcement learning, along with its various use cases and models. In the concluding chapters, you'll get a glimpse into how some of these black-box models can be diagnosed and understood.

By the end of this book, you'll be equipped with the skills to deploy ML techniques in your own projects or at work.

What you will learn

  • Prepare data for machine learning methods with ease
  • Understand how to write production-ready code and package it for use
  • Produce simple and effective data visualizations for improved insights
  • Master advanced methods, such as Boosted Trees and deep neural networks
  • Use natural language processing to extract insights in relation to text
  • Implement tree-based classifiers, including Random Forest and Boosted Trees

Who this book is for

This book is for data science professionals, machine learning engineers, or anyone who is looking for the ideal guide to help them implement advanced machine learning algorithms. The book will help you take your skills to the next level and advance further in this field. Working knowledge of machine learning with R is mandatory.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Mastering Machine Learning with R - Third Edition
  3. About Packt
    1. Why subscribe?
    2. Packt.com
  4. Contributors
    1. About the author
    2. About the reviewers
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  6. Preparing and Understanding Data
    1. Overview
    2. Reading the data
    3. Handling duplicate observations
      1. Descriptive statistics
      2. Exploring categorical variables
    4. Handling missing values
    5. Zero and near-zero variance features
    6. Treating the data
      1. Correlation and linearity
    7. Summary
  7. Linear Regression
    1. Univariate linear regression
      1. Building a univariate model
      2. Reviewing model assumptions
    2. Multivariate linear regression
      1. Loading and preparing the data
      2. Modeling and evaluation – stepwise regression
      3. Modeling and evaluation – MARS
      4. Reverse transformation of natural log predictions
    3. Summary
  8. Logistic Regression
    1. Classification methods and linear regression
    2. Logistic regression
    3. Model training and evaluation
      1. Training a logistic regression algorithm
        1. Weight of evidence and information value
        2. Feature selection
        3. Cross-validation and logistic regression
      2. Multivariate adaptive regression splines
      3. Model comparison
    4. Summary
  9. Advanced Feature Selection in Linear Models
    1. Regularization overview
      1. Ridge regression
      2. LASSO
      3. Elastic net
    2. Data creation
    3. Modeling and evaluation
      1. Ridge regression
      2. LASSO
      3. Elastic net
    4. Summary
  10. K-Nearest Neighbors and Support Vector Machines
    1. K-nearest neighbors
    2. Support vector machines
    3. Manipulating data
      1. Dataset creation
      2. Data preparation
    4. Modeling and evaluation
      1. KNN modeling
      2. Support vector machine
    5. Summary
  11. Tree-Based Classification
    1. An overview of the techniques
      1. Understanding a regression tree
      2. Classification trees
      3. Random forest
      4. Gradient boosting
    2. Datasets and modeling
      1. Classification tree
      2. Random forest
        1. Extreme gradient boosting – classification
      3. Feature selection with random forests
    3. Summary
  12. Neural Networks and Deep Learning
    1. Introduction to neural networks
    2. Deep learning – a not-so-deep overview
      1. Deep learning resources and advanced methods
    3. Creating a simple neural network
      1. Data understanding and preparation
      2. Modeling and evaluation
    4. An example of deep learning
      1. Keras and TensorFlow background
      2. Loading the data
      3. Creating the model function
      4. Model training
    5. Summary
  13. Creating Ensembles and Multiclass Methods
    1. Ensembles
    2. Data understanding
    3. Modeling and evaluation
      1. Random forest model
      2. Creating an ensemble
    4. Summary
  14. Cluster Analysis
    1. Hierarchical clustering
      1. Distance calculations
    2. K-means clustering
    3. Gower and PAM
      1. Gower
      2. PAM
    4. Random forest
    5. Dataset background
    6. Data understanding and preparation
    7. Modeling
      1. Hierarchical clustering
      2. K-means clustering
      3. Gower and PAM
      4. Random forest and PAM
    8. Summary
  15. Principal Component Analysis
    1. An overview of the principal components
      1. Rotation
    2. Data
      1. Data loading and review
      2. Training and testing datasets
    3. PCA modeling
      1. Component extraction
      2. Orthogonal rotation and interpretation
      3. Creating scores from the components
      4. Regression with MARS
      5. Test data evaluation
    4. Summary
  16. Association Analysis
    1. An overview of association analysis
      1. Creating transactional data
    2. Data understanding
    3. Data preparation
    4. Modeling and evaluation
    5. Summary
  17. Time Series and Causality
    1. Univariate time series analysis
      1. Understanding Granger causality
    2. Time series data
      1. Data exploration
    3. Modeling and evaluation
      1. Univariate time series forecasting
      2. Examining the causality
        1. Linear regression
        2. Vector autoregression
    4. Summary
  18. Text Mining
    1. Text mining framework and methods
      1. Topic models
      2. Other quantitative analysis
    2. Data overview
      1. Data frame creation
    3. Word frequency
      1. Word frequency in all addresses
      2. Lincoln's word frequency
    4. Sentiment analysis
    5. N-grams
    6. Topic models
    7. Classifying text
      1. Data preparation
      2. LASSO model
    8. Additional quantitative analysis
    9. Summary
  19. Creating a Package
    1. Creating a new package
    2. Summary
  20. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Mastering Machine Learning with R - Third Edition
  • Author(s): Cory Lesmeister
  • Release date: January 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781789618006