Learning Bayesian Models with R

Book description

Become an expert in Bayesian Machine Learning methods using R and apply them to solve real-world big data problems

About This Book

  • Understand the principles of Bayesian Inference with fewer mathematical equations
  • Learn state-of-the-art Machine Learning methods
  • Familiarize yourself with the recent advances in Deep Learning and Big Data frameworks with this step-by-step guide

Who This Book Is For

This book is for statisticians, analysts, and data scientists who want to build a Bayes-based system with R and implement it in their day-to-day models and projects. It is mainly intended for Data Scientists and Software Engineers who are involved in the development of Advanced Analytics applications. To follow this book, it helps to have a basic knowledge of probability theory and analytics and some familiarity with the R programming language.

What You Will Learn

  • Set up the R environment
  • Create a classification model to predict and explore discrete variables
  • Get acquainted with Probability Theory to analyze random events
  • Build Linear Regression models
  • Use Bayesian networks to infer the probability distribution of decision variables in a problem
  • Model a problem using the Bayesian Linear Regression approach with the R package BLR
  • Use the Bayesian Logistic Regression model to classify numerical data
  • Perform Bayesian Inference on very large data sets using MapReduce programs in R and cloud computing

In Detail

Bayesian Inference provides a unified framework for dealing with all sorts of uncertainties when learning patterns from data using machine learning models and when using those patterns to predict future observations. However, learning and implementing Bayesian models is not easy for data science practitioners because of the level of mathematical treatment involved. Applying Bayesian methods to real-world problems also requires substantial computational resources. With recent advances in computation and several open source packages available in R, Bayesian modeling has become more feasible for practical applications. It is therefore advantageous for data scientists and engineers to understand Bayesian methods and apply them in their projects to achieve better results.

Learning Bayesian Models with R starts by giving you comprehensive coverage of Bayesian Machine Learning models and the R packages that implement them. It begins with an introduction to the fundamentals of probability theory and R programming for those who are new to the subject. The book then covers some of the important machine learning methods, both supervised and unsupervised, implemented using Bayesian Inference and R.

Every chapter begins with a theoretical description of the method explained in a very simple manner. Then, relevant R packages are discussed and some illustrations using data sets from the UCI Machine Learning repository are given. Each chapter ends with some simple exercises for you to get hands-on experience of the concepts and R packages discussed in the chapter.

The last chapters are devoted to the latest developments in the field, specifically Deep Learning, which uses a class of Neural Network models that are currently at the frontier of Artificial Intelligence. The book concludes with the application of Bayesian methods to Big Data using the Hadoop and Spark frameworks.

Style and approach

The book first gives you a theoretical description of each Bayesian model in simple language, followed by details of its implementation in the corresponding R package. Each chapter illustrates the use of the Bayesian model and its R package with data sets from the UCI Machine Learning repository. Each chapter also contains exercises for you to get more hands-on practice.

Table of contents

  1. Learning Bayesian Models with R
    1. Table of Contents
    2. Learning Bayesian Models with R
    3. Credits
    4. About the Author
    5. About the Reviewers
    6. www.PacktPub.com
      1. Support files, eBooks, discount offers, and more
        1. Why subscribe?
        2. Free access for Packt account holders
    7. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    8. 1. Introducing the Probability Theory
      1. Probability distributions
      2. Conditional probability
      3. Bayesian theorem
      4. Marginal distribution
      5. Expectations and covariance
        1. Binomial distribution
        2. Beta distribution
        3. Gamma distribution
        4. Dirichlet distribution
        5. Wishart distribution
      6. Exercises
      7. References
      8. Summary
    9. 2. The R Environment
      1. Setting up the R environment and packages
        1. Installing R and RStudio
        2. Your first R program
      2. Managing data in R
        1. Data Types in R
        2. Data structures in R
        3. Importing data into R
        4. Slicing and dicing datasets
        5. Vectorized operations
      3. Writing R programs
        1. Control structures
        2. Functions
        3. Scoping rules
        4. Loop functions
          1. lapply
          2. sapply
          3. mapply
          4. apply
          5. tapply
      4. Data visualization
        1. High-level plotting functions
        2. Low-level plotting commands
        3. Interactive graphics functions
      5. Sampling
        1. Random uniform sampling from an interval
        2. Sampling from normal distribution
      6. Exercises
      7. References
      8. Summary
    10. 3. Introducing Bayesian Inference
      1. Bayesian view of uncertainty
        1. Choosing the right prior distribution
          1. Non-informative priors
          2. Subjective priors
          3. Conjugate priors
          4. Hierarchical priors
        2. Estimation of posterior distribution
          1. Maximum a posteriori estimation
          2. Laplace approximation
          3. Monte Carlo simulations
            1. The Metropolis-Hasting algorithm
              1. R packages for the Metropolis-Hasting algorithm
            2. Gibbs sampling
              1. R packages for Gibbs sampling
          4. Variational approximation
        3. Prediction of future observations
      2. Exercises
      3. References
      4. Summary
    11. 4. Machine Learning Using Bayesian Inference
      1. Why Bayesian inference for machine learning?
      2. Model overfitting and bias-variance tradeoff
      3. Selecting models of optimum complexity
        1. Subset selection
        2. Model regularization
      4. Bayesian averaging
      5. An overview of common machine learning tasks
      6. References
      7. Summary
    12. 5. Bayesian Regression Models
      1. Generalized linear regression
      2. The arm package
      3. The Energy efficiency dataset
      4. Regression of energy efficiency with building parameters
        1. Ordinary regression
        2. Bayesian regression
      5. Simulation of the posterior distribution
      6. Exercises
      7. References
      8. Summary
    13. 6. Bayesian Classification Models
      1. Performance metrics for classification
      2. The Naïve Bayes classifier
        1. Text processing using the tm package
        2. Model training and prediction
      3. The Bayesian logistic regression model
        1. The BayesLogit R package
        2. The dataset
        3. Preparation of the training and testing datasets
        4. Using the Bayesian logistic model
      4. Exercises
      5. References
      6. Summary
    14. 7. Bayesian Models for Unsupervised Learning
      1. Bayesian mixture models
        1. The bgmm package for Bayesian mixture models
      2. Topic modeling using Bayesian inference
        1. Latent Dirichlet allocation
      3. R packages for LDA
        1. The topicmodels package
        2. The lda package
      4. Exercises
      5. References
      6. Summary
    15. 8. Bayesian Neural Networks
      1. Two-layer neural networks
      2. Bayesian treatment of neural networks
      3. The brnn R package
      4. Deep belief networks and deep learning
        1. Restricted Boltzmann machines
        2. Deep belief networks
        3. The darch R package
        4. Other deep learning packages in R
      5. Exercises
      6. References
      7. Summary
    16. 9. Bayesian Modeling at Big Data Scale
      1. Distributed computing using Hadoop
      2. RHadoop for using Hadoop from R
      3. Spark – in-memory distributed computing
      4. SparkR
      5. Linear regression using SparkR
      6. Computing clusters on the cloud
        1. Amazon Web Services
        2. Creating and running computing instances on AWS
        3. Installing R and RStudio
        4. Running Spark on EC2
        5. Microsoft Azure
        6. IBM Bluemix
      7. Other R packages for large scale machine learning
        1. The parallel R package
        2. The foreach R package
      8. Exercises
      9. References
      10. Summary
    17. Index

Product information

  • Title: Learning Bayesian Models with R
  • Author(s): Dr. Hari M. Koduvely
  • Release date: October 2015
  • Publisher(s): Packt Publishing
  • ISBN: 9781783987603