Machine Learning With Go

Book description

Build simple, maintainable, and easy to deploy machine learning applications.

About This Book

  • Build simple but powerful machine learning applications that leverage Go’s standard library along with popular Go packages.
  • Learn the statistics, algorithms, and techniques needed to successfully implement machine learning in Go.
  • Understand when and how to integrate certain types of machine learning models in Go applications.

Who This Book Is For

This book is for Go developers who are familiar with the Go syntax and can develop, build, and run basic Go programs. If you want to explore the field of machine learning and you love Go, then this book is for you! Machine Learning with Go will give readers the practical skills to perform the most common machine learning tasks with Go. Familiarity with some statistics and math topics is necessary.

What You Will Learn

  • Learn about data gathering, organization, parsing, and cleaning.
  • Explore matrices, linear algebra, statistics, and probability.
  • See how to evaluate and validate models.
  • Look at regression, classification, and clustering.
  • Learn about neural networks and deep learning.
  • Utilize time series models and anomaly detection.
  • Get to grips with techniques for deploying and distributing analyses and models.
  • Optimize machine learning workflow techniques.

In Detail

The mission of this book is to turn readers into productive, innovative data analysts who leverage Go to build robust and valuable applications. To this end, the book clearly introduces the technical aspects of building predictive models in Go, but it also helps the reader understand how machine learning workflows are being applied in real-world scenarios.

Machine Learning with Go shows readers how to be productive in machine learning while also producing applications that maintain a high level of integrity. It also gives readers patterns to overcome challenges that are often encountered when trying to integrate machine learning in an engineering organization.

Readers will begin by gaining a solid understanding of how to gather, organize, and parse real-world data from a variety of sources. They will then develop a solid statistical toolkit that will allow them to quickly gain intuition about the content of a dataset. After that, they will gain hands-on experience implementing essential machine learning techniques (regression, classification, clustering, and so on) with the relevant Go packages.

By the end of the book, the reader will have a solid machine learning mindset and a powerful Go toolkit of techniques, packages, and example implementations.

Style and approach

This book connects the fundamental, theoretical concepts behind machine learning to practical implementations using the Go programming language.

Table of contents

  1. Preface
    1. What this book covers
    2. What you need for this book
    3. Who this book is for
    4. Conventions
    5. Reader feedback
    6. Customer support
      1. Downloading the example code
      2. Downloading the color images of this book
      3. Errata
      4. Piracy
      5. Questions
  2. Gathering and Organizing Data
    1. Handling data - Gopher style
    2. Best practices for gathering and organizing data with Go
    3. CSV files
      1. Reading in CSV data from a file
      2. Handling unexpected fields
      3. Handling unexpected types
      4. Manipulating CSV data with data frames
    4. JSON
      1. Parsing JSON
      2. JSON output
    5. SQL-like databases
      1. Connecting to an SQL database
      2. Querying the database
      3. Modifying the database
    6. Caching
      1. Caching data in memory
      2. Caching data locally on disk
    7. Data versioning
      1. Pachyderm jargon
      2. Deploying/installing Pachyderm
      3. Creating data repositories for data versioning
      4. Putting data into data repositories
      5. Getting data out of versioned data repositories
    8. References
    9. Summary
  3. Matrices, Probability, and Statistics
    1. Matrices and vectors
      1. Vectors
      2. Vector operations
      3. Matrices
      4. Matrix operations
    2. Statistics
      1. Distributions
      2. Statistical measures
        1. Measures of central tendency
        2. Measures of spread or dispersion
      3. Visualizing distributions
        1. Histograms
        2. Box plots
    3. Probability
      1. Random variables
      2. Probability measures
      3. Independent and conditional probability
      4. Hypothesis testing
        1. Test statistics
        2. Calculating p-values
    4. References
    5. Summary
  4. Evaluation and Validation
    1. Evaluation
      1. Continuous metrics
      2. Categorical metrics
        1. Individual evaluation metrics for categorical variables
        2. Confusion matrices, AUC, and ROC
    2. Validation
      1. Training and test sets
      2. Holdout set
      3. Cross validation
    3. References
    4. Summary
  5. Regression
    1. Understanding regression model jargon
    2. Linear regression
      1. Overview of linear regression
      2. Linear regression assumptions and pitfalls
      3. Linear regression example
        1. Profiling the data
        2. Choosing our independent variable
        3. Creating our training and test sets
        4. Training our model
        5. Evaluating the trained model
    3. Multiple linear regression
    4. Nonlinear and other types of regression
    5. References
    6. Summary
  6. Classification
    1. Understanding classification model jargon
    2. Logistic regression
      1. Overview of logistic regression
      2. Logistic regression assumptions and pitfalls
      3. Logistic regression example
        1. Cleaning and profiling the data
        2. Creating our training and test sets
        3. Training and testing the logistic regression model
    3. k-nearest neighbors
      1. Overview of kNN
      2. kNN assumptions and pitfalls
      3. kNN example
    4. Decision trees and random forests
      1. Overview of decision trees and random forests
      2. Decision tree and random forest assumptions and pitfalls
      3. Decision tree example
      4. Random forest example
    5. Naive Bayes
      1. Overview of naive Bayes and its big assumption
      2. Naive Bayes example
    6. References
    7. Summary
  7. Clustering
    1. Understanding clustering model jargon
    2. Measuring distance or similarity
    3. Evaluating clustering techniques
      1. Internal clustering evaluation
      2. External clustering evaluation
    4. k-means clustering
      1. Overview of k-means clustering
      2. k-means assumptions and pitfalls
      3. k-means clustering example
        1. Profiling the data
        2. Generating clusters with k-means
        3. Evaluating the generated clusters
    5. Other clustering techniques
    6. References
    7. Summary
  8. Time Series and Anomaly Detection
    1. Representing time series data in Go
    2. Understanding time series jargon
    3. Statistics related to time series
      1. Autocorrelation
      2. Partial autocorrelation
    4. Auto-regressive models for forecasting
      1. Auto-regressive model overview
      2. Auto-regressive model assumptions and pitfalls
      3. Auto-regressive model example
        1. Transforming to a stationary series
        2. Analyzing the ACF and choosing an AR order
        3. Fitting and evaluating an AR(2) model
    5. Auto-regressive moving averages and other time series models
    6. Anomaly detection
    7. References
    8. Summary
  9. Neural Networks and Deep Learning
    1. Understanding neural net jargon
    2. Building a simple neural network
      1. Nodes in the network
      2. Network architecture
      3. Why do we expect this architecture to work?
      4. Training our neural network
    3. Utilizing the simple neural network
      1. Training the neural network on real data
      2. Evaluating the neural network
    4. Introducing deep learning
      1. What is a deep learning model?
      2. Deep learning with Go
        1. Setting up TensorFlow for use with Go
        2. Retrieving and calling a pretrained TensorFlow model
        3. Object detection using TensorFlow from Go
    5. References
    6. Summary
  10. Deploying and Distributing Analyses and Models
    1. Running models reliably on remote machines
      1. A brief introduction to Docker and Docker jargon
      2. Docker-izing a machine learning application
        1. Docker-izing the model training and export
        2. Docker-izing model predictions
        3. Testing the Docker images locally
        4. Running the Docker images on remote machines
    2. Building a scalable and reproducible machine learning pipeline
      1. Setting up a Pachyderm and Kubernetes cluster
      2. Building a Pachyderm machine learning pipeline
        1. Creating and filling the input repositories
        2. Creating and running the processing stages
      3. Updating pipelines and examining provenance
      4. Scaling pipeline stages
    3. References
    4. Summary
  11. Algorithms/Techniques Related to Machine Learning
    1. Gradient descent
    2. Entropy, information gain, and related methods
    3. Backpropagation

Product information

  • Title: Machine Learning With Go
  • Author(s): Daniel Whitenack
  • Release date: September 2017
  • Publisher(s): Packt Publishing
  • ISBN: 9781785882104