Scala for Machine Learning

Book description

Leverage Scala and Machine Learning to construct and study systems that can learn from data

In Detail

The discovery of information through data clustering and classification is becoming a key differentiator for competitive organizations. Machine learning applications are everywhere, from self-driving cars and engineering design to biometrics, trading strategies, and the detection of genetic anomalies.

The book begins with an introduction to the functional capabilities of the Scala programming language, such as dependency injection and implicits, that are critical to building machine learning algorithms.
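
To give a flavor of why implicits matter for this material, here is a minimal, book-independent sketch: a distance metric exposed as a type class and injected implicitly into a generic routine. The names Distance, euclidean, and nearest are illustrative and are not taken from the book's source code.

    // A distance metric as a type class (illustrative, not from the book)
    trait Distance[T] {
      def apply(x: T, y: T): Double
    }

    object Distance {
      // Euclidean distance on dense vectors, made available implicitly
      // (uses Scala 2.12+ SAM conversion for the single abstract method)
      implicit val euclidean: Distance[Array[Double]] =
        (x: Array[Double], y: Array[Double]) =>
          math.sqrt(x.zip(y).map { case (a, b) => (a - b) * (a - b) }.sum)
    }

    // A generic routine receives the metric through an implicit parameter,
    // so callers inject their own distance without modifying the algorithm.
    def nearest[T](target: T, candidates: Seq[T])(implicit d: Distance[T]): T =
      candidates.minBy(d(target, _))

A call such as nearest(observation, centroids) resolves the metric at compile time; this is the style of implicit-based dependency injection the description alludes to.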

Next, you'll learn about data preprocessing and filtering techniques. Following this, you'll move on to clustering and dimension reduction, Naïve Bayes, regression models, sequential data, regularization and kernelization, support vector machines, neural networks, genetic algorithms, and reinforcement learning. A review of the Akka framework and Apache Spark clusters concludes the book.

What You Will Learn

  • Build dynamic workflows for scientific computing
  • Leverage open source libraries to extract patterns from time series
  • Write your own classification, clustering, or evolutionary algorithm
  • Perform relative performance tuning and evaluation of Spark
  • Master probabilistic models for sequential data
  • Experiment with advanced techniques such as regularization and kernelization
  • Solve big data problems with Scala parallel collections, Akka actors, and Apache Spark clusters
  • Apply key learning strategies to a technical analysis of financial markets

Table of contents

  1. Scala for Machine Learning
    1. Table of Contents
    2. Scala for Machine Learning
    3. Credits
    4. About the Author
    5. About the Reviewers
    6. www.PacktPub.com
      1. Support files, eBooks, discount offers, and more
        1. Why subscribe?
        2. Free access for Packt account holders
    7. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    8. 1. Getting Started
      1. Mathematical notation for the curious
      2. Why machine learning?
        1. Classification
        2. Prediction
        3. Optimization
        4. Regression
      3. Why Scala?
        1. Abstraction
          1. Higher-kind projection
          2. Covariant functors for vectors
          3. Contravariant functors for co-vectors
          4. Monads
        2. Scalability
        3. Configurability
        4. Maintainability
        5. Computation on demand
      4. Model categorization
      5. Taxonomy of machine learning algorithms
        1. Unsupervised learning
          1. Clustering
          2. Dimension reduction
        2. Supervised learning
          1. Generative models
          2. Discriminative models
        3. Semi-supervised learning
        4. Reinforcement learning
      6. Don't reinvent the wheel!
      7. Tools and frameworks
        1. Java
        2. Scala
        3. Apache Commons Math
          1. Description
          2. Licensing
          3. Installation
        4. JFreeChart
          1. Description
          2. Licensing
          3. Installation
        5. Other libraries and frameworks
      8. Source code
        1. Context versus view bounds
        2. Presentation
        3. Primitives and implicits
          1. Primitive types
          2. Type conversions
        4. Immutability
        5. Performance of Scala iterators
      9. Let's kick the tires
        1. An overview of computational workflows
        2. Writing a simple workflow
          1. Step 1 – scoping the problem
          2. Step 2 – loading data
          3. Step 3 – preprocessing the data
            1. Immutable normalization
          4. Step 4 – discovering patterns
            1. Analyzing data
            2. Plotting data
          5. Step 5 – implementing the classifier
            1. Selecting an optimizer
            2. Training the model
            3. Classifying observations
          6. Step 6 – evaluating the model
      10. Summary
    9. 2. Hello World!
      1. Modeling
        1. A model by any other name
        2. Model versus design
        3. Selecting features
        4. Extracting features
      2. Defining a methodology
      3. Monadic data transformation
        1. Error handling
        2. Explicit models
        3. Implicit models
      4. A workflow computational model
        1. Supporting mathematical abstractions
          1. Step 1 – variable declaration
          2. Step 2 – model definition
          3. Step 3 – instantiation
        2. Composing mixins to build a workflow
          1. Understanding the problem
          2. Defining modules
          3. Instantiating the workflow
        3. Modularization
      5. Profiling data
        1. Immutable statistics
        2. Z-Score and Gauss
      6. Assessing a model
        1. Validation
          1. Key quality metrics
          2. F-score for binomial classification
          3. F-score for multinomial classification
        2. Cross-validation
          1. One-fold cross validation
          2. K-fold cross validation
        3. Bias-variance decomposition
        4. Overfitting
      7. Summary
    10. 3. Data Preprocessing
      1. Time series in Scala
        1. Types and operations
        2. The magnet pattern
          1. The transpose operator
          2. The differential operator
        3. Lazy views
      2. Moving averages
        1. The simple moving average
        2. The weighted moving average
        3. The exponential moving average
      3. Fourier analysis
        1. Discrete Fourier transform
        2. DFT-based filtering
        3. Detection of market cycles
      4. The discrete Kalman filter
        1. The state space estimation
          1. The transition equation
          2. The measurement equation
        2. The recursive algorithm
          1. Prediction
          2. Correction
          3. Kalman smoothing
          4. Fixed lag smoothing
          5. Experimentation
          6. Benefits and drawbacks
      5. Alternative preprocessing techniques
      6. Summary
    11. 4. Unsupervised Learning
      1. Clustering
        1. K-means clustering
          1. Measuring similarity
          2. Defining the algorithm
          3. Step 1 – cluster configuration
            1. Defining clusters
            2. Initializing clusters
          4. Step 2 – cluster assignment
          5. Step 3 – reconstruction/error minimization
            1. Creating K-means components
            2. Tail recursive implementation
            3. Iterative implementation
          6. Step 4 – classification
          7. The curse of dimensionality
          8. Setting up the evaluation
          9. Evaluating the results
          10. Tuning the number of clusters
          11. Validation
        2. The expectation-maximization algorithm
          1. Gaussian mixture models
          2. Overview of EM
          3. Implementation
          4. Classification
          5. Testing
          6. The online EM algorithm
      2. Dimension reduction
        1. Principal components analysis
          1. Algorithm
          2. Implementation
          3. Test case
          4. Evaluation
        2. Non-linear models
          1. Kernel PCA
          2. Manifolds
      3. Performance considerations
        1. K-means
        2. EM
        3. PCA
      4. Summary
    12. 5. Naïve Bayes Classifiers
      1. Probabilistic graphical models
      2. Naïve Bayes classifiers
        1. Introducing the multinomial Naïve Bayes
          1. Formalism
          2. The frequentist perspective
          3. The predictive model
          4. The zero-frequency problem
        2. Implementation
          1. Design
          2. Training
            1. Class likelihood
            2. Binomial model
            3. The multinomial model
            4. Classifier components
          3. Classification
          4. F1 validation
          5. Feature extraction
          6. Testing
      3. The Multivariate Bernoulli classification
        1. Model
        2. Implementation
      4. Naïve Bayes and text mining
        1. Basics of information retrieval
        2. Implementation
          1. Analyzing documents
          2. Extracting the frequency of relative terms
          3. Generating the features
        3. Testing
          1. Retrieving the textual information
          2. Evaluating the text mining classifier
      5. Pros and cons
      6. Summary
    13. 6. Regression and Regularization
      1. Linear regression
        1. One-variate linear regression
          1. Implementation
          2. Test case
        2. Ordinary least squares regression
          1. Design
          2. Implementation
          3. Test case 1 – trending
          4. Test case 2 – feature selection
      2. Regularization
        1. Ln roughness penalty
        2. Ridge regression
          1. Design
          2. Implementation
          3. Test case
      3. Numerical optimization
      4. Logistic regression
        1. Logistic function
        2. Binomial classification
        3. Design
        4. The training workflow
          1. Step 1 – configuring the optimizer
          2. Step 2 – computing the Jacobian matrix
          3. Step 3 – managing the convergence of the optimizer
          4. Step 4 – defining the least squares problem
          5. Step 5 – minimizing the sum of square errors
          6. Test
        5. Classification
      5. Summary
    14. 7. Sequential Data Models
      1. Markov decision processes
        1. The Markov property
        2. The first order discrete Markov chain
      2. The hidden Markov model
        1. Notations
        2. The lambda model
        3. Design
        4. Evaluation – CF-1
          1. Alpha – the forward pass
          2. Beta – the backward pass
        5. Training – CF-2
          1. The Baum-Welch estimator (EM)
        6. Decoding – CF-3
          1. The Viterbi algorithm
        7. Putting it all together
        8. Test case 1 – training
        9. Test case 2 – evaluation
        10. HMM as a filtering technique
      3. Conditional random fields
        1. Introduction to CRF
        2. Linear chain CRF
      4. Regularized CRFs and text analytics
        1. The feature functions model
        2. Design
        3. Implementation
          1. Configuring the CRF classifier
          2. Training the CRF model
          3. Applying the CRF model
        4. Tests
          1. The training convergence profile
          2. Impact of the size of the training set
          3. Impact of the L2 regularization factor
      5. Comparing CRF and HMM
      6. Performance consideration
      7. Summary
    15. 8. Kernel Models and Support Vector Machines
      1. Kernel functions
        1. An overview
        2. Common discriminative kernels
        3. Kernel monadic composition
      2. Support vector machines
        1. The linear SVM
          1. The separable case – the hard margin
          2. The nonseparable case – the soft margin
        2. The nonlinear SVM
          1. Max-margin classification
          2. The kernel trick
      3. Support vector classifiers – SVC
        1. The binary SVC
          1. LIBSVM
          2. Design
          3. Configuration parameters
            1. The SVM formulation
            2. The SVM kernel function
            3. The SVM execution
          4. Interface to LIBSVM
          5. Training
          6. Classification
          7. C-penalty and margin
          8. Kernel evaluation
          9. Applications in risk analysis
      4. Anomaly detection with one-class SVC
      5. Support vector regression
        1. An overview
        2. SVR versus linear regression
      6. Performance considerations
      7. Summary
    16. 9. Artificial Neural Networks
      1. Feed-forward neural networks
        1. The biological background
        2. Mathematical background
      2. The multilayer perceptron
        1. The activation function
        2. The network topology
        3. Design
        4. Configuration
        5. Network components
          1. The network topology
          2. Input and hidden layers
          3. The output layer
          4. Synapses
          5. Connections
          6. The initialization weights
        6. The model
        7. Problem types (modes)
        8. Online training versus batch training
        9. The training epoch
          1. Step 1 – input forward propagation
            1. The computational flow
            2. Error functions
            3. Operating modes
            4. Softmax
          2. Step 2 – error backpropagation
            1. Weights' adjustment
            2. The error propagation
            3. The computational model
          3. Step 3 – exit condition
          4. Putting it all together
        10. Training and classification
          1. Regularization
          2. The model generation
          3. The Fast Fisher-Yates shuffle
          4. Prediction
          5. Model fitness
      3. Evaluation
        1. The execution profile
        2. Impact of the learning rate
        3. The impact of the momentum factor
        4. The impact of the number of hidden layers
        5. Test case
          1. Implementation
          2. Evaluation of models
          3. Impact of the hidden layers' architecture
      4. Convolution neural networks
        1. Local receptive fields
        2. Sharing of weights
        3. Convolution layers
        4. Subsampling layers
        5. Putting it all together
      5. Benefits and limitations
      6. Summary
    17. 10. Genetic Algorithms
      1. Evolution
        1. The origin
        2. NP problems
        3. Evolutionary computing
      2. Genetic algorithms and machine learning
      3. Genetic algorithm components
        1. Encoding
          1. Value encoding
          2. Predicate encoding
          3. Solution encoding
          4. The encoding scheme
            1. Flat encoding
            2. Hierarchical encoding
        2. Genetic operators
          1. Selection
          2. Crossover
          3. Mutation
        3. The fitness score
      4. Implementation
        1. Software design
        2. Key components
          1. Population
          2. Chromosomes
          3. Genes
        3. Selection
        4. Controlling the population growth
        5. The GA configuration
        6. Crossover
          1. Population
          2. Chromosomes
          3. Genes
        7. Mutation
          1. Population
          2. Chromosomes
          3. Genes
        8. Reproduction
        9. Solver
      5. GA for trading strategies
        1. Definition of trading strategies
          1. Trading operators
          2. The cost function
          3. Trading signals
          4. Trading strategies
          5. Trading signal encoding
        2. A test case
          1. Creating trading strategies
          2. Configuring the optimizer
          3. Finding the best trading strategy
          4. Tests
            1. The weighted score
            2. The unweighted score
      6. Advantages and risks of genetic algorithms
      7. Summary
    18. 11. Reinforcement Learning
      1. Reinforcement learning
        1. The problem
        2. A solution – Q-learning
          1. Terminology
          2. Concepts
          3. Value of a policy
          4. The Bellman optimality equations
          5. Temporal difference for model-free learning
          6. Action-value iterative update
        3. Implementation
          1. Software design
          2. The states and actions
          3. The search space
          4. The policy and action-value
          5. The Q-learning components
          6. The Q-learning training
          7. Tail recursion to the rescue
          8. The validation
          9. The prediction
        4. Option trading using Q-learning
          1. The OptionProperty class
          2. The OptionModel class
          3. Quantization
        5. Putting it all together
        6. Evaluation
        7. Pros and cons of reinforcement learning
      2. Learning classifier systems
        1. Introduction to LCS
        2. Why LCS?
        3. Terminology
        4. Extended learning classifier systems
        5. XCS components
          1. Application to portfolio management
          2. The XCS core data
          3. XCS rules
          4. Covering
          5. An implementation example
        6. Benefits and limitations of learning classifier systems
      3. Summary
    19. 12. Scalable Frameworks
      1. An overview
      2. Scala
        1. Object creation
        2. Streams
        3. Parallel collections
          1. Processing a parallel collection
          2. The benchmark framework
          3. Performance evaluation
      3. Scalability with Actors
        1. The Actor model
        2. Partitioning
        3. Beyond actors – reactive programming
      4. Akka
        1. Master-workers
          1. Exchange of messages
          2. Worker actors
          3. The workflow controller
          4. The master actor
          5. Master with routing
          6. Distributed discrete Fourier transform
          7. Limitations
        2. Futures
          1. The Actor life cycle
          2. Blocking on futures
          3. Handling future callbacks
          4. Putting it all together
      5. Apache Spark
        1. Why Spark?
        2. Design principles
          1. In-memory persistency
          2. Laziness
          3. Transforms and actions
          4. Shared variables
        3. Experimenting with Spark
          1. Deploying Spark
          2. Using Spark shell
          3. MLlib
          4. RDD generation
          5. K-means using Spark
        4. Performance evaluation
          1. Tuning parameters
          2. Tests
          3. Performance considerations
        5. Pros and cons
        6. 0xdata Sparkling Water
      6. Summary
    20. A. Basic Concepts
      1. Scala programming
        1. List of libraries and tools
        2. Code snippets format
        3. Best practices
          1. Encapsulation
          2. Class constructor template
          3. Companion objects versus case classes
          4. Enumerations versus case classes
          5. Overloading
          6. Design template for immutable classifiers
        4. Utility classes
          1. Data extraction
          2. Data sources
          3. Extraction of documents
          4. DMatrix class
          5. Counter
          6. Monitor
      2. Mathematics
        1. Linear algebra
          1. QR decomposition
          2. LU factorization
          3. LDL decomposition
          4. Cholesky factorization
          5. Singular Value Decomposition
          6. Eigenvalue decomposition
          7. Algebraic and numerical libraries
        2. First order predicate logic
        3. Jacobian and Hessian matrices
        4. Summary of optimization techniques
          1. Gradient descent methods
            1. Steepest descent
            2. Conjugate gradient
            3. Stochastic gradient descent
          2. Quasi-Newton algorithms
            1. BFGS
            2. L-BFGS
          3. Nonlinear least squares minimization
            1. Gauss-Newton
            2. Levenberg-Marquardt
          4. Lagrange multipliers
        5. Overview of dynamic programming
      3. Finances 101
        1. Fundamental analysis
        2. Technical analysis
          1. Terminology
          2. Trading data
          3. Trading signals and strategy
          4. Price patterns
        3. Options trading
        4. Financial data sources
      4. Suggested online courses
      5. References
    21. Index

Product information

  • Title: Scala for Machine Learning
  • Author(s): Patrick R. Nicolas
  • Release date: December 2014
  • Publisher(s): Packt Publishing
  • ISBN: 9781783558742