R: Data Analysis and Visualization

Book description

Master the art of building analytical models using R

About This Book

  • Load, wrangle, and analyze your data using the world's most powerful statistical programming language
  • Build and customize publication-quality visualizations with powerful and stunning R graphs
  • Develop key skills and techniques with R to create and customize data mining algorithms
  • Use R to optimize your trading strategy and build up your own risk management system
  • Discover how to build machine learning algorithms, prepare data, and dig deep into data prediction techniques with R

Who This Book Is For

This course is for data scientists and quantitative analysts who want to learn R and take advantage of its powerful analytical design framework. It's a seamless journey to becoming a full-stack R developer.

What You Will Learn

  • Describe and visualize the behavior of data and relationships between data
  • Gain a thorough understanding of statistical reasoning and sampling
  • Handle missing data gracefully using multiple imputation
  • Create diverse types of bar charts using the default R functions
  • Familiarize yourself with algorithms written in R for spatial data mining, text mining, and so on
  • Understand relationships between market factors and their impact on your portfolio
  • Harness the power of R to build machine learning algorithms with real-world data science applications
  • Learn specialized machine learning techniques for text mining, big data, and more

In Detail

The R learning path created for you has five connected modules, each of which is a mini-course in its own right. As you complete each one, you'll have gained key skills and be ready for the material in the next module!

This course begins by looking at the Data Analysis with R module. This will help you navigate the R environment. You'll gain a thorough understanding of statistical reasoning and sampling. Finally, you'll be able to put best practices into effect to make your job easier and facilitate reproducibility.

The second place to explore is R Graphs, which will help you leverage powerful default R graphics and utilize advanced graphics systems such as lattice and ggplot2, the grammar of graphics. You'll learn how to produce, customize, and publish advanced visualizations using this popular and powerful framework.

With the third module, Learning Data Mining with R, you will learn how to manipulate data with R using code snippets and be introduced to mining frequent patterns, association, and correlations while working with R programs.

The Mastering R for Quantitative Finance module pragmatically introduces both the quantitative finance concepts and their modeling in R, enabling you to build a tailor-made trading system on your own. By the end of the module, you will be well versed in various financial techniques using R and will be able to place good bets while making financial decisions.

Finally, we'll look at the Machine Learning with R module. With this module, you'll discover all the analytical tools you need to gain insights from complex data and learn how to choose the correct algorithm for your specific needs. You'll also learn to apply machine learning methods to deal with common tasks, including classification, prediction, forecasting, and so on.

Style and approach

Learn data analysis, data visualization techniques, data mining, and machine learning, all using R, and learn to build models in quantitative finance with this powerful language.

Table of contents

  1. R: Data Analysis and Visualization
    1. Table of Contents
    2. R: Data Analysis and Visualization
      1. Meet Your Course Guide
      2. Course Structure
      3. Course journey
      4. The Course Roadmap and Timeline
    3. I. Module 1: Data Analysis with R
      1. 1. RefresheR
        1. Navigating the basics
          1. Arithmetic and assignment
          2. Logicals and characters
          3. Flow of control
        2. Getting help in R
        3. Vectors
          1. Subsetting
          2. Vectorized functions
          3. Advanced subsetting
          4. Recycling
        4. Functions
        5. Matrices
        6. Loading data into R
        7. Working with packages
      2. 2. The Shape of Data
        1. Univariate data
        2. Frequency distributions
        3. Central tendency
        4. Spread
        5. Populations, samples, and estimation
        6. Probability distributions
        7. Visualization methods
      3. 3. Describing Relationships
        1. Multivariate data
        2. Relationships between a categorical and a continuous variable
        3. Relationships between two categorical variables
        4. The relationship between two continuous variables
          1. Covariance
          2. Correlation coefficients
          3. Comparing multiple correlations
        5. Visualization methods
          1. Categorical and continuous variables
          2. Two categorical variables
          3. Two continuous variables
          4. More than two continuous variables
      4. 4. Probability
        1. Basic probability
        2. A tale of two interpretations
        3. Sampling from distributions
          1. Parameters
          2. The binomial distribution
        4. The normal distribution
          1. The three-sigma rule and using z-tables
      5. 5. Using Data to Reason About the World
        1. Estimating means
        2. The sampling distribution
        3. Interval estimation
          1. How did we get 1.96?
        4. Smaller samples
      6. 6. Testing Hypotheses
        1. Null Hypothesis Significance Testing
          1. One and two-tailed tests
          2. When things go wrong
          3. A warning about significance
          4. A warning about p-values
        2. Testing the mean of one sample
          1. Assumptions of the one sample t-test
        3. Testing two means
          1. Don't be fooled!
          2. Assumptions of the independent samples t-test
        4. Testing more than two means
          1. Assumptions of ANOVA
        5. Testing independence of proportions
        6. What if my assumptions are unfounded?
      7. 7. Bayesian Methods
        1. The big idea behind Bayesian analysis
        2. Choosing a prior
        3. Who cares about coin flips
        4. Enter MCMC – stage left
        5. Using JAGS and runjags
        6. Fitting distributions the Bayesian way
        7. The Bayesian independent samples t-test
      8. 8. Predicting Continuous Variables
        1. Linear models
        2. Simple linear regression
        3. Simple linear regression with a binary predictor
          1. A word of warning
        4. Multiple regression
        5. Regression with a non-binary predictor
        6. Kitchen sink regression
        7. The bias-variance trade-off
          1. Cross-validation
          2. Striking a balance
        8. Linear regression diagnostics
          1. Second Anscombe relationship
          2. Third Anscombe relationship
          3. Fourth Anscombe relationship
        9. Advanced topics
      9. 9. Predicting Categorical Variables
        1. k-Nearest Neighbors
          1. Using k-NN in R
            1. Confusion matrices
            2. Limitations of k-NN
        2. Logistic regression
          1. Using logistic regression in R
        3. Decision trees
        4. Random forests
        5. Choosing a classifier
          1. The vertical decision boundary
          2. The diagonal decision boundary
          3. The crescent decision boundary
          4. The circular decision boundary
      10. 10. Sources of Data
        1. Relational Databases
          1. Why didn't we just do that in SQL?
        2. Using JSON
        3. XML
        4. Other data formats
        5. Online repositories
      11. 11. Dealing with Messy Data
        1. Analysis with missing data
          1. Visualizing missing data
          2. Types of missing data
            1. So which one is it?
          3. Unsophisticated methods for dealing with missing data
            1. Complete case analysis
            2. Pairwise deletion
            3. Mean substitution
            4. Hot deck imputation
            5. Regression imputation
            6. Stochastic regression imputation
          4. Multiple imputation
            1. So how does mice come up with the imputed values?
              1. Methods of imputation
          5. Multiple imputation in practice
        2. Analysis with unsanitized data
          1. Checking for out-of-bounds data
          2. Checking the data type of a column
          3. Checking for unexpected categories
          4. Checking for outliers, entry errors, or unlikely data points
          5. Chaining assertions
        3. Other messiness
          1. OpenRefine
          2. Regular expressions
          3. tidyr
      12. 12. Dealing with Large Data
        1. Wait to optimize
        2. Using a bigger and faster machine
        3. Be smart about your code
          1. Allocation of memory
          2. Vectorization
        4. Using optimized packages
        5. Using another R implementation
        6. Use parallelization
          1. Getting started with parallel R
          2. An example of (some) substance
        7. Using Rcpp
        8. Be smarter about your code
      13. 13. Reproducibility and Best Practices
        1. R Scripting
          1. RStudio
          2. Running R scripts
          3. An example script
          4. Scripting and reproducibility
        2. R projects
        3. Version control
        4. Communicating results
    4. II. Module 2: R Graphs
      1. 1. R Graphics
        1. Base graphics using the default package
        2. Trellis graphs using lattice
        3. Graphs inspired by Grammar of Graphics
      2. 2. Basic Graph Functions
        1. Introduction
        2. Creating basic scatter plots
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
            1. A note on R's built-in datasets
          5. See also
        3. Creating line graphs
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        4. Creating bar charts
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        5. Creating histograms and density plots
          1. How to do it...
          2. How it works...
          3. There's more...
          4. See also
        6. Creating box plots
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        7. Adjusting x and y axes' limits
          1. How to do it...
          2. How it works...
          3. There's more...
          4. See also
        8. Creating heat maps
          1. How to do it...
          2. How it works...
          3. There's more...
          4. See also
        9. Creating pairs plots
          1. How to do it...
          2. How it works...
          3. There's more...
          4. See also
        10. Creating multiple plot matrix layouts
          1. How to do it...
          2. How it works...
          3. There's more...
          4. See also
        11. Adding and formatting legends
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        12. Creating graphs with maps
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        13. Saving and exporting graphs
          1. How to do it...
          2. How it works...
          3. There's more...
          4. See also
      3. 3. Beyond the Basics – Adjusting Key Parameters
        1. Introduction
        2. Setting colors of points, lines, and bars
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        3. Setting plot background colors
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        4. Setting colors for text elements – axis annotations, labels, plot titles, and legends
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        5. Choosing color combinations and palettes
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        6. Setting fonts for annotations and titles
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        7. Choosing plotting point symbol styles and sizes
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        8. Choosing line styles and width
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. See also
        9. Choosing box styles
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        10. Adjusting axis annotations and tick marks
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        11. Formatting log axes
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        12. Setting graph margins and dimensions
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. See also
      4. 4. Creating Scatter Plots
        1. Introduction
        2. Grouping data points within a scatter plot
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        3. Highlighting grouped data points by size and symbol type
          1. Getting ready
          2. How to do it...
          3. How it works...
        4. Labeling data points
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        5. Correlation matrix using pairs plots
          1. Getting ready
          2. How to do it...
          3. How it works...
        6. Adding error bars
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        7. Using jitter to distinguish closely packed data points
          1. Getting ready
          2. How to do it...
          3. How it works...
        8. Adding linear model lines
          1. Getting ready
          2. How to do it...
          3. How it works...
        9. Adding nonlinear model curves
          1. Getting ready
          2. How to do it...
          3. How it works...
        10. Adding nonparametric model curves with lowess
          1. Getting ready
          2. How to do it...
          3. How it works...
        11. Creating three-dimensional scatter plots
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        12. Creating Quantile-Quantile plots
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        13. Displaying the data density on axes
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        14. Creating scatter plots with a smoothed density representation
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
      5. 5. Creating Line Graphs and Time Series Charts
        1. Introduction
        2. Adding customized legends for multiple-line graphs
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        3. Using margin labels instead of legends for multiple-line graphs
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        4. Adding horizontal and vertical grid lines
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        5. Adding marker lines at specific x and y values using abline
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        6. Creating sparklines
          1. Getting ready
          2. How to do it...
          3. How it works...
        7. Plotting functions of a variable in a dataset
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        8. Formatting time series data for plotting
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        9. Plotting the date or time variable on the x axis
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        10. Annotating axis labels in different human-readable time formats
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        11. Adding vertical markers to indicate specific time events
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        12. Plotting data with varying time-averaging periods
          1. Getting ready
          2. How to do it...
          3. How it works...
        13. Creating stock charts
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
      6. 6. Creating Bar, Dot, and Pie Charts
        1. Introduction
        2. Creating bar charts with more than one factor variable
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. See also
        3. Creating stacked bar charts
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        4. Adjusting the orientation of bars – horizontal and vertical
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        5. Adjusting bar widths, spacing, colors, and borders
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        6. Displaying values on top of or next to the bars
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
          5. See also
        7. Placing labels inside bars
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        8. Creating bar charts with vertical error bars
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
        9. Modifying dot charts by grouping variables
          1. Getting ready
          2. How to do it...
          3. How it works...
        10. Making better, readable pie charts with clockwise-ordered slices
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. See also
        11. Labeling a pie chart with percentage values for each slice
          1. Getting ready
          2. How it works...
          3. There's more...
          4. See also
        12. Adding a legend to a pie chart
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more...
      7. 7. Creating Histograms
        1. Introduction
        2. Visualizing distributions as count frequencies or probability densities
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        3. Setting the bin size and the number of breaks
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        4. Adjusting histogram styles – bar colors, borders, and axes
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        5. Overlaying a density line over a histogram
          1. Getting ready
          2. How to do it...
          3. How it works...
        6. Multiple histograms along the diagonal of a pairs plot
          1. Getting ready
          2. How to do it...
          3. How it works...
        7. Histograms in the margins of line and scatter plots
          1. Getting ready
          2. How to do it...
          3. How it works...
      8. 8. Box and Whisker Plots
        1. Introduction
        2. Creating box plots with narrow boxes for a small number of variables
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
          5. See also
        3. Grouping over a variable
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
          5. See also
        4. Varying box widths by the number of observations
          1. Getting ready
          2. How to do it...
          3. How it works...
        5. Creating box plots with notches
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        6. Including or excluding outliers
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. See also
        7. Creating horizontal box plots
          1. Getting ready
          2. How to do it...
          3. How it works...
        8. Changing the box styling
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        9. Adjusting the extent of plot whiskers outside the box
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        10. Showing the number of observations
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        11. Splitting a variable at arbitrary values into subsets
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
      9. 9. Creating Heat Maps and Contour Plots
        1. Introduction
        2. Creating heat maps of a single Z variable with a scale
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
          5. See also
        3. Creating correlation heat maps
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        4. Summarizing multivariate data in a single heat map
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        5. Creating contour plots
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
          5. See also
        6. Creating filled contour plots
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
          5. See also
        7. Creating three-dimensional surface plots
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        8. Visualizing time series as calendar heat maps
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
      10. 10. Creating Maps
        1. Introduction
        2. Plotting global data by countries on a world map
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
          5. See also
        3. Creating graphs with regional maps
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        4. Plotting data on Google maps
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
          5. See also
        5. Creating and reading KML data
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. See also
        6. Working with ESRI shapefiles
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
      11. 11. Data Visualization Using Lattice
        1. Introduction
        2. Creating bar charts
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
          5. See also
        3. Creating stacked bar charts
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
          5. See also
        4. Creating bar charts to visualize cross-tabulation
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
        5. Creating a conditional histogram
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
          5. See also
        6. Visualizing distributions through a kernel-density plot
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
        7. Creating a normal Q-Q plot
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
        8. Visualizing an empirical Cumulative Distribution Function
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
        9. Creating a boxplot
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
        10. Creating a conditional scatter plot
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
      12. 12. Data Visualization Using ggplot2
        1. Introduction
        2. Creating bar charts
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
          5. See also
        3. Creating multiple bar charts
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
          5. See also
        4. Creating a bar chart with error bars
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
        5. Visualizing the density of a numeric variable
          1. Getting ready
          2. How to do it...
          3. How it works…
          4. There's more...
        6. Creating a box plot
          1. Getting ready
          2. How to do it...
          3. How it works…
        7. Creating a layered plot with a scatter plot and fitted line
          1. Getting ready
          2. How to do it...
          3. How it works…
          4. There's more...
        8. Creating a line chart
          1. Getting ready
          2. How to do it...
          3. How it works…
          4. There's more...
        9. Graph annotation with ggplot
          1. Getting ready
          2. How to do it...
          3. How it works...
      13. 13. Inspecting Large Datasets
        1. Introduction
        2. Multivariate continuous data visualization
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
          5. See also
        3. Multivariate categorical data visualization
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
        4. Visualizing mixed data
          1. Getting ready
          2. How to do it…
        5. Zooming and filtering
          1. Getting ready
          2. How to do it...
          3. How it works…
          4. There's more...
      14. 14. Three-dimensional Visualizations
        1. Introduction
        2. Three-dimensional scatter plots
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
          5. See also...
        3. Three-dimensional scatter plots with a regression plane
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
        4. Three-dimensional bar charts
          1. Getting ready
          2. How to do it…
          3. How it works…
        5. Three-dimensional density plots
          1. Getting ready
          2. How to do it...
          3. How it works…
      15. 15. Finalizing Graphs for Publications and Presentations
        1. Introduction
        2. Exporting graphs in high-resolution image formats – PNG, JPEG, BMP, and TIFF
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
          5. See also
        3. Exporting graphs in vector formats – SVG, PDF, and PS
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        4. Adding mathematical and scientific notations (typesetting)
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        5. Adding text descriptions to graphs
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        6. Using graph templates
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
        7. Choosing font families and styles under Windows, Mac OS X, and Linux
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
          5. See also
        8. Choosing fonts for PostScripts and PDFs
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more
    5. III. Module 3: Learning Data Mining with R
      1. 1. Warming Up
        1. Big data
          1. Scalability and efficiency
        2. Data source
        3. Data mining
          1. Feature extraction
          2. Summarization
          3. The data mining process
            1. CRISP-DM
            2. SEMMA
        4. Social network mining
          1. Social network
        5. Text mining
          1. Information retrieval and text mining
          2. Mining text for prediction
        6. Web data mining
        7. Why R?
          1. What are the disadvantages of R?
        8. Statistics
          1. Statistics and data mining
          2. Statistics and machine learning
          3. Statistics and R
          4. The limitations of statistics on data mining
        9. Machine learning
          1. Approaches to machine learning
          2. Machine learning architecture
        10. Data attributes and description
          1. Numeric attributes
          2. Categorical attributes
          3. Data description
          4. Data measuring
        11. Data cleaning
          1. Missing values
          2. Junk, noisy data, or outlier
        12. Data integration
        13. Data dimension reduction
          1. Eigenvalues and Eigenvectors
          2. Principal-Component Analysis
          3. Singular-value decomposition
          4. CUR decomposition
        14. Data transformation and discretization
          1. Data transformation
          2. Normalization data transformation methods
          3. Data discretization
        15. Visualization of results
          1. Visualization with R
      2. 2. Mining Frequent Patterns, Associations, and Correlations
        1. An overview of associations and patterns
          1. Patterns and pattern discovery
            1. The frequent itemset
            2. The frequent subsequence
            3. The frequent substructures
          2. Relationship or rules discovery
            1. Association rules
            2. Correlation rules
        2. Market basket analysis
          1. The market basket model
          2. A-Priori algorithms
            1. Input data characteristics and data structure
            2. The A-Priori algorithm
            3. The R implementation
            4. A-Priori algorithm variants
          3. The Eclat algorithm
            1. The R implementation
          4. The FP-growth algorithm
            1. Input data characteristics and data structure
            2. The FP-growth algorithm
            3. The R implementation
          5. The GenMax algorithm with maximal frequent itemsets
            1. The R implementation
          6. The Charm algorithm with closed frequent itemsets
            1. The R implementation
          7. The algorithm to generate association rules
            1. The R implementation
        3. Hybrid association rules mining
          1. Mining multilevel and multidimensional association rules
          2. Constraint-based frequent pattern mining
        4. Mining sequence dataset
          1. Sequence dataset
          2. The GSP algorithm
        5. The R implementation
          1. The SPADE algorithm
            1. The R implementation
          2. Rule generation from sequential patterns
        6. High-performance algorithms
      3. 3. Classification
        1. Classification
        2. Generic decision tree induction
          1. Attribute selection measures
          2. Tree pruning
          3. General algorithm for the decision tree generation
          4. The R implementation
        3. High-value credit card customers classification using ID3
          1. The ID3 algorithm
          2. The R implementation
          3. Web attack detection
          4. High-value credit card customers classification
        4. Web spam detection using C4.5
          1. The C4.5 algorithm
          2. The R implementation
          3. A parallel version with MapReduce
          4. Web spam detection
        5. Web key resource page judgment using CART
          1. The CART algorithm
          2. The R implementation
          3. Web key resource page judgment
        6. Trojan traffic identification method and Bayes classification
          1. Estimating
            1. Prior probability estimation
            2. Likelihood estimation
          2. The Bayes classification
          3. The R implementation
          4. Trojan traffic identification method
        7. Identify spam e-mail and Naïve Bayes classification
          1. The Naïve Bayes classification
          2. The R implementation
          3. Identify spam e-mail
        8. Rule-based classification of player types in computer games and rule-based classification
          1. Transformation from decision tree to decision rules
          2. Rule-based classification
          3. Sequential covering algorithm
          4. The RIPPER algorithm
            1. The R implementation
          5. Rule-based classification of player types in computer games
      4. 4. Advanced Classification
        1. Ensemble (EM) methods
          1. The bagging algorithm
          2. The boosting and AdaBoost algorithms
          3. The Random forests algorithm
          4. The R implementation
          5. Parallel version with MapReduce
        2. Biological traits and the Bayesian belief network
          1. The Bayesian belief network (BBN) algorithm
          2. The R implementation
          3. Biological traits
        3. Protein classification and the k-Nearest Neighbors algorithm
          1. The kNN algorithm
          2. The R implementation
        4. Document retrieval and Support Vector Machine
          1. The SVM algorithm
          2. The R implementation
          3. Parallel version with MapReduce
          4. Document retrieval
        5. Classification using frequent patterns
          1. The associative classification
            1. CBA
          2. Discriminative frequent pattern-based classification
          3. The R implementation
          4. Text classification using sentential frequent itemsets
        6. Classification using the backpropagation algorithm
          1. The BP algorithm
          2. The R implementation
          3. Parallel version with MapReduce
      5. 5. Cluster Analysis
        1. Search engines and the k-means algorithm
          1. The k-means clustering algorithm
          2. The kernel k-means algorithm
          3. The k-modes algorithm
          4. The R implementation
          5. Parallel version with MapReduce
          6. Search engine and web page clustering
        2. Automatic abstraction of document texts and the k-medoids algorithm
          1. The PAM algorithm
          2. The R implementation
          3. Automatic abstraction and summarization of document text
        3. The CLARA algorithm
          1. The CLARA algorithm
          2. The R implementation
        4. CLARANS
          1. The CLARANS algorithm
          2. The R implementation
        5. Unsupervised image categorization and affinity propagation clustering
          1. Affinity propagation clustering
          2. The R implementation
          3. Unsupervised image categorization
          4. The spectral clustering algorithm
          5. The R implementation
        6. News categorization and hierarchical clustering
          1. Agglomerative hierarchical clustering
          2. The BIRCH algorithm
          3. The chameleon algorithm
          4. The Bayesian hierarchical clustering algorithm
          5. The probabilistic hierarchical clustering algorithm
          6. The R implementation
          7. News categorization
      6. 6. Advanced Cluster Analysis
        1. Customer categorization analysis of e-commerce and DBSCAN
          1. The DBSCAN algorithm
          2. Customer categorization analysis of e-commerce
        2. Clustering web pages and OPTICS
          1. The OPTICS algorithm
          2. The R implementation
          3. Clustering web pages
        3. Visitor analysis in the browser cache and DENCLUE
          1. The DENCLUE algorithm
          2. The R implementation
          3. Visitor analysis in the browser cache
        4. Recommendation system and STING
          1. The STING algorithm
          2. The R implementation
          3. Recommendation systems
        5. Web sentiment analysis and CLIQUE
          1. The CLIQUE algorithm
          2. The R implementation
          3. Web sentiment analysis
        6. Opinion mining and WAVE clustering
          1. The WAVE cluster algorithm
          2. The R implementation
          3. Opinion mining
        7. User search intent and the EM algorithm
          1. The EM algorithm
          2. The R implementation
          3. The user search intent
        8. Customer purchase data analysis and clustering high-dimensional data
          1. The MAFIA algorithm
          2. The SURFING algorithm
          3. The R implementation
          4. Customer purchase data analysis
        9. SNS and clustering graph and network data
          1. The SCAN algorithm
          2. The R implementation
          3. Social networking service (SNS)
      7. 7. Outlier Detection
        1. Credit card fraud detection and statistical methods
          1. The likelihood-based outlier detection algorithm
          2. The R implementation
          3. Credit card fraud detection
        2. Activity monitoring – the detection of fraud involving mobile phones and proximity-based methods
          1. The NL algorithm
          2. The FindAllOutsM algorithm
          3. The FindAllOutsD algorithm
          4. The distance-based algorithm
          5. The Dolphin algorithm
          6. The R implementation
          7. Activity monitoring and the detection of mobile fraud
        3. Intrusion detection and density-based methods
          1. The OPTICS-OF algorithm
          2. The High Contrast Subspace algorithm
          3. The R implementation
          4. Intrusion detection
        4. Intrusion detection and clustering-based methods
          1. Hierarchical clustering to detect outliers
          2. The k-means-based algorithm
          3. The ODIN algorithm
          4. The R implementation
        5. Monitoring the performance of the web server and classification-based methods
          1. The OCSVM algorithm
          2. The one-class nearest neighbor algorithm
          3. The R implementation
          4. Monitoring the performance of the web server
        6. Detecting novelty in text, topic detection, and mining contextual outliers
          1. The conditional anomaly detection (CAD) algorithm
          2. The R implementation
          3. Detecting novelty in text and topic detection
        7. Collective outliers on spatial data
          1. The route outlier detection (ROD) algorithm
          2. The R implementation
          3. Characteristics of collective outliers
        8. Outlier detection in high-dimensional data
          1. The brute-force algorithm
          2. The HilOut algorithm
          3. The R implementation
      8. 8. Mining Stream, Time-series, and Sequence Data
        1. The credit card transaction flow and STREAM algorithm
          1. The STREAM algorithm
          2. The single-pass-any-time clustering algorithm
          3. The R implementation
          4. The credit card transaction flow
        2. Predicting future prices and time-series analysis
          1. The ARIMA algorithm
          2. Predicting future prices
        3. Stock market data and time-series clustering and classification
          1. The hError algorithm
          2. Time-series classification with the 1NN classifier
          3. The R implementation
          4. Stock market data
        4. Web click streams and mining symbolic sequences
          1. The TECNO-STREAMS algorithm
          2. The R implementation
          3. Web click streams
        5. Mining sequence patterns in transactional databases
          1. The PrefixSpan algorithm
          2. The R implementation
      9. 9. Graph Mining and Network Analysis
        1. Graph mining
          1. Graph
          2. Graph mining algorithms
        2. Mining frequent subgraph patterns
          1. The gPLS algorithm
          2. The GraphSig algorithm
          3. The gSpan algorithm
          4. Rightmost path extensions and their supports
          5. The subgraph isomorphism enumeration algorithm
          6. The canonical checking algorithm
          7. The R implementation
        3. Social network mining
          1. Community detection and the shingling algorithm
          2. The node classification and iterative classification algorithms
          3. The R implementation
      10. 10. Mining Text and Web Data
        1. Text mining and TM packages
        2. Text summarization
          1. Topic representation
          2. The multidocument summarization algorithm
          3. The Maximal Marginal Relevance algorithm
          4. The R implementation
        3. The question answering system
        4. Genre categorization of web pages
        5. Categorizing newspaper articles and newswires into topics
          1. The N-gram-based text categorization
          2. The R implementation
        6. Web usage mining with web logs
          1. The FCA-based association rule mining algorithm
          2. The R implementation
    6. IV. Module 4: Mastering R for Quantitative Finance
      1. 1. Time Series Analysis
        1. Multivariate time series analysis
          1. Cointegration
          2. Vector autoregressive models
            1. VAR implementation example
          3. Cointegrated VAR and VECM
        2. Volatility modeling
          1. GARCH modeling with the rugarch package
            1. The standard GARCH model
            2. The Exponential GARCH model (EGARCH)
            3. The Threshold GARCH model (TGARCH)
          2. Simulation and forecasting
        3. References and reading list
      2. 2. Factor Models
        1. Arbitrage pricing theory
          1. Implementation of APT
          2. Fama-French three-factor model
        2. Modeling in R
          1. Data selection
          2. Estimation of APT with principal component analysis
          3. Estimation of the Fama-French model
        3. References
      3. 3. Forecasting Volume
        1. Motivation
        2. The intensity of trading
        3. The volume forecasting model
        4. Implementation in R
          1. The data
            1. Loading the data
            2. The seasonal component
            3. AR(1) estimation and forecasting
            4. SETAR estimation and forecasting
            5. Interpreting the results
          2. References
      4. 4. Big Data – Advanced Analytics
        1. Getting data from open sources
        2. Introduction to big data analysis in R
        3. K-means clustering on big data
          1. Loading big matrices
          2. Big data K-means clustering analysis
        4. Big data linear regression analysis
          1. Loading big data
          2. Fitting a linear regression model on large datasets
        5. References
      5. 5. FX Derivatives
        1. Terminology and notations
        2. Currency options
        3. Exchange options
          1. Two-dimensional Wiener processes
          2. The Margrabe formula
          3. Application in R
        4. Quanto options
          1. Pricing formula for a call quanto
          2. Pricing a call quanto in R
        5. References
      6. 6. Interest Rate Derivatives and Models
        1. The Black model
          1. Pricing a cap with Black's model
        2. The Vasicek model
        3. The Cox-Ingersoll-Ross model
        4. Parameter estimation of interest rate models
        5. Using the SMFI5 package
        6. References
      7. 7. Exotic Options
        1. A general pricing approach
        2. The role of dynamic hedging
        3. How R can help a lot
        4. A glance beyond vanillas
        5. Greeks – the link back to the vanilla world
        6. Pricing the Double-no-touch option
        7. Another way to price the Double-no-touch option
        8. The life of a Double-no-touch option – a simulation
        9. Exotic options embedded in structured products
        10. References
      8. 8. Optimal Hedging
        1. Hedging of derivatives
          1. Market risk of derivatives
          2. Static delta hedge
          3. Dynamic delta hedge
          4. Comparing the performance of delta hedging
        2. Hedging in the presence of transaction costs
          1. Optimization of the hedge
          2. Optimal hedging in the case of absolute transaction costs
          3. Optimal hedging in the case of relative transaction costs
        3. Further extensions
        4. References
      9. 9. Fundamental Analysis
        1. The basics of fundamental analysis
        2. Collecting data
        3. Revealing connections
        4. Including multiple variables
        5. Separating investment targets
        6. Setting classification rules
        7. Backtesting
        8. Industry-specific investment
        9. References
      10. 10. Technical Analysis, Neural Networks, and Logoptimal Portfolios
        1. Market efficiency
        2. Technical analysis
          1. The TA toolkit
          2. Markets
          3. Plotting charts - bitcoin
          4. Built-in indicators
            1. SMA and EMA
            2. RSI
            3. MACD
          5. Candle patterns: key reversal
          6. Evaluating the signals and managing the position
          7. A word on money management
          8. Wrapping up
        3. Neural networks
          1. Forecasting bitcoin prices
            1. Evaluation of the strategy
        4. Logoptimal portfolios
          1. A universally consistent, non-parametric investment strategy
          2. Evaluation of the strategy
        5. References
      11. 11. Asset and Liability Management
        1. Data preparation
          1. Data source at first glance
          2. Cash-flow generator functions
          3. Preparing the cash-flow
        2. Interest rate risk measurement
        3. Liquidity risk measurement
        4. Modeling non-maturity deposits
          1. A model of deposit interest rate development
          2. Static replication of non-maturity deposits
        5. References
      12. 12. Capital Adequacy
        1. Principles of the Basel Accords
          1. Basel I
          2. Basel II
            1. Minimum capital requirements
            2. Supervisory review
            3. Transparency
          3. Basel III
        2. Risk measures
          1. Analytical VaR
          2. Historical VaR
          3. Monte-Carlo simulation
        3. Risk categories
          1. Market risk
          2. Credit risk
          3. Operational risk
        4. References
      13. 13. Systemic Risks
        1. Systemic risk in a nutshell
        2. The dataset used in our examples
        3. Core-periphery decomposition
          1. Implementation in R
          2. Results
        4. The simulation method
          1. The simulation
          2. Implementation in R
          3. Results
        5. Possible interpretations and suggestions
        6. References
    7. V. Module 5: Machine Learning with R
      1. 1. Introducing Machine Learning
        1. The origins of machine learning
        2. Uses and abuses of machine learning
          1. Machine learning successes
          2. The limits of machine learning
          3. Machine learning ethics
        3. How machines learn
          1. Data storage
          2. Abstraction
          3. Generalization
          4. Evaluation
        4. Machine learning in practice
          1. Types of input data
          2. Types of machine learning algorithms
          3. Matching input data to algorithms
        5. Machine learning with R
          1. Installing R packages
          2. Loading and unloading R packages
      2. 2. Managing and Understanding Data
        1. R data structures
          1. Vectors
          2. Factors
          3. Lists
          4. Data frames
          5. Matrixes and arrays
        2. Managing data with R
          1. Saving, loading, and removing R data structures
          2. Importing and saving data from CSV files
        3. Exploring and understanding data
          1. Exploring the structure of data
          2. Exploring numeric variables
            1. Measuring the central tendency – mean and median
            2. Measuring spread – quartiles and the five-number summary
            3. Visualizing numeric variables – boxplots
            4. Visualizing numeric variables – histograms
            5. Understanding numeric data – uniform and normal distributions
            6. Measuring spread – variance and standard deviation
          3. Exploring categorical variables
            1. Measuring the central tendency – the mode
          4. Exploring relationships between variables
            1. Visualizing relationships – scatterplots
            2. Examining relationships – two-way cross-tabulations
      3. 3. Lazy Learning – Classification Using Nearest Neighbors
        1. Understanding nearest neighbor classification
          1. The k-NN algorithm
            1. Measuring similarity with distance
            2. Choosing an appropriate k
            3. Preparing data for use with k-NN
          2. Why is the k-NN algorithm lazy?
        2. Example – diagnosing breast cancer with the k-NN algorithm
          1. Step 1 – collecting data
          2. Step 2 – exploring and preparing the data
            1. Transformation – normalizing numeric data
            2. Data preparation – creating training and test datasets
          3. Step 3 – training a model on the data
          4. Step 4 – evaluating model performance
          5. Step 5 – improving model performance
            1. Transformation – z-score standardization
            2. Testing alternative values of k
      4. 4. Probabilistic Learning – Classification Using Naive Bayes
        1. Understanding Naive Bayes
          1. Basic concepts of Bayesian methods
            1. Understanding probability
            2. Understanding joint probability
            3. Computing conditional probability with Bayes' theorem
          2. The Naive Bayes algorithm
            1. Classification with Naive Bayes
            2. The Laplace estimator
            3. Using numeric features with Naive Bayes
        2. Example – filtering mobile phone spam with the Naive Bayes algorithm
          1. Step 1 – collecting data
          2. Step 2 – exploring and preparing the data
            1. Data preparation – cleaning and standardizing text data
            2. Data preparation – splitting text documents into words
            3. Data preparation – creating training and test datasets
            4. Visualizing text data – word clouds
            5. Data preparation – creating indicator features for frequent words
          3. Step 3 – training a model on the data
          4. Step 4 – evaluating model performance
          5. Step 5 – improving model performance
      5. 5. Divide and Conquer – Classification Using Decision Trees and Rules
        1. Understanding decision trees
          1. Divide and conquer
          2. The C5.0 decision tree algorithm
            1. Choosing the best split
            2. Pruning the decision tree
        2. Example – identifying risky bank loans using C5.0 decision trees
          1. Step 1 – collecting data
          2. Step 2 – exploring and preparing the data
            1. Data preparation – creating random training and test datasets
          3. Step 3 – training a model on the data
          4. Step 4 – evaluating model performance
          5. Step 5 – improving model performance
            1. Boosting the accuracy of decision trees
            2. Making some mistakes more costly than others
        3. Understanding classification rules
          1. Separate and conquer
          2. The 1R algorithm
          3. The RIPPER algorithm
          4. Rules from decision trees
          5. What makes trees and rules greedy?
        4. Example – identifying poisonous mushrooms with rule learners
          1. Step 1 – collecting data
          2. Step 2 – exploring and preparing the data
          3. Step 3 – training a model on the data
          4. Step 4 – evaluating model performance
          5. Step 5 – improving model performance
      6. 6. Forecasting Numeric Data – Regression Methods
        1. Understanding regression
          1. Simple linear regression
          2. Ordinary least squares estimation
          3. Correlations
          4. Multiple linear regression
        2. Example – predicting medical expenses using linear regression
          1. Step 1 – collecting data
          2. Step 2 – exploring and preparing the data
            1. Exploring relationships among features – the correlation matrix
            2. Visualizing relationships among features – the scatterplot matrix
          3. Step 3 – training a model on the data
          4. Step 4 – evaluating model performance
          5. Step 5 – improving model performance
            1. Model specification – adding non-linear relationships
            2. Transformation – converting a numeric variable to a binary indicator
            3. Model specification – adding interaction effects
            4. Putting it all together – an improved regression model
        3. Understanding regression trees and model trees
          1. Adding regression to trees
        4. Example – estimating the quality of wines with regression trees and model trees
          1. Step 1 – collecting data
          2. Step 2 – exploring and preparing the data
          3. Step 3 – training a model on the data
            1. Visualizing decision trees
          4. Step 4 – evaluating model performance
            1. Measuring performance with the mean absolute error
          5. Step 5 – improving model performance
      7. 7. Black Box Methods – Neural Networks and Support Vector Machines
        1. Understanding neural networks
          1. From biological to artificial neurons
          2. Activation functions
          3. Network topology
            1. The number of layers
            2. The direction of information travel
            3. The number of nodes in each layer
          4. Training neural networks with backpropagation
        2. Example – Modeling the strength of concrete with ANNs
          1. Step 1 – collecting data
          2. Step 2 – exploring and preparing the data
          3. Step 3 – training a model on the data
          4. Step 4 – evaluating model performance
          5. Step 5 – improving model performance
        3. Understanding Support Vector Machines
          1. Classification with hyperplanes
            1. The case of linearly separable data
            2. The case of nonlinearly separable data
          2. Using kernels for non-linear spaces
        4. Example – performing OCR with SVMs
          1. Step 1 – collecting data
          2. Step 2 – exploring and preparing the data
          3. Step 3 – training a model on the data
          4. Step 4 – evaluating model performance
          5. Step 5 – improving model performance
      8. 8. Finding Patterns – Market Basket Analysis Using Association Rules
        1. Understanding association rules
          1. The Apriori algorithm for association rule learning
          2. Measuring rule interest – support and confidence
          3. Building a set of rules with the Apriori principle
        2. Example – identifying frequently purchased groceries with association rules
          1. Step 1 – collecting data
          2. Step 2 – exploring and preparing the data
            1. Data preparation – creating a sparse matrix for transaction data
            2. Visualizing item support – item frequency plots
            3. Visualizing the transaction data – plotting the sparse matrix
          3. Step 3 – training a model on the data
          4. Step 4 – evaluating model performance
          5. Step 5 – improving model performance
            1. Sorting the set of association rules
            2. Taking subsets of association rules
            3. Saving association rules to a file or data frame
      9. 9. Finding Groups of Data – Clustering with k-means
        1. Understanding clustering
          1. Clustering as a machine learning task
          2. The k-means clustering algorithm
            1. Using distance to assign and update clusters
            2. Choosing the appropriate number of clusters
        2. Example – finding teen market segments using k-means clustering
          1. Step 1 – collecting data
          2. Step 2 – exploring and preparing the data
            1. Data preparation – dummy coding missing values
            2. Data preparation – imputing the missing values
          3. Step 3 – training a model on the data
          4. Step 4 – evaluating model performance
          5. Step 5 – improving model performance
      10. 10. Evaluating Model Performance
        1. Measuring performance for classification
          1. Working with classification prediction data in R
          2. A closer look at confusion matrices
          3. Using confusion matrices to measure performance
          4. Beyond accuracy – other measures of performance
            1. The kappa statistic
            2. Sensitivity and specificity
            3. Precision and recall
            4. The F-measure
          5. Visualizing performance trade-offs
            1. ROC curves
        2. Estimating future performance
          1. The holdout method
            1. Cross-validation
            2. Bootstrap sampling
      11. 11. Improving Model Performance
        1. Tuning stock models for better performance
          1. Using caret for automated parameter tuning
            1. Creating a simple tuned model
            2. Customizing the tuning process
        2. Improving model performance with meta-learning
          1. Understanding ensembles
          2. Bagging
          3. Boosting
          4. Random forests
            1. Training random forests
            2. Evaluating random forest performance
      12. 12. Specialized Machine Learning Topics
        1. Working with proprietary files and databases
          1. Reading from and writing to Microsoft Excel, SAS, SPSS, and Stata files
          2. Querying data in SQL databases
        2. Working with online data and services
          1. Downloading the complete text of web pages
          2. Scraping data from web pages
            1. Parsing XML documents
            2. Parsing JSON from web APIs
        3. Working with domain-specific data
          1. Analyzing bioinformatics data
          2. Analyzing and visualizing network data
        4. Improving the performance of R
          1. Managing very large datasets
            1. Generalizing tabular data structures with dplyr
            2. Making data frames faster with data.table
            3. Creating disk-based data frames with ff
            4. Using massive matrices with bigmemory
          2. Learning faster with parallel computing
            1. Measuring execution time
            2. Working in parallel with multicore and snow
            3. Taking advantage of parallel with foreach and doParallel
            4. Parallel cloud computing with MapReduce and Hadoop
          3. GPU computing
          4. Deploying optimized learning algorithms
            1. Building bigger regression models with biglm
            2. Growing bigger and faster random forests with bigrf
            3. Training and evaluating models in parallel with caret
    8. A. Reflect and Test Yourself Answers
      1. Module 1: Data Analysis with R
        1. Chapter 1: RefresheR
        2. Chapter 2: The Shape of Data
        3. Chapter 3: Describing Relationships
        4. Chapter 4: Probability
        5. Chapter 5: Using Data to Reason About the World
        6. Chapter 6: Testing Hypotheses
        7. Chapter 7: Bayesian Methods
        8. Chapter 8: Predicting Continuous Variables
        9. Chapter 9: Predicting Categorical Variables
        10. Chapter 10: Sources of Data
        11. Chapter 11: Dealing with Messy Data
        12. Chapter 12: Dealing with Large Data
      2. Module 2: R Graphs
        1. Chapter 1: R Graphics
        2. Chapter 2: Basic Graph Functions
        3. Chapter 3: Beyond the Basics – Adjusting Key Parameters
        4. Chapter 4: Creating Scatter Plots
        5. Chapter 5: Creating Line Graphs and Time Series Charts
        6. Chapter 6: Creating Bar, Dot, and Pie Charts
        7. Chapter 7: Creating Histograms
        8. Chapter 8: Box and Whisker Plots
        9. Chapter 9: Creating Heat Maps and Contour Plots
      3. Module 4: Mastering R for Quantitative Finance
        1. Chapter 1: Time Series Analysis
        2. Chapter 3: Forecasting Volume
        3. Chapter 4: Big Data – Advanced Analytics
        4. Chapter 5: FX Derivatives
        5. Chapter 6: Interest Rate Derivatives and Models
        6. Chapter 7: Exotic Options
        7. Chapter 8: Optimal Hedging
        8. Chapter 9: Fundamental Analysis
      4. Module 5: Machine Learning with R
        1. Chapter 1: Introducing Machine Learning
        2. Chapter 2: Managing and Understanding Data
        3. Chapter 3: Lazy Learning – Classification Using Nearest Neighbors
        4. Chapter 4: Probabilistic Learning – Classification Using Naive Bayes
        5. Chapter 5: Divide and Conquer – Classification Using Decision Trees and Rules
        6. Chapter 6: Forecasting Numeric Data – Regression Methods
        7. Chapter 7: Black Box Methods – Neural Networks and Support Vector Machines
        8. Chapter 8: Finding Patterns – Market Basket Analysis Using Association Rules
    9. B. Bibliography
    10. Index

Product information

  • Title: R: Data Analysis and Visualization
  • Author(s): Tony Fischetti, Brett Lantz, Jaynal Abedin, Hrishi V. Mittal, Bater Makhabel, Edina Berlinger, Ferenc Illés, Milán Badics, Ádám Banai, Gergely Daróczi, Barbara Dömötör, Gergely Gabler, Dániel Havran, Péter Juhász, István Margitai, Balázs Márkus, Péter Medvegyev, Julia Molnár, Balázs Árpád Szucs, Ágnes Tuza, Tamás Vadász, Kata Váradi, Ágnes Vidovics-Dancs
  • Release date: June 2016
  • Publisher(s): Packt Publishing
  • ISBN: 9781786463500