Book description
Create data mining algorithms
About This Book
- Develop a strong strategy to solve predictive modeling problems using the most popular data mining algorithms
- Work through real-world case studies that take you from novice to intermediate in applying data mining techniques
- Apply cutting-edge sentiment analysis techniques to real-world social media data using R
This Learning Path is for R developers who are looking to make a career in data analysis or data mining. Those who encounter data mining problems of varying complexity in the web, text, numerical, political, and social media domains will find the information they need in this single Learning Path.
What You Will Learn
- Discover how to manipulate data in R
- Get to know top classification algorithms written in R
- Explore solutions written in R based on R Hadoop projects
- Apply data management skills in handling large data sets
- Acquire knowledge about neural network concepts and their applications in data mining
- Create predictive models for classification, prediction, and recommendation
- Use various libraries on R CRAN for data mining
- Discover more about the potential of data, its pitfalls, and inferential gotchas
- Gain an insight into the concepts of supervised and unsupervised learning
- Delve into exploratory data analysis
- Understand the minute details of sentiment analysis
Data mining is the first step toward understanding and making sense of large volumes of data. Properly mined data forms the basis of all data analysis and computing performed on it. This Learning Path takes you from the very basics of data mining to advanced data mining techniques and ends with a specialized branch of data mining: social media mining.
You will learn how to manipulate data with R using code snippets and how to mine frequent patterns, associations, and correlations while working with R programs. You will discover how to write code for various prediction models, stream data, and time-series data. You will also be introduced to solutions written in R based on R Hadoop projects.
Once you are comfortable with data mining in R, you will move on to applying your knowledge through end-to-end data mining projects. You will learn how to apply different mining concepts to various statistical and data applications in a wide range of fields. At this stage, you will be able to complete complex data mining cases and handle the issues you might encounter during projects.
After this, you will gain hands-on experience of generating insights from social media data. You will get detailed instructions on how to obtain, process, and analyze a variety of socially generated data, along with the theoretical background needed to interpret your findings accurately. The R code and example data provided can serve as a springboard for your own analyses of business, social, or political data.
This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products:
- Learning Data Mining with R by Bater Makhabel
- R Data Mining Blueprints by Pradeepta Mishra
- Social Media Mining with R by Nathan Danneman and Richard Heimann
A complete package that will take you from the basics of data mining to advanced data mining techniques, ending with a specialized branch of data mining: social media mining.
Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.
Table of contents
- R: Mining Spatial, Text, Web, and Social Media Data
- Table of Contents
- Credits
- Preface
- 1. Module 1
- 1. Warming Up
- 2. Mining Frequent Patterns, Associations, and Correlations
- An overview of associations and patterns
- Market basket analysis
- Hybrid association rules mining
- Mining sequence dataset
- The R implementation
- High-performance algorithms
- Time for action
- Summary
- 3. Classification
- Classification
- Generic decision tree induction
- High-value credit card customers classification using ID3
- Web spam detection using C4.5
- Web key resource page judgment using CART
- Trojan traffic identification method and Bayes classification
- Identify spam e-mail and Naïve Bayes classification
- Rule-based classification of player types in computer games and rule-based classification
- Time for action
- Summary
- 4. Advanced Classification
- 5. Cluster Analysis
- 6. Advanced Cluster Analysis
- Customer categorization analysis of e-commerce and DBSCAN
- Clustering web pages and OPTICS
- Visitor analysis in the browser cache and DENCLUE
- Recommendation system and STING
- Web sentiment analysis and CLIQUE
- Opinion mining and WAVE clustering
- User search intent and the EM algorithm
- Customer purchase data analysis and clustering high-dimensional data
- SNS and clustering graph and network data
- Time for action
- Summary
- 7. Outlier Detection
- Credit card fraud detection and statistical methods
- Activity monitoring – the detection of fraud involving mobile phones and proximity-based methods
- Intrusion detection and density-based methods
- Intrusion detection and clustering-based methods
- Monitoring the performance of the web server and classification-based methods
- Detecting novelty in text, topic detection, and mining contextual outliers
- Collective outliers on spatial data
- Outlier detection in high-dimensional data
- Time for action
- Summary
- 8. Mining Stream, Time-series, and Sequence Data
- 9. Graph Mining and Network Analysis
- 10. Mining Text and Web Data
- A. Algorithms and Data Structures
- 2. Module 2
- 1. Data Manipulation Using In-built R Data
- What is data mining?
- Introduction to the R programming language
- Data type conversion
- Sorting and merging dataframes
- Indexing or subsetting dataframes
- Date and time formatting
- Creating new functions
- Loop concepts - the for loop
- Loop concepts - the repeat loop
- Loop concepts - while conditions
- Apply concepts
- String manipulation
- NA and missing value management
- Missing value imputation techniques
- Summary
- 2. Exploratory Data Analysis with Automobile Data
- Univariate data analysis
- Bivariate analysis
- Multivariate analysis
- Understanding distributions and transformation
- Interpreting distributions
- Variable binning or discretizing continuous data
- Contingency tables, bivariate statistics, and checking for data normality
- Hypothesis testing
- Non-parametric methods
- Summary
- 3. Visualize Diamond Dataset
- 4. Regression with Automobile Data
- 5. Market Basket Analysis with Groceries Data
- 6. Clustering with E-commerce Data
- 7. Building a Retail Recommendation Engine
- 8. Dimensionality Reduction
- 9. Applying Neural Network to Healthcare Data
- 3. Module 3
- 1. Going Viral
- 2. Getting Started with R
- 3. Mining Twitter with R
- 4. Potentials and Pitfalls of Social Media Data
- 5. Social Media Mining – Fundamentals
- Key concepts of social media mining
- Good data versus bad data
- Understanding sentiments
- Sentiment polarity – data and classification
- Supervised social media mining – lexicon-based sentiment
- Supervised social media mining – Naive Bayes classifiers
- Unsupervised social media mining – Item Response Theory for text scaling
- Summary
- 6. Social Media Mining – Case Studies
- A. Conclusions and Next Steps
- Bibliography
- Index
Product information
- Title: R: Mining Spatial, Text, Web, and Social Media Data
- Author(s):
- Release date: June 2017
- Publisher(s): Packt Publishing
- ISBN: 9781788293747