Book description
This book covers the fundamentals of machine learning with Python in a concise and dynamic manner. It also covers data mining and large-scale machine learning using Apache Spark.
About This Book
Take your first steps in the world of data science by understanding the tools and techniques of data analysis
Train efficient machine learning models in Python using supervised and unsupervised learning methods
Learn how to use Apache Spark for processing Big Data efficiently
Who This Book Is For
If you are a budding data scientist or a data analyst who wants to analyze and gain actionable insights from data using Python, this book is for you. Programmers with some experience in Python who want to enter the lucrative world of Data Science will also find this book to be very useful, but you don't need to be an expert Python coder or mathematician to get the most from this book.
What You Will Learn
Learn how to clean your data and ready it for analysis
Implement the popular clustering and regression methods in Python
Train efficient machine learning models using decision trees and random forests
Visualize the results of your analysis using Python’s Matplotlib library
Use Apache Spark’s MLlib package to perform machine learning on large datasets
In Detail
Join Frank Kane, who worked on Amazon and IMDb’s machine learning algorithms, as he guides you on your first steps into the world of data science. Hands-On Data Science and Python Machine Learning gives you the tools that you need to understand and explore the core topics in the field, and the confidence and practice to build and analyze your own machine learning models. With the help of interesting and easy-to-follow practical examples, Frank Kane explains potentially complex topics such as Bayesian methods and K-means clustering in a way that anybody can understand.
Based on Frank’s successful data science course, Hands-On Data Science and Python Machine Learning empowers you to conduct data analysis and perform efficient machine learning using Python. Let Frank help you unearth the value in your data using the various data mining and data analysis techniques available in Python, and develop efficient predictive models to forecast future results. You will also learn how to perform large-scale machine learning on Big Data using Apache Spark. The book covers preparing your data for analysis, training machine learning models, and visualizing the final data analysis.
Style and approach
This comprehensive book blends theory with hands-on code examples in Python that you can refer to at any time.
Table of contents
- Preface
- Getting Started
- Statistics and Probability Refresher, and Python Practice
- Matplotlib and Advanced Probability Concepts
- A crash course in Matplotlib
- Generating multiple plots on one graph
- Saving graphs as images
- Adjusting the axes
- Adding a grid
- Changing line types and colors
- Labeling axes and adding a legend
- A fun example
- Generating pie charts
- Generating bar charts
- Generating scatter plots
- Generating histograms
- Generating box-and-whisker plots
- Try it yourself
- Covariance and correlation
- Conditional probability
- Bayes' theorem
- Summary
- Predictive Models
- Machine Learning with Python
- Machine learning and train/test
- Using train/test to prevent overfitting of a polynomial regression
- Bayesian methods - Concepts
- Implementing a spam classifier with Naïve Bayes
- K-Means clustering
- Clustering people based on income and age
- Measuring entropy
- Decision trees - Concepts
- Decision trees - Predicting hiring decisions using Python
- Ensemble learning
- Support vector machine overview
- Using SVM to cluster people by using scikit-learn
- Summary
- Recommender Systems
- More Data Mining and Machine Learning Techniques
- Dealing with Real-World Data
- Bias/variance trade-off
- K-fold cross-validation to avoid overfitting
- Data cleaning and normalisation
- Cleaning web log data
- Normalizing numerical data
- Detecting outliers
- Summary
- Apache Spark - Machine Learning on Big Data
- Testing and Experimental Design
Product information
- Title: Hands-On Data Science and Python Machine Learning
- Author(s): Frank Kane
- Release date: July 2017
- Publisher(s): Packt Publishing
- ISBN: 9781787280748