With Early Release ebooks, you get books in their earliest form (the author's raw and unedited content as they write it), so you can take advantage of these technologies long before the official release of these titles. You'll also receive updates when significant changes are made, new chapters are available, and the final ebook bundle is released.
Feature engineering is essential to applied machine learning, but using domain knowledge to strengthen your predictive models can be difficult and expensive. To help fill the information gap on feature engineering, this complete hands-on guide teaches beginning-to-intermediate data scientists how to work with this widely practiced but little-discussed topic.
Author Alice Zheng explains common practices and mathematical principles to help engineer features for new data and tasks. If you understand basic machine learning concepts like supervised and unsupervised learning, you’re ready to get started. Not only will you learn how to implement feature engineering in a systematic and principled way, you’ll also learn how to practice better data science.
Learn exactly what feature engineering is, why it’s important, and how to do it well
Explore various techniques such as feature scaling, bin-counting, and frequent sequence mining
Understand what unsupervised feature learning is and how it works in deep learning
See the methods in action for text mining, image tagging, churn prediction, and targeted advertising
Chapter 2. Basic Feature Engineering for Text Data: Flatten and Filter
Chapter 3. The Effects of Feature Scaling: From Bag-of-Words to Tf-Idf
Chapter 4. Categorical data: Counting eggs in the age of robotic chickens
Chapter 5. Sequences and Series: Dealing with Event Logs
Chapter 6. Dimensionality Reduction: Squashing the Data Pancake with PCA
Chapter 7. Turning images into features: Feature learning with deep learning
Chapter 8. Metrics and metric learning
Chapter 9. Miscellaneous but important
Appendix A. Linear Modeling and Linear Algebra Basics
Alice is a technical leader in the field of machine learning. Her experience spans algorithm and platform development and applications. Currently, she is a Senior Manager in Amazon's Ad Platform. Previous roles include Director of Data Science at GraphLab/Dato/Turi, machine learning researcher at Microsoft Research, Redmond, and postdoctoral fellow at Carnegie Mellon University. She received a Ph.D. in Electrical Engineering and Computer Science, and B.A. degrees in Computer Science and Mathematics, all from U.C. Berkeley.