Machine Learning for Email
Spam Filtering and Priority Inbox
Publisher: O'Reilly Media
Final Release Date: October 2011
Pages: 148

If you’re an experienced programmer willing to crunch data, this concise guide will show you how to use machine learning to work with email. You’ll learn how to write algorithms that automatically sort and redirect email based on statistical patterns. Authors Drew Conway and John Myles White take a practical, case-study-driven approach rather than a traditional math-heavy presentation.

This book also includes a short tutorial on using the popular R language to manipulate and analyze data. You’ll get clear examples for analyzing sample data and writing machine learning programs with R.

  • Mine email content with R functions, using a collection of sample files
  • Analyze the data and use the results to write a Bayesian spam classifier
  • Rank email by importance, using factors such as thread activity
  • Use your email ranking analysis to write a priority inbox program
  • Test your classifier and priority inbox with a separate email sample set
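
As a rough illustration of the Bayesian spam classifier mentioned in the list above, here is a minimal sketch. It is written in Python rather than the R used in the book, and it is not the authors' code: the tokenizer, the add-one smoothing, the equal priors, and the tiny training messages are all assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()

def train(messages):
    """Count, for each term, how many training messages contain it."""
    counts = Counter()
    for msg in messages:
        counts.update(set(tokenize(msg)))
    return counts, len(messages)

def log_score(text, counts, n_messages, prior=0.5):
    """Log of the prior times the smoothed per-term probabilities."""
    score = math.log(prior)
    for term in set(tokenize(text)):
        # Add-one smoothing (an assumption) so unseen terms do not
        # force the whole product to zero.
        p = (counts[term] + 1) / (n_messages + 2)
        score += math.log(p)
    return score

def classify(text, spam_model, ham_model):
    """Label a message by whichever class gives the higher log score."""
    spam = log_score(text, *spam_model)
    ham = log_score(text, *ham_model)
    return "spam" if spam > ham else "ham"

# Hypothetical toy training data, not taken from the book's sample files.
spam_model = train(["win cash now", "cheap pills win prize"])
ham_model = train(["meeting notes attached", "lunch meeting tomorrow"])

print(classify("win a prize now", spam_model, ham_model))
print(classify("lunch meeting", spam_model, ham_model))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, which matters once messages contain hundreds of terms.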
Customer Reviews


(based on 1 review)

practical machine learning is introduced

By hu, from Tokyo, Japan

About Me: Developer

Verified Reviewer


Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples

Cons

  • Too basic

Best Uses

  • Intermediate
  • Novice
  • Student

Comments about O'Reilly Machine Learning for Email:

Thanks to the R programming language, the reader can concentrate on the main goal: understanding the core procedures of machine learning. Because the authors also explain the code precisely, readers can follow the techniques even if they don't understand every part of the code.
The machine learning methods introduced are just basic statistical methods. Readers with prior machine learning experience may find them a little tedious. However, the approach is sufficient to classify spam and ham.
In addition to classifying spam and ham, the book introduces a way to rank emails, with many practical ideas:
- If a user replies soon after viewing an email, that email is probably important to them.
- If a user interacts with a thread over a long period, the thread is probably important to them. Therefore, the terms that appear in the thread are ranked highly.
Throughout the book, the sample code uses real sample email data that can be obtained from the web, so the methods address a practical, if simple, use case.
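
The two ranking ideas in the review can be sketched roughly as follows. This is a Python illustration only, not the book's R code: the function name `rank_score`, the log scaling, and the sample emails are all hypothetical choices made for the example.

```python
import math

def rank_score(seconds_to_reply, thread_span_seconds):
    """Combine the two heuristics into a single priority score.

    A shorter delay before replying raises the score, and a longer
    period of activity in the thread raises it too. The log10 scaling
    and the +10 offset are assumptions chosen to keep values tame.
    """
    reply_weight = 1.0 / math.log10(seconds_to_reply + 10)
    thread_weight = math.log10(thread_span_seconds + 10)
    return reply_weight * thread_weight

# Hypothetical emails: (subject, seconds until reply, thread span in seconds)
emails = [
    ("newsletter", 86400, 60),
    ("project thread", 60, 86400),
    ("quick question", 300, 3600),
]

# Highest-priority email first.
ranked = sorted(emails, key=lambda e: rank_score(e[1], e[2]), reverse=True)
for subject, reply_delay, span in ranked:
    print(subject)
```

Taking logarithms keeps a single very long thread or very slow reply from dominating the score, which is one plausible way to combine features measured on very different scales.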


Buying Options
Ebook:  $20.99
Formats:  DAISY, ePub, Mobi, PDF
Print & Ebook:  $27.49
Print:  $24.99