Machine Learning for Hackers
Case Studies and Algorithms to Get You Started
Publisher: O'Reilly Media
Released: February 2012
Pages: 324

If you’re an experienced programmer interested in crunching data, this book will get you started with machine learning—a toolkit of algorithms that enables computers to train themselves to automate useful tasks. Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation.

Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you’ll learn how to analyze sample datasets and write simple machine learning algorithms. Machine Learning for Hackers is ideal for programmers from any background, including business, government, and academic research.

  • Develop a naïve Bayesian classifier to determine if an email is spam, based only on its text
  • Use linear regression to predict the number of page views for the top 1,000 websites
  • Learn optimization techniques by attempting to break a simple letter cipher
  • Compare and contrast U.S. Senators statistically, based on their voting records
  • Build a “whom to follow” recommendation system from Twitter data
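
To give a flavor of the first case study listed above, here is a minimal sketch of a naive Bayes text classifier. It is not the authors' code: the book works in R, while this illustration uses plain Python, and the toy training messages, the "spam"/"ham" labels, and the function names are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting is enough for a toy example.
    return text.lower().split()


def train(messages):
    """Count words per label from a list of (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest log posterior for the given text."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Log prior: the share of training messages carrying this label.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(counts.values())
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words never give log(0).
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, made up for this illustration only.
TRAINING = [
    ("win cash prize now", "spam"),
    ("cheap pills win big", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
word_counts, label_counts = train(TRAINING)
```

Calling `classify("win a cash prize", word_counts, label_counts)` scores the message against each label using only its words, which is the "based only on its text" idea in the first bullet.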
Customer Reviews

Average rating: 3.5 (based on 10 reviews)

Ratings Distribution

  • 5 Stars (2)
  • 4 Stars (4)
  • 3 Stars (2)
  • 2 Stars (1)
  • 1 Star (1)

60% of respondents would recommend this to a friend.

Pros

  • Helpful examples (5)
  • Well-written (5)
  • Easy to understand (4)

Best Uses

  • Intermediate (6)

Reviewer Profile

  • Developer (7)

    Reviewed by 10 customers

    (5 of 5 customers found this review helpful)

     
    2.0

    There's just too much missing

    By Rob

    from Seattle, WA

    About Me Developer

    Verified Reviewer

    Pros

    • Easy to understand
    • Well-written

    Cons

    • Not comprehensive enough
    • Too basic

    Best Uses

      Comments about oreilly Machine Learning for Hackers:

      This was a pretty disappointing text.

      I'm reading this as an experienced programmer and hobbyist AI/ML/Statistics guru, and there's just too much that's missing for me to recommend this book. It reads less like "Machine Learning for Hackers" and more like "Statistics for People Who Want to Use R Without Understanding the Fundamentals." I found myself excited at the beginning of the chapter and disappointed by how little actual detail or information was provided beyond "type these commands to get numbers and hope for a good number."

      Chapter 12 ("Model Comparison") is a great example of this. While talking about SVMs, this is a snippet of what's provided:

      "As you can see from looking at Figure 12-6, the rather complicated decision boundary chosen by the sigmoid kernel wraps around as we change the value of gamma. To really get a better intuition for what's happening, we recommend that you experiment with many more values of gamma than the four we've just shown you."

      ... Really? To get a better feel for what's happening, I should just try more values for gamma? There is no mention of what, fundamentally, gamma is. The reader is supposed to just try different values and not worry about any details. This is one example, but it is a good example of how disappointed I was near the end of many chapters.

      I understand this book is targeted at beginners, but the number of times the authors gloss over (or cleverly avoid) actually explaining an incredibly fundamental piece of a chapter leaves the reader wondering if they genuinely understand the material themselves.

      I'm giving it two stars because it's easy to read, there are decent suggestions as to which R packages are useful for statistical analysis, and because I enjoyed the overfitting examples and graphical depictions early on in the book.

      The book is not terrible, but it is lacking.

      (1 of 1 customers found this review helpful)

       
      3.0

      Basic machine learning theory

      By dahla

      from Ringsted, Denmark

      About Me Developer

      Verified Reviewer

      Pros

      • Practical Examples

      Cons

      • Not for beginners

      Best Uses

        Comments about oreilly Machine Learning for Hackers:

        I've long been fascinated by Artificial Intelligence and wanted to get started without knowing where to begin. This is why I picked up this book, thinking this would be a good starting point.
        Truth be told, this was a good book and gave me some insight, but it was not what I was looking for. For beginners in AI, this is not the starting point.
        What this book did give me, though, was a brush-up on statistics and prediction and an introduction to R. Over the course of the book, the authors build up knowledge of how to use predictions, estimates, clustering, and similar techniques so that a machine can learn what to do next based on previous events. The theory is then backed up by practical examples in R.
        The chapter I liked most was about building a simple recommendation engine for whom to follow on Twitter based on your current profile. That example worked through some graph theory combined with clustering models, with graphics summarizing the main points of the chapter.
        Unfortunately, in the end I still felt left in the dark, not knowing where to go from here. R seems like a really strong language for many types of statistical analysis, but I have yet to see how I should use it in a mainstream application. This is probably due to a lack of knowledge on my side, but it underlines my point that this is not a "beginners" book on machine learning.

        To summarize, the authors present basic statistical models that can be used to aid machine learning, combined with practical examples. But you need a higher baseline of previous knowledge about machine learning, and ideas about how to use it, in order to fully enjoy this book.

         
        4.0

        Machine Learning for Hackers

        By Mary Anne

        from Portland, Oregon

        About Me Data Scientist

        Verified Reviewer

        Pros

        • Helpful examples
        • R Plyr Svm Glmnet

        Cons

          Best Uses

          • Intermediate

          Comments about oreilly Machine Learning for Hackers:

          Machine Learning for Hackers gets you started using R for machine learning. The book does a good job telling you how to install R and where to find help.
          There are lots of examples of how to explore data using ggplot2. Other packages covered include plyr, which the authors liken to MapReduce; tm, which is used in polynomial regression; glmnet and its lambda parameter; and the class package, which provides the k-nearest neighbor algorithm.

          (5 of 8 customers found this review helpful)

           
          1.0

          Broken Code

          By Craig

          from Sacramento, CA

          Verified Reviewer

          Pros

            Cons

            • Too many errors

            Best Uses

              Comments about oreilly Machine Learning for Hackers:

              A book heavily focused on the results of code to illustrate concepts takes on a BIG risk of that code being or becoming broken. The UFO example should refer to Unidentified Faulty Objects. Used the online code to work through a few more steps but still ended up with errors, errors that should not occur when cutting and pasting.

              (2 of 2 customers found this review helpful)

               
              4.0

              Great book

              By Filipe X

              from Recife, Brazil

              About Me Developer, Maker

              Verified Reviewer

              Pros

              • Easy to understand

              Cons

                Best Uses

                • Intermediate
                • Student

                Comments about oreilly Machine Learning for Hackers:

                It goes through the very basics of statistics to build the knowledge needed for the machine learning algorithms. On the other hand, it doesn't explain the code blocks it shows in depth, leaving the reader to work out the particularities of the R programming language. R makes it easy to process data in a few lines of code; on the downside, it's a very different language, so it takes some effort to understand. For beginners in R, it's very valuable to look up and understand the functions used in order to shed light on the algorithms. This book is truly made for hackers, as it requires a low level of statistics and a high level of curiosity to play with code. It also uses real-world data in its examples, which makes it even more attractive and fun.

                (10 of 10 customers found this review helpful)

                 
                3.0

                Enjoyable but light on useful detail

                By XYZ

                from Cambridge, MA

                About Me Developer

                Verified Reviewer

                Pros

                • Concise
                • Easy to understand
                • Well-written

                Cons

                • Not comprehensive enough
                • Too basic

                Best Uses

                • Intermediate

                Comments about oreilly Machine Learning for Hackers:

                (Disclosure: I received a free review copy of this book.)

                I had high hopes for this book after the first few chapters. The emphasis in the early chapters on cleaning data rings true to anyone who has ever had to deal with a body of real-world data.

                But after that it fell into a repetitive pattern: state a problem, give a nontechnical description of a machine learning algorithm, and explain how to call the appropriate ML library in R. With no math and little description of most algorithms, if you want to do something besides use R's built-in libraries, this book isn't so helpful.

                The writing style is lively and enjoyable, and the authors picked interesting real-world examples. They probably could write a really good book on machine learning, but this one isn't it.

                (0 of 7 customers found this review helpful)

                 
                4.0

                The best book to start machine learning

                By abhi1one

                from india

                Pros

                • Easy to understand
                • Helpful examples

                Cons

                  Best Uses

                    Comments about oreilly Machine Learning for Hackers:

                    this may be the book that helped me to start hacking AI and ML!!!

                    (11 of 12 customers found this review helpful)

                     
                    4.0

                    Plenty of content with good examples

                    By Peter

                    from Melbourne, Australia

                    About Me Developer

                    Verified Reviewer

                    Pros

                    • Concise
                    • Helpful examples
                    • Well-written

                    Cons

                      Best Uses

                      • Expert
                      • Intermediate

                      Comments about oreilly Machine Learning for Hackers:

                      Machine Learning for Hackers provides an introduction to Machine Learning and the increasingly popular statistics oriented language: R.

                      The book covers the basic concepts and some useful tools, including:

                      * An introduction to R;
                      * Basic stats and probability;
                      * Supervised and unsupervised learning;
                      * Linear regression and categorization;
                      * Non-linear data and regularization;
                      * Principal Component Analysis (PCA) and input correlation;
                      * Multidimensional scaling (MDS) for clustering;
                      * k-nearest neighbour (kNN) for social network analysis; and
                      * SVMs for non-linear classification.

                      The general structure of each section is to first introduce a new concept, then demonstrate it by applying the concept to a trivial data set. Next, the technique is applied to a real data set. This structure is a great way to understand a technique.

                      The complete process of first massaging the data and then determining the technique to apply is covered. Occasionally the author makes a wrong turn and the analysis fails. The demonstration of failure, why it occurs and what to do about it is a great feature of the book.

                      The book is almost completely lacking in any of the mathematics or workings of the underlying algorithms being used, which may be considered a good or bad thing. Sometimes the book felt more like a tutorial on using R's various machine learning packages, rather than learning about machine learning itself.

                      If you aren't familiar with R or machine learning, this book presents a significant learning curve. Unfortunately, R's syntax can be quite opaque, even to experienced programmers. Indeed, due to the heavy R component in this book, a better title may have been "Machine Learning with R".

                      I'm not sure you can "hack" machine learning without properly understanding the underlying concepts, but with this book you can undoubtedly try.

                      The book presents a relatively quick, somewhat cursory overview of Machine Learning. It provides a good starting point for further study.

                      (1 of 6 customers found this review helpful)

                       
                      5.0

                      Accurate book for hard core programmers

                      By Sheik

                      from Chennai, India

                      About Me Developer

                      Verified Reviewer

                      Pros

                      • Accurate
                      • Helpful examples
                      • Well-written

                      Cons

                        Best Uses

                        • Expert
                        • Intermediate
                        • Student

                        Comments about oreilly Machine Learning for Hackers:

                        When you have enough time on the weekend and want to learn some truly interesting and futuristic concepts in computing, read this book and then work through the examples. If you are a serious developer and coding is your passion, this book will take you up a level and spark innovative ideas for your products. For academics, this should be one of the papers in your course. A very good book from O'Reilly by authors with real field experience.

                        (6 of 16 customers found this review helpful)

                         
                        5.0

                        Getting the most out of your computer

                        By DGGONZALEZ

                        from Buenos Aires, Argentina

                        About Me Developer

                        Verified Reviewer

                        Pros

                        • Helpful examples
                        • Well-written

                        Cons

                          Best Uses

                          • Intermediate

                          Comments about oreilly Machine Learning for Hackers:

                          The most important part of the book's title is its last words: "for hackers".
                          If you are the kind of person who wants to do more with your computer than shoot aliens, write school papers, or read your email, this book will teach you to turn it into a "machine that learns".

                          The enormous amount of data produced in real life makes it impossible to process by hand, so we have to turn to computers. However, computers have to be taught criteria for how to handle the information. The authors teach us some of those criteria, as well as how to implement them in projects that are useful to any user.

                          Best of all, we don't need any expensive or hard-to-find software. They use the R language, which is free software and costs nothing.

                          It is not a book for starting from scratch, but if you are interested, there is plenty of documentation on the web for the necessary preparation.

                          In short, a book I absolutely recommend.

                          Buying Options
                          Ebook: $31.99
                          Formats:  DAISY, ePub, Mobi, PDF
                          Print & Ebook: $43.99
                          Print: $39.99