Hilary Mason: Advanced Machine Learning
Publisher: O'Reilly Media
Final Release Date: August 2012
Run time: 2 hours 13 minutes

In this sequel to An Introduction to Machine Learning with Web Data, bit.ly lead scientist Hilary Mason shows you how to solve real-world problems with machine learning. Using real data from an actual ecommerce website, you will apply production-quality algorithms and see the issues that arise when working in a live environment.

Learn how to apply best practices to common types of machine learning problems, extract quantifiable data, and explore several open source tools and how to use them.

Segments include:

  • Introduction: Discover what the course covers, and what you'll learn.
  • Classification Part 1: Techniques and best practices to learn from your data.
  • Classification Part 2: Learning which attributes maximize desired behavior.
  • Clustering: How to explore and visualize unstructured data when your data is a mess and there's no known structure.
  • Learning from Data: Best practices for offline vs stream analysis.
  • Conclusions: Asking the right questions is hard. Once you've formulated the question you'll know whether your task is easy or hard.

Overall rating: 3.0 (based on 4 reviews)

Ratings Distribution

  • 5 Stars: 0
  • 4 Stars: 1
  • 3 Stars: 2
  • 2 Stars: 1
  • 1 Star: 0

Cons

  • Not comprehensive enough (4)
  • Too basic (4)

Best Uses

  • Novice (4)
  • Student (3)

Reviewer Profile

  • Developer (3)


    (4 of 5 customers found this review helpful)

     
    2.0

    An Introduction to Common Machine Learni

    By g-man

    from Brisbane, QLD, Australia

    About Me Developer

    Verified Reviewer

    Pros

    • Easy to understand
    • Helpful examples

    Cons

    • Not comprehensive enough
    • Too basic

    Best Uses

    • Novice

    Comments about oreilly Hilary Mason: Advanced Machine Learning:

    The title of this video series promised so much, but I'm slightly disappointed. Though many topics are covered, the detail of how each technique works is not conveyed to the extent that you will learn how to implement them. Presentation of the material is a bit awkward, but you do get a clear idea of what techniques are available and when to use them.

    Although each video is of high quality, I couldn't help thinking "Why?". Most of the time you're left watching folk sitting around a table trying to look interested in the speaker. This adds little to conveying any information to the viewer and is even distracting. On-screen content is presented with slightly lower quality which is counter-intuitive because it contains the key information being presented. I much prefer an on-screen style.

    If you have little experience with Machine Learning, you will be introduced to some fundamental techniques and pick up some good practices along the way. Decision Trees, K-Means Clustering, Sim-Hashing, Bloom Filters and many others are presented. The style is friendly to novices and you will not feel lost. When and how to use these methods is covered well and may assist you in identifying the appropriate one for your projects.

    In combination with the accompanying source code, one can get a reasonable overview of some important machine learning techniques. The example code uses many third-party, open-source libraries which are not easy to install and, more importantly, obscure the detail of the techniques being discussed.

    You will not gain much knowledge of how the techniques presented work but this series does provide good introductions to a range of really useful ones and demonstrates their appropriate use.

     
    4.0

    More "Intro to Machine Learning part 2"

    By Jim Schubert

    from Richmond, VA

    About Me Developer

    Verified Reviewer

    Pros

    • Easy to understand
    • Helpful examples

    Cons

    • Not comprehensive enough
    • Too basic

    Best Uses

    • Novice
    • Student

    Comments about oreilly Hilary Mason: Advanced Machine Learning:

    I watched this video as part of O'Reilly Media's blogger program. I haven't worked with machine learning topics in the past, and I was interested to learn a bit from this video. It turns out that I use many of the machine learning concepts in the linux terminal almost daily, but on much smaller data (personal computer logs). I've even parsed Apache logs in almost the same way as presented in this video's "Learning from your data" segment.

    At first, I was a little confused why this video is called "Advanced Machine Learning" because I didn't feel like any of the topics were all too advanced. Each segment seems to only skim the surface of a very general topic. In fact, it seems to me that this video is more of a continuation of Ms. Mason's other Machine Learning video on O'Reilly, "An Introduction to Machine Learning with Web Data." It may be more appropriately named "An introduction to data analysis", and that's not a bad thing! Don't be turned away by a misleading title. Fair warning: I've rated the video based on the content with my proposed title.

    If you're looking for an in-depth discussion of machine learning algorithms, this isn't the video for you. If you're looking for an introduction in getting things done with data, you should check out this video. Although the amount of information is pretty light, it is still a good way to get your start conceptually. If you look at the scripts and sample data provided in the code repository, you'll be off to a good start to learn more about your data.

    For instance, Hilary makes it a point to break things down into a few simple steps:

    1) What is your data?
    2) What do you want to learn from your data?
    3) How to extract that information.
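
    As a rough illustration of those three steps (this is not taken from the course's code repository), a minimal Python sketch might treat Apache-style access log lines as the data, ask how often each HTTP status code occurs, and extract the answer with a regular expression. The sample log lines and the name `count_status_codes` are made up for this example.

```python
import re
from collections import Counter

# Step 1: the data. These are invented lines in Apache Common Log Format.
sample_lines = [
    '127.0.0.1 - - [10/Aug/2012:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Aug/2012:13:55:40 -0700] "GET /missing HTTP/1.1" 404 209',
    '127.0.0.1 - - [10/Aug/2012:13:56:01 -0700] "GET /cart HTTP/1.1" 200 1450',
]

# Step 2: the question. How often does each HTTP status code occur?

# Step 3: the extraction. The status code is the three-digit field that
# follows the closing quote of the request string.
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def count_status_codes(lines):
    """Return a Counter mapping status code strings to their frequency."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:  # skip lines that do not look like access log entries
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for status, n in count_status_codes(sample_lines).most_common():
        print(status, n)
```

    The same function could be pointed at a real log file by passing an open file object instead of the sample list.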

    Again, I wouldn't recommend this video if you're a software engineer with a desire to learn in-depth machine learning algorithms. I do recommend this video if you're interested in understanding some fundamentals of machine learning and how they're applied in some advanced production scenarios (especially at bit.ly).

    I also recommend checking out the code examples and learning the basics of the Python modules used in the scripts. They will help you analyze your data in a meaningful way.

    (2 of 2 customers found this review helpful)

     
    3.0

    Cool ML Algorithms, needs more depth

    By D Witherspoon

    from Colorado

    About Me Developer

    Verified Reviewer

    Pros

    • Concise

    Cons

    • Not comprehensive enough
    • Too basic

    Best Uses

    • Novice
    • Student

    Comments about oreilly Hilary Mason: Advanced Machine Learning:

    The Advanced Machine Learning video collection is a quick presentation of some machine learning techniques and algorithms, covered in just over 2 hours. Even though this is a short period in which to cover any of the many algorithms in machine learning, there is a chance that you might learn something. If you are experienced in the many topics of machine learning, then you will know that you cannot cover anything in enough detail in 2 hours, and therefore this would not be helpful to you. If you are fairly new to the topic, then you might learn a little bit about interesting algorithms, but if your expectation is that you will be able to directly apply them or explain them to co-workers, then you will have to dive deeper somewhere else.

    There were some interesting algorithms covered that I had not worked with, like Bloom filters, simhashing, and Hamming distance. Hilary explains these algorithms through examples written in Python, utilizing libraries that have implemented them. The problem is that she does not go into enough detail for you to be able to implement them in another language, so you will need to research them to get a better understanding.

    I did enjoy the advice she gave that one way to become a better data analyst is to watch and talk with other data analysts to see the tools they use and the approaches they take. If you take that approach to what she is presenting here, then you will learn some new topics to apply to data mining, with the caveat that you will need to spend time researching to better understand the details of the algorithms.

    Personally, I would have enjoyed learning more details about random forest decision trees and dimensionality reduction. Since Hilary had the opportunity to create a collection of videos on advanced machine learning, she could have dived a bit deeper into the different algorithms and the situations in which you can apply them. She could have also taken the time to explain the results that are presented after running the algorithms.
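
    For readers who want a feel for two of the techniques named in this review, here is a minimal Python sketch of Hamming distance and a toy Bloom filter. It is written for illustration only and is not the code or the libraries used in the videos; the class name, the default sizes, and the choice of SHA-256 as the hash are all assumptions made for this example.

```python
import hashlib


def hamming_distance(a, b):
    """Count the bit positions in which two non-negative integers differ."""
    return bin(a ^ b).count("1")


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            data = f"{seed}:{item}".encode("utf-8")
            digest = hashlib.sha256(data).hexdigest()
            yield int(digest, 16) % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


if __name__ == "__main__":
    print(hamming_distance(0b1011, 0b1001))  # the two values differ in one bit
    seen = BloomFilter()
    seen.add("http://bit.ly/example")
    print(seen.might_contain("http://bit.ly/example"))
```

    Simhashing combines the two ideas loosely: documents are reduced to fixed-width fingerprints, and a small Hamming distance between fingerprints suggests near-duplicate content.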

    On a side note, it seems like even the students in the class don't fully understand what is being presented to them. Not to mention, what is the person with the iPad doing for the entire length of the videos? She never seems to look up, and there is no way that she is coding; if she is, then I would like to know what application she is using.

    The only people that I would recommend this collection of videos to are those interested in starting out as data scientists who have not taken a machine learning class. Otherwise, I would look at other books like "R in a Nutshell", the course provided by Stanford, or other free online courses.

    If you do get this collection of videos, I do recommend downloading the source and files from GitHub. Then you can follow through the examples even if you are using Windows. I would recommend installing Cygwin before watching the videos and making sure that you have Python and an editor configured.

    (4 of 4 customers found this review helpful)

     
    3.0

    Very good overview, but too shallow in d

    By mko

    from Poland

    Verified Reviewer

    Pros

    • Concise

    Cons

    • Not comprehensive enough
    • Too basic

    Best Uses

    • Novice
    • Student

    Comments about oreilly Hilary Mason: Advanced Machine Learning:

    I have seen presentations made by Hilary already (e.g. from Strata) and I think they were very good as conference material. During a conference, you have limited time and you obviously want to show as much as possible. On the other hand, when it comes to lectures and workshops, you have as much time as you can devote to the topic. That's why I expected a much deeper analysis of the topics covered in this particular video. My expectation was that by watching it I would learn all the topics and be able to apply them right after finishing the show. This was not quite the case.

    Let's talk about the bright sides first. I must admit that for people who are new to data analysis this video is really a good overview of some of the tools available on the market. There are lots of various algorithms and applications discussed here. You will learn about various metrics, decision trees, k-nearest neighbor basics, dimensionality reduction, principal component analysis, simhash, Hamming distance, Bloom filters, MapReduce, and Hadoop. And this is definitely a benefit for people who are not yet familiar with these topics. However, there is another side of the coin. The lecture is quite shallow. Basically, you will be presented with some basic examples for each topic, based on very simple data. The fact is that Hilary presents these basic data analyses in a quick and dirty way, using the CLI and Python, and does it quite efficiently. For sure this will be very useful for computer geeks who are familiar with the CLI (mostly Linux users and advanced OS X users) and Python itself. I'd dispute that Python is the best tool to visualize the results in the first place, but it's just a weapon of choice. It could be done in R as well, which, in my opinion, is far more suited to data analysis than Python is. When it comes to Windows users, I am pretty sure that they will not benefit from this video, as they mostly use Excel for brief data analysis, have no idea what a CLI is, and when they hear 'Python' they think 'ZOO'. Advanced Windows developers, please excuse my irony.

    The last thing I am not happy with here is the explanation of the results. Hilary simply assumes that values produced during calculations are self-explanatory. They are not. I think that for people who see the tree for the first time, a detailed explanation of how to read the tree would be very helpful. The same applies to other topics presented during the lecture.

    I'd suggest this video to people who are starting to work with data analysis and just want to get pointed in the right direction. Make sure to dig for details somewhere else. If you don't know Python and are not familiar with the CLI, I would strongly reconsider buying this video. Maybe "R in a Nutshell" would be a slightly better idea.

    Buying Options
    Immediate Access - Go Digital
    Video: $79.99
    (Streaming, Downloadable)