Programming Collective Intelligence
Building Smart Web 2.0 Applications
Publisher: O'Reilly Media
Released: August 2007
Pages: 368
Ebook: $31.99
Print & Ebook: $43.99
Print: $39.99
Customer Reviews

4.4 (based on 13 reviews)

Ratings Distribution

  • 5 Stars (7)
  • 4 Stars (4)
  • 3 Stars (2)
  • 2 Stars (0)
  • 1 Star (0)

REVIEWS

Reviewed by 13 customers

Displaying reviews 1-10

(12 of 12 customers found this review helpful)

 
5.0

The Python way to collective intelligence

By dwa

from Undisclosed

Comments about oreilly Programming Collective Intelligence:

Programming Collective Intelligence is a new book from O'Reilly, which was written by Toby Segaran. The author graduated from MIT and is currently working at Metaweb Technologies. He develops ways to put large public datasets into Freebase, a free online semantic database. You can find more information about him on his blog: http://blog.kiwitobes.com/.

Web 2.0 cannot exist without collective intelligence. The "giants" use it everywhere: YouTube recommends similar movies, Last.fm knows what you would like to listen to, Flickr knows which photos are your favorites, and so on. This technology powers intelligent search, clustering, price modeling, and ranking on the web. I cannot imagine a modern service without data analysis. That is why it is worth starting to read about it.

There are many titles about collective intelligence, but recently I have read two: this one and "Collective Intelligence in Action". Both are very pragmatic, but the O'Reilly one is more focused on the substance of CI. The code listings are much shorter (the examples are written in Python, so that was easy). In general, comparing these books is like comparing Java and Python. If you would like to build a recommendation engine the "in Action"/Java way, you would have to read the whole book, attach extra JARs, and design dozens of classes. The rapid Python way requires reading only 15 pages and voila, you have your first recommendations. It is awesome!

So how about the rest of the book? There are still 319 pages! Further chapters cover discovering groups, searching, ranking, optimization, document filtering, decision trees, price models, and genetic algorithms. The book explains how to implement simulated annealing, k-nearest neighbors, a Bayesian classifier, and many more. Take a look at the table of contents (here: http://oreilly.com/catalog/9780596529321/preview.html); it does not list all the algorithms, but you can find more information there.

Each chapter has about 20-30 pages. You do not have to read them all; you can choose the most important ones and still know what is going on. Every chapter contains a minimum amount of theoretical introduction, which might not be enough for total beginners. I recommend this book for students who have taken a statistics course (not only in IT or computer science). This book will show you how to use your knowledge in practice - there are many inspiring examples.

For those who do not know Python, do not be afraid - at the beginning you will find a short introduction to the language syntax. All listings are very short and well described by the author - sometimes line by line. The book also contains the necessary information about the basic standard libraries responsible for XML processing and downloading web pages.

If you would like to start learning about collective intelligence, I would strongly recommend reading "Programming Collective Intelligence" first, then "Collective Intelligence in Action". The first one shows how easy it is to implement basic algorithms; the second shows you how to use existing open source projects related to machine learning.

You can find more about this book on its catalogue page: http://oreilly.com/catalog/9780596529321/

(1 of 1 customers found this review helpful)

 
4.0

A fascinating read with lots of code examples

By www.thegeniusfiles.com

from Undisclosed

If you are a computer science student and want to learn about the algorithms and theory behind Web 2.0, this is a good place to start. Although the author tries to make the book intelligible to novices, you will benefit by having some previous programming experience - say, perhaps up to the 300 level. Also, the code is Python, so you might want to study up on that first. In my opinion a good Linux distro like Ubuntu will simplify the coding experience (it's easier to download and install the Python libraries in Ubuntu Synaptic than to install them in Windows).

The really nice thing about this book is that the author explains the principle of what each code example is doing before launching into the code. That's important because much of it is grounded in methods of statistical analysis.

As another reviewer pointed out, there are some errors in some of the code examples. If you have no prior experience with Python, this would be very confusing. However, you can access the revisions through Safari Online, so all is not lost.

If it weren't for the code errors, I'd give this book 5 out of 5.

(4 of 4 customers found this review helpful)

 
5.0

A visionary book that illuminates the Internet

By AlexeySmirnov

from Undisclosed

This is a visionary book because it predicts a lot of what will happen to the Internet soon. How do we process information in the Internet age? Instead of reading magazines and newspapers, we use blogs as our source of news. This is because blogs offer a much more customized news feed. In a typical newspaper, how much of the content is of interest to a given reader? I would guess half at most, and typically less than that.

I start my working day by consuming two sweet drinks. One drink is a cup of coffee. The other is a virtual information soup made of 100 blogs. I glance over most of the stories quickly using Google Reader and select those that I am interested in. I might read them in greater detail later on during the day, in the evening, or on a weekend. I do not know which drink gives me more pleasure - the delicious cup of coffee or the sweet virtual soup. I like the latter a lot because it is rich with media content - with bright images, cool videos, wow-type web pages.

However, I often discover news that I wish I had found out about earlier. In other words, there are so many news sources that reading them all, or even just looking at the headlines of major blogs, would take too much time. We need a targeted information delivery service.

This is the main idea of this book. In fact, it starts with explaining how to make recommendations given a set of preferences of a number of people and your own preferences. What are those cool things that you have not tried out yet but everybody else did? The example described in the book is applied to Delicious which does not offer recommendations yet.
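To make the idea above concrete, here is a minimal sketch of recommending unseen items from other people's preferences. The sample ratings, the names `prefs`, `similarity`, and `recommend`, and the choice of an inverse-Euclidean-distance similarity are all assumptions made for this illustration; this is not the book's own listing.

```python
from math import sqrt

# Hypothetical ratings: person -> {item: score}. Illustrative data only.
prefs = {
    "alice": {"site_a": 5.0, "site_b": 3.0, "site_c": 4.0},
    "bob": {"site_a": 4.0, "site_b": 2.0, "site_c": 5.0, "site_d": 5.0},
    "carol": {"site_a": 1.0, "site_b": 5.0, "site_d": 2.0},
}


def similarity(prefs, p1, p2):
    """Similarity in (0, 1] from the Euclidean distance over shared items."""
    shared = [item for item in prefs[p1] if item in prefs[p2]]
    if not shared:
        return 0.0
    distance = sqrt(sum((prefs[p1][i] - prefs[p2][i]) ** 2 for i in shared))
    return 1.0 / (1.0 + distance)


def recommend(prefs, person):
    """Rank items the person has not rated by similarity-weighted average."""
    weighted_totals = {}
    similarity_sums = {}
    for other in prefs:
        if other == person:
            continue
        sim = similarity(prefs, person, other)
        if sim <= 0:
            continue
        for item, score in prefs[other].items():
            if item in prefs[person]:
                continue  # only suggest things the person has not tried
            weighted_totals[item] = weighted_totals.get(item, 0.0) + sim * score
            similarity_sums[item] = similarity_sums.get(item, 0.0) + sim
    ranked = [(weighted_totals[i] / similarity_sums[i], i) for i in weighted_totals]
    return sorted(ranked, reverse=True)


print(recommend(prefs, "alice"))
```

In this toy data, alice's only unrated item is `site_d`, and its predicted score leans toward bob's rating because bob's tastes are closer to hers than carol's are.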

I often try to decide what my interests are. The blogs that I am reading might answer this question if one builds groups of them. In fact, I have done this manually, but I found out that this categorization is not perfect. The book answers this question in Chapter 3.

After that the book deviates into a number of additional topics such as search, neural networks, and discrete optimization. The author, Toby Segaran, has a great ability to explain difficult concepts using simple words and pictures. As most of the material was familiar to me, I kept noticing how easy each concept seemed here compared with how much time I originally spent understanding it.

After that the main melody of the book is there again - the next chapter explains how to filter documents, for example to decide whether a particular news story is interesting to you or not. Then the book deviates again into decision trees, building price models, and even matching people on a dating site. However, there comes our melody again - this time it explains how to extract trends from a lot of news sources, that is, to decide what people are discussing today. This feature is similar to Google News, except that there the user has no control over the news sources.

I was surprised to find out that Python is such a popular language in the scientific community. The book describes lots of libraries for dealing with numerical data or displaying various charts. The book will serve as a great introduction to the Python language, even though there are lots of introductory books available. In fact, learning Python this way is easier and more enjoyable.

After reading the book I definitely want to try out the tricks explained there and improve my information soup. This book is my virtual cookbook.

(2 of 3 customers found this review helpful)

 
3.0

Good book, bad code.

By Anonymous

from Undisclosed

Pretty interesting book, definitely worth a read - at least for self-taught guys like me. Too bad the code examples are of such low quality. Prepare for a complete rewrite if you planned on using them (M. Segaran, we'd be happy to see you submitting your code for review on comp.lang.py !-)

Now don't get me wrong: it's still one of the very few CS-related books I didn't regret buying.

(1 of 2 customers found this review helpful)

 
3.0

Information is great, but too many errors

By Amit Lamba

from Undisclosed

I've just purchased the book and I don't doubt that it is full of amazing information on the topic of Collective Intelligence. But without even getting past the Preface, I've already discovered 2 errors in actual code examples! From there I decided to check the errata list here on O'Reilly, and although a lot of the reported errors are unconfirmed, the number seems alarmingly high. Errata are to be expected, but with a book that deals so heavily in mathematical formulae and code snippets, it's a pain to have to cross-check everything. A better job should have been done on proofreading this before it was pushed out to the masses. I have the August 2007 version of the book, so if there is a 03/2008 printing, I'd much rather get that version if the errata have been fixed in it. If someone can confirm this, I'd gladly revise my review of the book, if that's even possible. Without these errors, this would easily be a 5 star book.

 
5.0

Very informative, engaging read. Code a bit terse

By Jeff

from Undisclosed

Everything that everyone else has said about how well written this book is, how applicable the examples are, etc., is spot on. It's a very engaging read.

I have a single request. Could someone make a downloadable version of the code available with more descriptive variable names? (Terse is fine for the book itself.) I'm finding I'm having to rename variables as I go so that I can more easily grasp the math.

An example of one that I've renamed:

    from math import sqrt

    def sim_pearson(prefs,p1,p2):
        # Get the list of mutually rated items
        mutuallyRatedItems={}
        for item in prefs[p1]:
            if item in prefs[p2]: mutuallyRatedItems[item]=1

        # If there are no ratings in common, return 0
        if len(mutuallyRatedItems)==0: return 0

        # Number of mutually rated items
        numMutuallyRatedItems=len(mutuallyRatedItems)

        # Sums of all the preferences
        sum_Person1_MutualRatings=sum([prefs[p1][item] for item in mutuallyRatedItems])
        sum_Person2_MutualRatings=sum([prefs[p2][item] for item in mutuallyRatedItems])

        # Sums of the squares
        sumOfSquaresOfPerson1MutualRatings=sum([pow(prefs[p1][item],2) for item in mutuallyRatedItems])
        sumOfSquaresOfPerson2MutualRatings=sum([pow(prefs[p2][item],2) for item in mutuallyRatedItems])

        # Sum of the products
        sum_ProductOf_Ratings_OfBothUsers_MutualItems=sum([prefs[p1][item]*prefs[p2][item] for item in mutuallyRatedItems])

        # Calculate r (Pearson score)
        numerator=sum_ProductOf_Ratings_OfBothUsers_MutualItems-(sum_Person1_MutualRatings*sum_Person2_MutualRatings/numMutuallyRatedItems)
        denominator=sqrt((sumOfSquaresOfPerson1MutualRatings-pow(sum_Person1_MutualRatings,2)/numMutuallyRatedItems)*(sumOfSquaresOfPerson2MutualRatings-pow(sum_Person2_MutualRatings,2)/numMutuallyRatedItems))
        if denominator==0: return 0

        pearsonCorrelation=numerator/denominator
        return pearsonCorrelation

Great book. I'm really enjoying it.
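For readers who want to sanity-check the Pearson score discussed in the review above, the following is a small self-contained sketch with toy inputs. The function name `pearson`, the variable names, and the sample ratings are assumptions made for this illustration; the formula is the same sum-based form of the correlation coefficient used in the renamed listing.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over items present in both dicts (0.0 if undefined)."""
    shared = [item for item in ratings_a if item in ratings_b]
    n = len(shared)
    if n == 0:
        return 0.0

    sum_a = sum(ratings_a[i] for i in shared)
    sum_b = sum(ratings_b[i] for i in shared)
    sum_sq_a = sum(ratings_a[i] ** 2 for i in shared)
    sum_sq_b = sum(ratings_b[i] ** 2 for i in shared)
    sum_products = sum(ratings_a[i] * ratings_b[i] for i in shared)

    numerator = sum_products - sum_a * sum_b / n
    spread = (sum_sq_a - sum_a ** 2 / n) * (sum_sq_b - sum_b ** 2 / n)
    if spread <= 0:
        # One side has no variation, so the correlation is undefined.
        return 0.0
    return numerator / sqrt(spread)


# Hypothetical ratings chosen so the results are easy to verify by hand.
low_rater = {"x": 1.0, "y": 2.0, "z": 3.0}
high_rater = {"x": 2.0, "y": 4.0, "z": 6.0}   # same ordering, doubled scale
contrarian = {"x": 3.0, "y": 2.0, "z": 1.0}   # exactly reversed ordering

print(pearson(low_rater, high_rater))   # 1.0
print(pearson(low_rater, contrarian))   # -1.0
print(pearson(low_rater, {"w": 5.0}))   # 0.0 (no items in common)
```

The first pair shows why Pearson is useful for ratings: two people who rank items the same way score 1.0 even though one of them consistently gives higher numbers.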

 
5.0

Grant V

By Anonymous

from Undisclosed

This book is a must read for web developers wanting to go beyond CRUD. The author presents fairly complex topics in a brief yet easy to understand manner. I really appreciate how the most appropriate algorithms were chosen and then how practical examples of their use are given. I am neither a Python programmer nor a math genius and managed to get tons of value from this material.

(1 of 1 customers found this review helpful)

 
5.0

At Last! An accessible book on machine learning

By Tim Harvey

from Undisclosed

I have slogged my way through my share of major texts and academic papers on machine learning, data mining, NLP, and so on, and this book is a breath of fresh air. It is the first book I've seen that makes the topic accessible and has the reader immediately work with a great cross section of useful methods and applications.

The writing is clear, and rests on a good selection of problems and applications, instead of highly technical descriptions of algorithms. It appears squarely aimed at people who want to learn by working with real problems and data, and build up a good, workable toolkit, in the process.

Within my own profession, technical writing, structured documentation and technologies like DITA and document reuse represent the cutting edge in controlled documentation but the profession has no answer to democratic writing and publishing, represented by the web and wikis, that are overwhelming controlled writing and publishing. This book finally opens that door and provides the profession with the means to work with democratic material. I strongly urge members of my profession to study this book, and learn a different approach to how content can be detected, rated, and organized.

Tim Harvey

Text Wrestler, Google

 
4.0

A must read for website development

By Doug Wake

from Undisclosed

What an awesome book. Collective Intelligence shows you how to use mathematical search algorithms to analyze websites. In school I learned calcIII and calcV, which covered series and matrices respectively, to analyze numbers. In this book you begin by understanding how to search for a character or string of data with a very well laid out plan. You even get to use real-time data when developing your project in Python.

Then comes the improvement: using algorithms such as Euclidean distance, the Pearson coefficient, entropy, and conditional probability, you begin to put numbers to work by giving your websites a weight for how you search. Toby Segaran is excellent at explaining how to build a smart web application that keeps track of where you go. The book also teaches you to find out who has visited your site, as well as the probability that your site will be chosen next.

These search algorithms are also a good way to show the history of data for marketing and advertising purposes. Using AI in the background is always a great find but can be difficult to teach; this book is laid out very well and gets to the point so that a programmer can learn advanced math algorithms.

What a nice read,

Doug Wake

 
4.0

Great book for smart web apps

By W. Blanchard

from Undisclosed

I just read Collective Intelligence. I found it very useful in understanding the different methods used to collect data. The material progresses from using algorithms suited to a certain problem to creating your own algorithms with machine-learning techniques. The many different algorithms and Python script examples were helpful.

Formats: APK, DAISY, ePub, Mobi, PDF