
Mining the Social Web
Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites
Publisher: O'Reilly Media
Release Date: February 2011
Pages: 356
Want to tap the tremendous amount of valuable social data in Facebook, Twitter, LinkedIn, and Google+? This refreshed edition helps you discover who’s making connections with social media, what they’re talking about, and where they’re located. You’ll learn how to combine social web data, analysis techniques, and visualization to find what you’ve been looking for in the social haystack—as well as useful information you didn’t know existed.
Each standalone chapter introduces techniques for mining data in different areas of the social web, including blogs and email. All you need to get started is a programming background and a willingness to learn basic Python tools.
- Get a straightforward synopsis of the social web landscape
- Use adaptable scripts on GitHub to harvest data from social network APIs such as Twitter, Facebook, LinkedIn, and Google+
- Learn how to employ easy-to-use Python tools to slice and dice the data you collect
- Explore social connections in microformats with the XHTML Friends Network
- Apply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detection
- Build interactive visualizations with web technologies based upon HTML5 and JavaScript toolkits
"A rich, compact, useful, practical introduction to a galaxy of tools, techniques, and theories for exploring structured and unstructured data."
--Alex Martelli, Senior Staff Engineer, Google
Table of Contents
- Chapter 1 Introduction: Hacking on Twitter Data
  - Installing Python Development Tools
  - Collecting and Manipulating Twitter Data
  - Closing Remarks
- Chapter 2 Microformats: Semantic Markup and Common Sense Collide
  - XFN and Friends
  - Exploring Social Connections with XFN
  - Geocoordinates: A Common Thread for Just About Anything
  - Slicing and Dicing Recipes (for the Health of It)
  - Collecting Restaurant Reviews
  - Summary
- Chapter 3 Mailboxes: Oldies but Goodies
  - mbox: The Quick and Dirty on Unix Mailboxes
  - mbox + CouchDB = Relaxed Email Analysis
  - Threading Together Conversations
  - Visualizing Mail “Events” with SIMILE Timeline
  - Analyzing Your Own Mail Data
  - Closing Remarks
- Chapter 4 Twitter: Friends, Followers, and Setwise Operations
  - RESTful and OAuth-Cladded APIs
  - A Lean, Mean Data-Collecting Machine
  - Constructing Friendship Graphs
  - Summary
- Chapter 5 Twitter: The Tweet, the Whole Tweet, and Nothing but the Tweet
  - Pen : Sword :: Tweet : Machine Gun (?!?)
  - Analyzing Tweets (One Entity at a Time)
  - Juxtaposing Latent Social Networks (or #JustinBieber Versus #TeaParty)
  - Visualizing Tons of Tweets
  - Closing Remarks
- Chapter 6 LinkedIn: Clustering Your Professional Network for Fun (and Profit?)
  - Motivation for Clustering
  - Clustering Contacts by Job Title
  - Fetching Extended Profile Information
  - Geographically Clustering Your Network
  - Closing Remarks
- Chapter 7 Google+: TF-IDF, Cosine Similarity, and Collocations
  - Harvesting Google+ Data
  - Data Hacking with NLTK
  - Text Mining Fundamentals
  - Finding Similar Documents
  - Bigram Analysis
  - Tapping into Your Gmail
  - Before You Go Off and Try to Build a Search Engine…
  - Closing Remarks
- Chapter 8 Blogs et al.: Natural Language Processing (and Beyond)
  - NLP: A Pareto-Like Introduction
  - A Typical NLP Pipeline with NLTK
  - Sentence Detection in Blogs with NLTK
  - Summarizing Documents
  - Entity-Centric Analysis: A Deeper Understanding of the Data
  - Closing Remarks
- Chapter 9 Facebook: The All-in-One Wonder
  - Tapping into Your Social Network Data
  - Visualizing Facebook Data
  - Closing Remarks
- Chapter 10 The Semantic Web: A Cocktail Discussion
  - An Evolutionary Revolution?
  - Man Cannot Live on Facts Alone
  - Hope
- Colophon