Publisher: O'Reilly Media
Final Release Date: October 2003
Pages: 428
The Internet, with its profusion of information, has made us hungry for ever more, ever better data. Out of necessity, many of us have become pretty adept with search engine queries, but there are times when even the most powerful search engines aren't enough. If you've ever wanted your data in a different form than it's presented, or wanted to collect data from several sites and see it side-by-side without the constraints of a browser, then Spidering Hacks is for you.
Spidering Hacks takes you to the next level in Internet data retrieval--beyond search engines--by showing you how to create spiders and bots to retrieve information from your favorite sites and data sources. You'll no longer feel constrained by the way host sites think you want to see their data presented--you'll learn how to scrape and repurpose raw data so you can view it in a way that's meaningful to you.
Written for developers, researchers, technical assistants, librarians, and power users, Spidering Hacks provides expert tips on spidering and scraping methodologies. You'll begin with a crash course in spidering concepts, tools (Perl, LWP, out-of-the-box utilities), and ethics (how to know when you've gone too far: what's acceptable and unacceptable). Next, you'll collect media files and data from databases. Then you'll learn how to interpret and understand the data, repurpose it for use in other applications, and even build authorized interfaces to integrate the data into your own content. By the time you finish Spidering Hacks, you'll be able to:
- Aggregate and associate data from disparate locations, then store and manipulate the data as you like
- Gain a competitive edge in business by knowing when competitors' products are on sale, and comparing sales ranks and product placement on e-commerce sites
- Integrate third-party data into your own applications or web sites
- Make your own site easier to scrape and more usable to others
- Keep up-to-date with your favorite comic strips, news stories, stock tips, and more without visiting the site every day
Like the other books in O'Reilly's popular Hacks series, Spidering Hacks brings you 100 industrial-strength tips and tools from the experts to help you master this technology. If you're interested in data retrieval of any type, this book provides a wealth of data for finding a wealth of data.
- Title: Spidering Hacks
- By: Morbus Iff, Tara Calishain
- Publisher: O'Reilly Media
- Formats: Print, Ebook, Safari Books Online
- Print: October 2003
- Ebook: June 2009
- Pages: 428
- Print ISBN: 978-0-596-00577-1 (ISBN 10: 0-596-00577-6)
- Ebook ISBN: 978-0-596-10428-3 (ISBN 10: 0-596-10428-6)
Morbus Iff
Kevin Hemenway, coauthor of Mac OS X Hacks, is better known as Morbus Iff, the creator of disobey.com, which bills itself as "content for the discontented." Publisher and developer of more home cooking than you could ever imagine, he'd love to give you a Fry Pan of Intellect upside the head. Politely, of course. And with love.
Tara Calishain
Tara Calishain is the creator of the site ResearchBuzz. She is an expert on Internet search engines and how they can be used effectively in business situations.
Colophon
Our look is the result of reader comments, our own experimentation, and feedback from distribution channels. Distinctive covers complement our distinctive approach to technical topics, breathing personality and life into potentially dry subjects.
The tool on the cover of Spidering Hacks is a flex scraper. Flex scrapers are sometimes referred to as putty knives or push scrapers. These rugged tools are commonly used for light-duty construction or home projects, such as wallpapering, painting, or woodworking. Flex scrapers are usually three inches wide, with steel blades ground thinner than a typical putty knife to give maximum flexibility. Thus, they are the perfect choice for applying lighter compounds over broader areas and at a faster rate than putty knives. High-end flex scrapers have ergonomic handles designed to fit the hand and reduce fatigue. Just as a well-designed flex scraper gives improved blade control, so too does a well-designed spidering or scraping hack give greater control and flexibility when gathering information from the Web and automating and speeding complex tasks.
Genevieve d'Entremont was the production editor for Spidering Hacks. Brian Sawyer was the copyeditor. Matt Hutchinson proofread the book. Derek Di Matteo, Marlowe Shaeffer, and Claire Cloutier provided quality control. Julie Hawks wrote the index.
Emma Colby designed the cover of this book, based on a series design by Edie Freedman. The cover image is an original photograph by Emma Colby. Emma Colby produced the cover layout with QuarkXPress 4.1 using Adobe's Helvetica Neue and ITC Garamond fonts.
David Futato designed the interior layout. This book was converted from Microsoft Word to FrameMaker 5.5.6 by Andrew Savikas. The text font is Linotype Birka; the heading font is Adobe Helvetica Neue Condensed; and the code font is LucasFont's TheSans Mono Condensed. The illustrations that appear in the book were produced by Robert Romano and Jessamyn Read using Macromedia FreeHand 9 and Adobe Photoshop 6. This colophon was written by Derek Di Matteo.
Customer Reviews
4/17/2014 (0 of 1 customers found this review helpful)
4.0 Dated, But Still Relevant and Helpful
By LK the Web Mistress :-) from Central New Jersey
About Me: Developer, Sys Admin
- Accurate
- Helpful examples
- Well-written
9/14/2010 (2 of 2 customers found this review helpful)
3.0 Not as helpful as I would like
By garthm9 from Atlanta, GA
- Accurate
- Concise
- Easy to understand
- Helpful examples
By garyamort from Undisclosed
2/10/2004
4.0 Spidering Hacks Review
By Doug Smith from Undisclosed
1/15/2004
4.0 Spidering Hacks Review
By Bill Day from Undisclosed
11/24/2003
4.0 Spidering Hacks Review
By Mike Sipin from Undisclosed
11/19/2003
5.0 Spidering Hacks Review
By Marcus P. Zillman, M.S., A.M.H.A. from Undisclosed