Download data and create interactive applications involving any web page with this simple but powerful Python framework. This engaging guide shows you how to bypass the idiosyncrasies of web content through Scrapy, the most popular tool for creating mashups and other services from web content.
Most of the world's information is tied up in moderately structured, plain-text websites, and for years programmers have been writing ad-hoc scripts to download and parse HTML in order to extract nuggets of useful information.
With Scrapy, you can create APIs, real-time alerts, and mashups, deal with legacy components, aggregate hard-to-reach big data, and even perform data discovery. With examples you can run yourself, this book explains in simple terms how to find the content you want on a website and develop a Scrapy application to download it. The author goes even further, showing you simple ways to create full-fledged applications based on web scraping.
Dimitrios Kouzis-Loukas routinely beats impossible deadlines by using Python, CoffeeScript, and little-known secrets of the Web. After years of designing and validating microprocessors with ARM Ltd UK, he envisions a world where software is as reliable, robust, and well-tested as hardware.