Web Scraping with Python
Collecting Data from the Modern Web
Publisher: O'Reilly Media
Final Release Date: June 2015
Pages: 256

Learn web scraping and crawling techniques to access unlimited data from any web source in any format. With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands—or even millions—of web pages at once.

Ideal for programmers, security professionals, and web administrators familiar with Python, this book not only teaches basic web scraping mechanics, but also delves into more advanced topics, such as analyzing raw data or using scrapers for frontend website testing. Code samples are available to help you understand the concepts in practice.

  • Learn how to parse complicated HTML pages
  • Traverse multiple pages and sites
  • Get a general overview of APIs and how they work
  • Learn several methods for storing the data you scrape
  • Download, read, and extract data from documents
  • Use tools and techniques to clean badly formatted data
  • Read and write natural languages
  • Crawl through forms and logins
  • Understand how to scrape JavaScript
  • Learn image processing and text recognition
Customer Reviews

REVIEW SNAPSHOT®

by PowerReviews
O'Reilly Web Scraping with Python: 4.4 out of 5 (based on 13 reviews)

Ratings Distribution

  • 5 Stars (6)
  • 4 Stars (6)
  • 3 Stars (1)
  • 2 Stars (0)
  • 1 Star (0)

100% of respondents would recommend this to a friend.

Pros

  • Easy to understand (10)
  • Helpful examples (9)
  • Well-written (8)
  • Accurate (7)
  • Concise (7)

Cons

  • Too basic (3)

Best Uses

  • Intermediate (11)
  • Novice (6)
  • Student (6)

Reviewer Profile

  • Developer (7)
  • Sys admin (3)

Reviewed by 13 customers

Displaying reviews 1-10


 
4.0

Great intro for Python novices and programming amateurs

By Dev Marketer

from Oakland, CA

Verified Reviewer

Pros

  • Concise
  • Helpful examples

Cons

    Best Uses

    • Intermediate
    • Novice
    • Student

    Comments about oreilly Web Scraping with Python:

    I recently bought the Web Scraping Secrets Exposed ebook at www.outscrape.com and realized there was a lot I could do with web scraping. That author recommended this book, so I picked it up, and it's been a good combo. If you want some ideas of what to scrape and how to apply scraping, that ebook is a good place to start, along with this book.

     
    5.0

    great Book

    By ehzShelter

    from Dhaka

    About Me Developer, Maker, Sys Admin

    Verified Reviewer

    Pros

    • Accurate
    • Concise
    • Easy to understand

    Cons

      Best Uses

      • Intermediate

      Comments about oreilly Web Scraping with Python:

      useful

      (1 of 2 customers found this review helpful)

       
      4.0

      Useful for the uninitiated (like myself)

      By dvon79

      from Mexico City, MX

      Verified Buyer

      Comments about oreilly Web Scraping with Python:

      No comments other than the review headline: I've found it useful at the beginner's level.

       
      4.0

      A book suitable for junior/intermediate scrapers

      By Fantastic BOBOski

      from Melbourne, Australia

      About Me Developer

      Verified Reviewer

      Pros

      • Easy to understand
      • Helpful examples

      Cons

      • Too basic

      Best Uses

      • Intermediate
      • Student

      Comments about oreilly Web Scraping with Python:

      Pros:
      Easy to understand;
      Fairly comprehensive;
      Suitable for junior/intermediate readers

      Cons:
      Python code is NOT Pythonic, but is okay in general.

      (1 of 1 customers found this review helpful)

       
      4.0

      This is awesome

      By shashi

      from India

      About Me Educator

      Pros

      • Accurate

      Cons

      • Too basic

      Best Uses

      • Intermediate

      Comments about oreilly Web Scraping with Python:

      This is a definitive guide that everyone should buy.

       
      5.0

      Very helpful with my project

      By Jeff

      from Houston, TX

      About Me Designer, Developer

      Verified Buyer

      Pros

      • Easy to understand
      • Helpful examples
      • Well-written

      Cons

        Best Uses

        • Intermediate

        Comments about oreilly Web Scraping with Python:

        I am working on scraping some of the independent TV station listings to help with my MythTV scheduling. The book goes into good depth on the subject.

         
        4.0

        Good food for thought

        By Jon W

        from Southampton UK

        About Me Developer, Sys Admin

        Verified Buyer

        Pros

        • Accurate
        • Concise
        • Easy to understand
        • Helpful examples
        • Well-written

        Cons

          Best Uses

          • Intermediate
          • Student

          Comments about oreilly Web Scraping with Python:

          I am still working through this book but so far it has proved useful.

           
          5.0

          Great read if you want to scrape with Python

          By Ted R.

          from Denver, CO

          About Me Enthusiast

          Verified Buyer

          Pros

          • Accurate
          • Concise
          • Easy to understand
          • Helpful examples
          • Well-written

          Cons

            Best Uses

            • I'm A Novice

            Comments about oreilly Web Scraping with Python:

            The author does an excellent job of explaining how to scrape with Python. I'm new to the subject and this has helped me learn the basics. Bravo, Ryan!

            (7 of 7 customers found this review helpful)

             
            4.0

            A very fun and informative book

            By Bobby

            from California

            About Me Developer, Sys Admin

            Verified Buyer

            Pros

            • Accurate
            • Easy to understand
            • Helpful examples
            • Well-written

            Cons

            • Only touched on MySQL

            Best Uses

            • Intermediate
            • Novice
            • Real World Code Use

            Comments about oreilly Web Scraping with Python:

            This was a really fun book. It was a good decision for the author to specify Python 3 in the book, so it'll not only have a longer shelf life, but also show others that Python 3 is a really great language. I appreciated the author touching on subjects even when there were limitations. An example was the discussion of Scrapy, even though it hasn't yet been ported to Python 3 as of this writing.

            I went through this book over the course of a few weekends, and it was awesome being able to apply some of the tasks I picked up from this book to 'real life' projects very quickly.

            I would recommend this book to anyone who has an interest in web scraping, as well as anyone who has gotten past a Python course or book and wants to quickly apply Python to real use.

            I think the author was wise to keep the book small enough to consume for most folks, but still offer good example code and explanations of work.

            (8 of 8 customers found this review helpful)

             
            5.0

            Highly Recommended. I wish I had this two years ago.

            By Gerard

            from Seattle

            About Me Developer

            Verified Reviewer

            Pros

            • Concise
            • Easy to understand
            • Helpful examples
            • Well-written

            Cons

              Best Uses

              • Intermediate
              • Novice
              • Student

              Comments about oreilly Web Scraping with Python:

              I really liked this book, for the following reasons:
              1. It is a great introduction to web scraping. The reader is given confidence to use well-known Python packages such as BeautifulSoup and get useful results from scraping webpages in a very short time.
              2. Where to go after learning the basics? The author describes the tools, techniques, and frameworks to use for scraping dynamic websites, including code examples. This is the most challenging part of the book because it frequently involves combining tools, and the reader will have to get his/her hands dirty and learn by doing. This is reasonable since different websites present different challenges.
              3. I liked the author's writing style. She favors simple explanations, presents brief historical context on technologies when appropriate, explains potential pitfalls and makes clear recommendations among technical choices based on her experience.

              Highly recommended. I wish I had this book two years ago.

Buying Options

  • Ebook: $27.99 (formats: DAISY, ePub, Mobi, PDF)
  • Print & Ebook: $35.19
  • Print: $31.99