Book description
Learn how to extract information from websites using Beautiful Soup and the Python urllib2 module. This practical, hands-on guide covers everything you need to know to get a head start in website scraping.
In Detail
Beautiful Soup is a Python library designed for quick-turnaround projects such as screen scraping. It provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need, so an application built on Beautiful Soup takes very little code.
Getting Started with Beautiful Soup is a practical guide to using Beautiful Soup with Python. The book starts by walking you through installation and then through each feature of Beautiful Soup, using simple examples that include sample Python code, with diagrams and screenshots where they aid understanding. It also addresses the problem of getting data out of a website and works through a solution using a real website and sample code.
Getting Started with Beautiful Soup goes over the different methods of installing Beautiful Soup on both Linux and Windows systems. You will then learn about searching, navigating, content modification, encoding support, and output formatting, each illustrated with examples and sample Python code that you can try out yourself. This book is a practical guide to scraping information from any website. If you want to learn how to scrape pages from websites efficiently, then this book is for you.
What You Will Learn
- Learn how to scrape HTML pages from websites
- Implement a simple method to scrape any website with the help of developer tools, the Python urllib2 module, and Beautiful Soup
- Learn how to search for information within an HTML/XML page
- Modify the contents of an HTML tree
- Understand encoding support in Beautiful Soup
- Learn about the different types of output formatting
Table of contents
- Getting Started with Beautiful Soup
  - Table of Contents
  - Getting Started with Beautiful Soup
  - Credits
  - About the Author
  - About the Reviewers
  - www.PacktPub.com
  - Preface
  - 1. Installing Beautiful Soup
  - 2. Creating a BeautifulSoup Object
  - 3. Search Using Beautiful Soup
    - Searching in Beautiful Soup
    - Searching with find()
    - Searching with find_all()
    - Searching for Tags in relation
    - Using search methods to scrape information from a web page
    - Quick reference
    - Summary
  - 4. Navigation Using Beautiful Soup
  - 5. Modifying Content Using Beautiful Soup
  - 6. Encoding Support in Beautiful Soup
  - 7. Output in Beautiful Soup
  - 8. Creating a Web Scraper
  - Index
Product information
- Title: Getting Started with Beautiful Soup
- Author(s):
- Release date: January 2014
- Publisher(s): Packt Publishing
- ISBN: 9781783289554