Table of Contents

  1. Chapter 1 Introduction to Web Automation

    1. The Web as Data Source

    2. History of LWP

    3. Installing LWP

    4. Words of Caution

    5. LWP in Action

  2. Chapter 2 Web Basics

    1. URLs

    2. An HTTP Transaction

    3. LWP::Simple

    4. Fetching Documents Without LWP::Simple

    5. Example: AltaVista

    6. HTTP POST

    7. Example: Babelfish

  3. Chapter 3 The LWP Class Model

    1. The Basic Classes

    2. Programming with LWP Classes

    3. Inside the do_GET and do_POST Functions

    4. User Agents

    5. HTTP::Response Objects

    6. LWP Classes: Behind the Scenes

  4. Chapter 4 URLs

    1. Parsing URLs

    2. Relative URLs

    3. Converting Absolute URLs to Relative

    4. Converting Relative URLs to Absolute

  5. Chapter 5 Forms

    1. Elements of an HTML Form

    2. LWP and GET Requests

    3. Automating Form Analysis

    4. Idiosyncrasies of HTML Forms

    5. POST Example: License Plates

    6. POST Example: ABEBooks.com

    7. File Uploads

    8. Limits on Forms

  6. Chapter 6 Simple HTML Processing with Regular Expressions

    1. Automating Data Extraction

    2. Regular Expression Techniques

    3. Troubleshooting

    4. When Regular Expressions Aren't Enough

    5. Example: Extracting Links from a Bookmark File

    6. Example: Extracting Links from Arbitrary HTML

    7. Example: Extracting Temperatures from Weather Underground

  7. Chapter 7 HTML Processing with Tokens

    1. HTML as Tokens

    2. Basic HTML::TokeParser Use

    3. Individual Tokens

    4. Token Sequences

    5. More HTML::TokeParser Methods

    6. Using Extracted Text

  8. Chapter 8 Tokenizing Walkthrough

    1. The Problem

    2. Getting the Data

    3. Inspecting the HTML

    4. First Code

    5. Narrowing In

    6. Rewrite for Features

    7. Alternatives

  9. Chapter 9 HTML Processing with Trees

    1. Introduction to Trees

    2. HTML::TreeBuilder

    3. Processing

    4. Example: BBC News

    5. Example: Fresh Air

  10. Chapter 10 Modifying HTML with Trees

    1. Changing Attributes

    2. Deleting Images

    3. Detaching and Reattaching

    4. Attaching in Another Tree

    5. Creating New Elements

  11. Chapter 11 Cookies, Authentication, and Advanced Requests

    1. Cookies

    2. Adding Extra Request Header Lines

    3. Authentication

    4. An HTTP Authentication Example: The Unicode Mailing Archive

  12. Chapter 12 Spiders

    1. Types of Web-Querying Programs

    2. A User Agent for Robots

    3. Example: A Link-Checking Spider

    4. Ideas for Further Expansion

  1. Appendix A LWP Modules

  2. Appendix B HTTP Status Codes

    1. 100s: Informational

    2. 200s: Successful

    3. 300s: Redirection

    4. 400s: Client Errors

    5. 500s: Server Errors

  3. Appendix C Common MIME Types

  4. Appendix D Language Tags

  5. Appendix E Common Content Encodings

  6. Appendix F ASCII Table

  7. Appendix G User's View of Object-Oriented Modules

    1. A User's View of Object-Oriented Modules

    2. Modules and Their Functional Interfaces

    3. Modules with Object-Oriented Interfaces

    4. What Can You Do with Objects?

    5. What's in an Object?

    6. What Is an Object Value?

    7. So Why Do Some Modules Use Objects?

    8. The Gory Details

  8. Colophon