On the World Wide Web, speed and efficiency are vital. Users have little patience for slow web pages, while network administrators want to make the most of their available bandwidth. A properly designed web cache reduces network traffic and improves access times to popular web sites, a boon to network administrators and web users alike.
Web Caching hands you all the technical information you need to design, deploy, and operate an effective web caching service. It starts with the basics of how web caching works, from the HTTP headers that govern cachability to cache validation and replacement algorithms.
Topics covered in this book include:
Designing an effective cache solution
Configuring web browsers to use a cache
Setting up a collection of caches that can talk to each other
Configuring an interception cache or proxy
Monitoring and fine-tuning the performance of a cache
Configuring web servers to cooperate with web caches
Benchmarking cache products
The book also covers the important political aspects of web caching, including privacy, intellectual property, and security issues.
Internet service providers, large corporations, and educational institutions (in short, any organization whose network provides connectivity to a wide variety of users) can reap enormous benefit from running a well-tuned web caching service. Web Caching shows you how to do it right.
Chapter 1 Introduction
Web Architecture
Web Transport Protocols
Why Cache the Web?
Why Not Cache the Web?
Types of Web Caches
Caching Proxy Features
Meshes, Clusters, and Hierarchies
Products
Chapter 2 How Web Caching Works
HTTP Requests
Is It Cachable?
Hits, Misses, and Freshness
Hit Ratios
Validation
Forcing a Cache to Refresh
Cache Replacement
Chapter 3 Politics of Web Caching
Privacy
Request Blocking
Copyright
Offensive Content
Dynamic Web Pages
Content Integrity
Cache Busting and Server Busting
Advertising
Trust
Effects of Proxies
Chapter 4 Configuring Cache Clients
Proxy Addresses
Manual Proxy Configuration
Proxy Auto-Configuration Script
Web Proxy Auto-Discovery
Other Configuration Options
The Bottom Line
Chapter 5 Interception Proxying and Caching
Overview
The IP Layer: Routing
The TCP Layer: Ports and Delivery
The Application Layer: HTTP
Debugging Interception
Issues
To Intercept or Not To Intercept
Chapter 6 Configuring Servers to Work with Caches
Important HTTP Headers
Being Cache-Friendly
Being Cache-Unfriendly
Other Issues for Content Providers
Chapter 7 Cache Hierarchies
How Hierarchies Work
Why Join a Hierarchy?
Why Not Join a Hierarchy?
Optimizing Hierarchies
Chapter 8 Intercache Protocols
ICP
CARP
HTCP
Cache Digests
Which Protocol to Use
Chapter 9 Cache Clusters
The Hot Spare
Throughput and Load Sharing
Bandwidth
Chapter 10 Design Considerations for Caching Services
Appliance or Software Solution
Disk Space
Memory
Network Interfaces
Operating Systems
High Availability
Intercepting Traffic
Load Sharing
Location
Using a Hierarchy
Chapter 11 Monitoring the Health of Your Caches
What to Monitor?
Monitoring Tools
Chapter 12 Benchmarking Proxy Caches
Metrics
Performance Bottlenecks
Benchmarking Tools
Benchmarking Gotchas
How to Benchmark a Proxy Cache
Sample Benchmark Results
Appendix Analysis of Production Cache Trace Data
Reply and Object Sizes
Content Types
HTTP Headers
Protocols
Port Numbers
Popularity
Cachability
Service Times
Hit Ratios
Object Life Cycle
Request Methods
Reply Status Code
Appendix Internet Cache Protocol
ICPv2 Message Format
Opcodes
Option Flags
Experimental Features
Appendix Cache Array Routing Protocol
Membership Table
Routing Function
Examples
Appendix Hypertext Caching Protocol
Message Format and Magic Constants
HTCP Data Types
HTCP Opcodes
Appendix Cache Digests
The Cache Digest Implementation
Message Format
An Example
Appendix HTTP Status Codes
1xx Intermediate Status
2xx Successful Response
3xx Redirects
4xx Request Errors
5xx Server Errors
Appendix U.S.C. 17 Sec. 512. Limitations on Liability Relating to Material Online
Duane Wessels became interested in web caching in 1994 as a topic for his master's thesis in telecommunications at the University of Colorado, Boulder. He worked with members of the Harvest research project to develop web caching software. After the departure of other members to industry jobs, he continued the software development under the name Squid. Another significant part of Duane's research with the National Laboratory for Applied Network Research has been the operation of 6 to 8 large caches throughout the U.S. These caches receive requests from hundreds of other caches, all connected in a "global cache mesh."
Our look is the result of reader comments, our own experimentation, and feedback from distribution channels. Distinctive covers complement our distinctive approach to technical topics, breathing personality and life into potentially dry subjects.
The animal on the cover of Web Caching is a rock thrush. Rock thrushes belong to the order Passeriformes, the largest order of birds, containing 5,700 species, or over half of all living birds. Passerines, as birds of this order are called, are perching birds with four toes on each foot, three that point forward and one larger one that points backward. Rock thrushes belong to either the genus Monticola or the genus Petrocossyphus, such as Monticola solitarius, the blue rock thrush, and Petrocossyphus imerinus, the littoral rock thrush.
Leanne Soylemez was the production editor and copyeditor for Web Caching. Matt Hutchinson was the proofreader, and Jeff Holcomb provided quality control. Brenda Miller wrote the index.
Edie Freedman designed the cover of this book. The cover image is a 19th-century engraving from the Dover Pictorial Archive. Emma Colby produced the cover layout with QuarkXPress 4.1 using Adobe's ITC Garamond font.
Melanie Wang designed the interior layout based on a series design by Nancy Priest. The print version of this book was created by translating the DocBook XML markup of its source files into a set of gtroff macros using a filter developed at O'Reilly & Associates by Norman Walsh. Steve Talbott designed and wrote the underlying macro set on the basis of the GNU troff -gs macros; Lenny Muellner adapted them to XML and implemented the book design. The GNU groff text formatter version 1.11.1 was used to generate PostScript output. The text and heading fonts are ITC Garamond Light and Garamond Book; the code font is Constant Willison. The illustrations that appear in the book were produced by Robert Romano and Jessamyn Read using Macromedia FreeHand 9 and Adobe Photoshop 6. This colophon was written by Leanne Soylemez.