PDF Explained

Book description

At last, here’s an approachable introduction to the widely used Portable Document Format. PDFs are everywhere, both online and in printed form, but few people take advantage of the useful features or grasp the nuances of this format. This concise book provides a hands-on tour of the world’s leading page-description language for programmers, power users, and professionals in the search, electronic publishing, and printing industries. Illustrated with lots of examples, this book is the documentation you need to fully understand PDF.

  • Build a simple PDF file from scratch in a text editor
  • Learn the layout and content of a PDF file, as well as the syntax of its objects
  • Examine the logical structure of PDF objects, and learn how pages and their resources are arranged into a document
  • Create vector graphics and raster images in PDF, and deal with transparency, color spaces, and patterns
  • Explore PDF operators for building and showing text strings
  • Get up to speed on bookmarks, metadata, hyperlinks, annotations, and file attachments
  • Learn how encryption and document permissions work in PDF
  • Use the pdftk program to process PDF files from the command line

Table of contents

  1. A Note Regarding Supplemental Files
  2. Preface
    1. Who Should Read This Book
    2. Organization of Contents
    3. Content Updates
      1. May 22, 2012
    4. Acknowledgments
    5. Conventions Used in This Book
    6. Obtaining Code Examples
    7. Using Code Examples
    8. Safari® Books Online
    9. How to Contact Us
  3. 1. Introduction
    1. A Little History
      1. Page Description Languages
        1. Other page description languages
      2. Development of PDF
      3. Some Advantages of PDF
        1. Random access and linearization
        2. Stream creation and incremental update
        3. Embedded fonts
        4. Searchable text
      4. ISO Standardization
      5. Specialized Kinds of PDF
        1. PDF/A
        2. PDF/X
      6. Version Summary
    2. What’s in a PDF?
      1. Text and Fonts
      2. Vector Images
      3. Raster Images
      4. Color Spaces
      5. Metadata
      6. Navigation
      7. Optional Content
      8. Multimedia
      9. Interactive Forms
      10. Logical Structure and Reflow
      11. Security
      12. Compression
    3. Who Uses PDF?
      1. The Printing Industry
      2. Ebooks and Publishing
      3. PDF Forms
      4. Document Archiving
      5. As a File Format
    4. Useful Free Software
  4. 2. Building a Simple PDF
    1. Basic PDF Syntax
      1. Document Content
      2. Page Content
      3. File Structure
    2. Document Structure
    3. Building the Elements
      1. File Header
      2. Main Objects
      3. Graphical Content
      4. Catalog, Cross-Reference Table, and Trailer
    4. Putting it Together
    5. Remarks
  5. 3. File Structure
    1. File Layout
      1. Header
      2. Body
      3. Cross-Reference Table
      4. Trailer
    2. Lexical Conventions
    3. Objects
      1. Integers and Real Numbers
      2. Strings
        1. Hexadecimal strings
      3. Names
      4. Boolean Values
      5. Arrays
      6. Dictionaries
      7. Indirect References
    4. Streams and Filters
    5. Incremental Update
    6. Object and Cross-Reference Streams
    7. Linearized PDF
    8. How a PDF File is Read
    9. How a PDF File is Written
  6. 4. Document Structure
    1. Trailer Dictionary
    2. Document Information Dictionary
    3. Document Catalog
    4. Pages and Page Trees
    5. Text Strings
    6. Dates
    7. Putting it Together
  7. 5. Graphics
    1. Looking at Content Streams
    2. Operators and Graphics State
    3. Building and Painting Paths
      1. Bézier Curves
        1. Drawing circles with Bézier curves
      2. Filled Shapes and Winding Rules
    4. Colors and Color Spaces
    5. Transformations
    6. Clipping
    7. Transparency
    8. Shadings and Patterns
    9. Form XObjects
    10. Image XObjects
  8. 6. Text and Fonts
    1. Text and Fonts in PDF
    2. Text State
    3. Printing Text
      1. Text Sections
      2. Text Space and Text Positioning
      3. Showing Text
        1. Character and word spacing
        2. Text transforms
        3. Text rise
        4. Kerning and glyph adjustment
        5. Text rendering modes
    4. Defining and Embedding Fonts
      1. Font Types in PDF
      2. Type 1 Fonts
      3. Font Encodings
      4. Embedding a Font
    5. Extracting Text from a Document
    6. Resources
  9. 7. Document Metadata and Navigation
    1. Bookmarks and Destinations
      1. Destinations
      2. The Document Outline (Bookmarks)
        1. Building an example
    2. XML Metadata
    3. Annotations and Hyperlinks
    4. File Attachments
  10. 8. Encrypted Documents
    1. Introduction
    2. The Encryption Dictionary
    3. Reading Encrypted Documents
    4. Writing Encrypted Documents
    5. Editing Encrypted Documents
  11. 9. Working with Pdftk
    1. Command Line Syntax
    2. Merging Documents
      1. What Happens when Files are Merged
    3. Splitting Documents
      1. What Happens when Files are Split
    4. Stamps and Watermarks
      1. How a Stamp Is Added
    5. Extracting and Setting Metadata
    6. File Attachments
    7. Encryption and Decryption
      1. Decrypting Input Files
      2. Encrypting the Output
    8. Compression
  12. 10. PDF Software and Documentation
    1. PDF Viewers
      1. Adobe Reader
      2. Preview
      3. Xpdf
      4. GSview
    2. Software Libraries
      1. iText for Java and C#
      2. TCPDF for PHP
      3. Processing PDF with Perl
      4. PDF on Mac OS X
    3. Converting Formats
      1. PDF to PostScript and Back Again
      2. Rasterizing PDF to an Image
      3. Printing Files to PDF
    4. PDF Editors
      1. Adobe Acrobat
      2. Editing with Preview on Mac OS X
    5. PDF and Graphics Documentation
      1. ISO 32000 and the PDF File Format
      2. PDF Hacks
      3. Related Topics
      4. Forums and Discussion
      5. Adobe’s Website Resources
  13. Index
  14. About the Author
  15. Copyright

Product information

  • Title: PDF Explained
  • Author(s): John Whitington
  • Release date: December 2011
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781449310028