Introducing Regular Expressions

Book description

If you’re a programmer new to regular expressions, this easy-to-follow guide is a great place to start. You’ll learn the fundamentals step by step with the help of numerous examples, discovering firsthand how to match, extract, and transform text by matching specific words, characters, and patterns.

Regular expressions are an essential part of a programmer’s toolkit, available in various Unix utilities as well as programming languages such as Perl, Java, JavaScript, and C#. When you’ve finished this book, you’ll be familiar with the most commonly used syntax in regular expressions, and you’ll understand how using them will save you considerable time.

  • Discover what regular expressions are and how they work
  • Learn many of the differences between regular expressions used with command-line tools and in various programming languages
  • Apply simple methods for finding patterns in text, including digits, letters, Unicode characters, and string literals
  • Learn how to use zero-width assertions and lookarounds
  • Work with groups, backreferences, character classes, and quantifiers
  • Use regular expressions to mark up plain text with HTML5
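
As a rough illustration of several topics in the list above (character classes, quantifiers, groups, backreferences, lookarounds, and HTML markup), here is a minimal sketch in Python. It is not taken from the book: the choice of Python, the sample text, and the variable names are all assumptions made for illustration.

```python
import re

# Hypothetical sample text, invented for this sketch
text = "Call 707-555-1212 or 707-555-0199 today today."

# Character shorthand (\d), quantifiers ({3}, {4}), and three capturing groups
phone = re.compile(r"(\d{3})-(\d{3})-(\d{4})")
numbers = phone.findall(text)

# Backreference (\1): find a word that is immediately repeated
doubled = re.search(r"\b(\w+) \1\b", text)

# Positive lookahead: match "555" only when "-1212" follows it
lookahead = re.findall(r"555(?=-1212)", text)

# Transformation: wrap every matched phone number in an HTML tag
marked_up = phone.sub(r"<b>\g<0></b>", text)

print(numbers)
print(doubled.group(1))
print(lookahead)
print(marked_up)
```

Other regex flavors, such as those in sed or Perl, may spell some of these constructs differently, so treat the syntax here as specific to Python's re module.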

Table of contents

  1. Introducing Regular Expressions
  2. SPECIAL OFFER: Upgrade this ebook with O’Reilly
  3. Preface
    1. Who Should Read This Book
    2. What You Need to Use This Book
    3. Conventions Used in This Book
    4. Using Code Examples
    5. Safari® Books Online
    6. How to Contact Us
    7. Acknowledgments
  4. 1. What Is a Regular Expression?
    1. Getting Started with Regexpal
    2. Matching a North American Phone Number
    3. Matching Digits with a Character Class
    4. Using a Character Shorthand
    5. Matching Any Character
    6. Capturing Groups and Back References
    7. Using Quantifiers
    8. Quoting Literals
    9. A Sample of Applications
    10. What You Learned in Chapter 1
    11. Technical Notes
  5. 2. Simple Pattern Matching
    1. Matching String Literals
    2. Matching Digits
    3. Matching Non-Digits
    4. Matching Word and Non-Word Characters
    5. Matching Whitespace
    6. Matching Any Character, Once Again
    7. Marking Up the Text
      1. Using sed to Mark Up Text
      2. Using Perl to Mark Up Text
    8. What You Learned in Chapter 2
    9. Technical Notes
  6. 3. Boundaries
    1. The Beginning and End of a Line
    2. Word and Non-word Boundaries
    3. Other Anchors
    4. Quoting a Group of Characters as Literals
    5. Adding Tags
      1. Adding Tags with sed
      2. Adding Tags with Perl
    6. What You Learned in Chapter 3
    7. Technical Notes
  7. 4. Alternation, Groups, and Backreferences
    1. Alternation
    2. Subpatterns
    3. Capturing Groups and Backreferences
      1. Named Groups
    4. Non-Capturing Groups
      1. Atomic Groups
    5. What You Learned in Chapter 4
    6. Technical Notes
  8. 5. Character Classes
    1. Negated Character Classes
    2. Union and Difference
    3. POSIX Character Classes
    4. What You Learned in Chapter 5
    5. Technical Notes
  9. 6. Matching Unicode and Other Characters
    1. Matching a Unicode Character
      1. Using vim
    2. Matching Characters with Octal Numbers
    3. Matching Unicode Character Properties
    4. Matching Control Characters
    5. What You Learned in Chapter 6
    6. Technical Notes
  10. 7. Quantifiers
    1. Greedy, Lazy, and Possessive
    2. Matching with *, +, and ?
    3. Matching a Specific Number of Times
    4. Lazy Quantifiers
    5. Possessive Quantifiers
    6. What You Learned in Chapter 7
    7. Technical Notes
  11. 8. Lookarounds
    1. Positive Lookaheads
    2. Negative Lookaheads
    3. Positive Lookbehinds
    4. Negative Lookbehinds
    5. What You Learned in Chapter 8
    6. Technical Notes
  12. 9. Marking Up a Document with HTML
    1. Matching Tags
    2. Transforming Plain Text with sed
      1. Substitution with sed
      2. Handling Roman Numerals with sed
      3. Handling a Specific Paragraph with sed
      4. Handling the Lines of the Poem with sed
    3. Appending Tags
      1. Using a Command File with sed
    4. Transforming Plain Text with Perl
      1. Handling Roman Numerals with Perl
      2. Handling a Specific Paragraph with Perl
      3. Handling the Lines of the Poem with Perl
      4. Using a File of Commands with Perl
    5. What You Learned in Chapter 9
    6. Technical Notes
  13. 10. The End of the Beginning
    1. Learning More
    2. Notable Tools, Implementations, and Libraries
      1. Perl
      2. PCRE
      3. Ruby (Oniguruma)
      4. Python
      5. RE2
    3. Matching a North American Phone Number
    4. Matching an Email Address
    5. What You Learned in Chapter 10
  14. A. Regular Expression Reference
    1. Regular Expressions in QED
    2. Metacharacters
    3. Character Shorthands
    4. Whitespace
    5. Unicode Whitespace Characters
    6. Control Characters
    7. Character Properties
    8. Script Names for Character Properties
    9. POSIX Character Classes
    10. Options/Modifiers
    11. ASCII Code Chart with Regex
    12. Technical Notes
  15. Regular Expression Glossary
  16. Index
  17. About the Author
  18. Colophon
  19. SPECIAL OFFER: Upgrade this ebook with O’Reilly
  20. Copyright

Product information

  • Title: Introducing Regular Expressions
  • Author(s): Michael Fitzgerald
  • Release date: July 2012
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781449392680