Mastering Python Regular Expressions

Book description

For Python developers, this concise and down-to-earth guide to regular expressions is all you need to gain vital new knowledge. From a theoretical overview to Python specifics, it explains everything in crystal clear language.

In Detail

Regular expressions are used by many text editors, utilities, and programming languages to search and manipulate text based on patterns. They are often described as the Swiss Army knife of text processing: searching, replacing, extracting, and validating strings, along with other repetitive or complex tasks, can be reduced to a simple pattern.
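As a rough illustration of those four uses (search, extraction, replacement, and validation), here is a minimal sketch using Python's standard re module. The sample text, the date pattern, and the helper name looks_like_date are assumptions made up for this example; they are not taken from the book.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Order 66 shipped on 2014-02-21; order 67 ships on 2014-03-01."

# A deliberately loose pattern for ISO-style dates (it does not check ranges).
DATE = r"\d{4}-\d{2}-\d{2}"

# Search: find the first date anywhere in the text.
first_date = re.search(DATE, text).group()

# Extraction: collect every date in the text.
all_dates = re.findall(DATE, text)

# Replacement: mask the order numbers, keeping the original word "Order"/"order".
masked = re.sub(r"(?i)(order) \d+", r"\1 #", text)

# Validation: the whole string must look like a date, not just part of it.
def looks_like_date(value):
    return re.fullmatch(DATE, value) is not None

print(first_date)
print(all_dates)
print(masked)
print(looks_like_date("2014-02-21"), looks_like_date("2014-02-21x"))
```

Note that re.fullmatch requires Python 3.4 or later; on older versions an equivalent check can be written by anchoring the pattern explicitly.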

Mastering Python Regular Expressions will teach you about regular expressions, starting from the basics that apply regardless of the language being used, and then show you how to use them in Python. You will learn the finer details of what Python supports and how to use it, as well as the differences between Python 2.x and Python 3.x.

The book starts with a general review of the theory behind regular expressions, follows with an overview of Python's re module, and then moves on to advanced topics such as grouping, look around, and performance.

You will explore how to leverage regular expressions in Python, some of their advanced aspects, and how to measure and improve their performance. You will gain a better understanding of how alternation and quantifiers work and of the importance of grouping, before finally moving on to performance topics such as benchmarking, the RegexBuddy tool, and the backtracking behavior of the regex engine.
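To make the terms quantifier, alternation, and grouping concrete, the following is a small sketch with Python's re module. The input strings and patterns are assumptions chosen for illustration only and do not come from the book.

```python
import re

# Hypothetical input, invented for this illustration.
html = "<b>bold</b> and <i>italic</i>"

# Greedy quantifier: ".+" grabs as much as possible, then backtracks
# only as far as needed to find the final ">".
greedy = re.findall(r"<.+>", html)

# Reluctant (lazy) quantifier: ".+?" grabs as little as possible,
# so each tag is matched on its own.
reluctant = re.findall(r"<.+?>", html)

# Alternation: branches are tried left to right, and the first branch
# that succeeds wins, even if a later branch would match more text.
first_branch = re.search(r"cat|category", "category").group()

# Grouping: parentheses limit the scope of the alternation and
# capture which branch matched.
grouped = re.search(r"gr(a|e)y", "grey")

print(greedy)
print(reluctant)
print(first_branch)
print(grouped.group(), grouped.group(1))
```

The greedy version returns a single match that spans the whole string, while the reluctant version returns the four tags separately, which is usually what is intended when matching delimited text.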

Mastering Python Regular Expressions provides all the information essential for a better understanding of Regular Expressions in Python.

What You Will Learn

  • Explore the regular expressions syntax
  • Improve the readability and maintainability of your regular expressions
  • Find solutions for typical problems with regular expressions
  • Familiarize yourself with match and search operations
  • Leverage the look around technique to create powerful regular expressions
  • Gain insight into the uses of groups
  • Get to know how the regex engine works through the Backtracking process
  • Enhance the performance of your regular expressions
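
Several of the topics listed above can be sketched in a few lines of Python. The example below is only an illustration under assumed inputs (the strings and pattern names are invented here, not taken from the book); it shows the difference between match and search, a look ahead, and named groups written with re.VERBOSE for readability.

```python
import re

# Compile once and reuse: the same pattern object serves both calls below.
digits = re.compile(r"\d+")

# match() only succeeds if the pattern matches at the start of the string;
# search() scans forward until it finds a match anywhere.
at_start = digits.match("abc 123")
anywhere = digits.search("abc 123")

# Look ahead: match digits only when they are followed by " USD",
# without including " USD" in the result.
usd_amounts = re.findall(r"\d+(?= USD)", "10 USD, 20 EUR, 30 USD")

# Named groups plus re.VERBOSE: whitespace and comments inside the
# pattern are ignored, which keeps longer expressions readable.
release = re.compile(
    r"""
    (?P<year>\d{4})   # four-digit year
    -
    (?P<month>\d{2})  # two-digit month
    """,
    re.VERBOSE,
)
parts = release.search("Released 2014-02").groupdict()

print(at_start)
print(anywhere.group())
print(usd_amounts)
print(parts)
```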

Table of contents

  1. Mastering Python Regular Expressions
    1. Table of Contents
    2. Mastering Python Regular Expressions
    3. Credits
    4. About the Authors
    5. About the Reviewers
    6. www.PacktPub.com
      1. Support files, eBooks, discount offers and more
        1. Why Subscribe?
        2. Free Access for Packt account holders
    7. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    8. 1. Introducing Regular Expressions
      1. History, relevance, and purpose
      2. The regular expression syntax
        1. Literals
        2. Character classes
        3. Predefined character classes
        4. Alternation
        5. Quantifiers
          1. Greedy and reluctant quantifiers
        6. Boundary Matchers
      3. Summary
    9. 2. Regular Expressions with Python
      1. A brief introduction
      2. Backslash in string literals
        1. String Python 2.x
      3. Building blocks for Python regex
        1. RegexObject
          1. Searching
            1. match(string[, pos[, endpos]])
            2. search(string[, pos[, endpos]])
            3. findall(string[, pos[, endpos]])
            4. finditer(string[, pos[, endpos]])
          2. Modifying a string
            1. split(string, maxsplit=0)
            2. sub(repl, string, count=0)
            3. subn(repl, string, count=0)
        2. MatchObject
          1. group([group1, …])
          2. groups([default])
          3. groupdict([default])
          4. start([group])
          5. end([group])
          6. span([group])
          7. expand(template)
        3. Module operations
          1. escape()
          2. purge()
      4. Compilation flags
        1. re.IGNORECASE or re.I
        2. re.MULTILINE or re.M
        3. re.DOTALL or re.S
        4. re.LOCALE or re.L
        5. re.UNICODE or re.U
        6. re.VERBOSE or re.X
        7. re.DEBUG
      5. Python and regex special considerations
        1. Differences between Python and other flavors
        2. Unicode
        3. What's new in Python 3
      6. Summary
    10. 3. Grouping
      1. Introduction
      2. Backreferences
      3. Named groups
      4. Non-capturing groups
        1. Atomic groups
      5. Special cases with groups
        1. Flags per group
        2. yes-pattern|no-pattern
      6. Overlapping groups
      7. Summary
    11. 4. Look Around
      1. Look ahead
        1. Negative look ahead
      2. Look around and substitutions
      3. Look behind
        1. Negative look behind
      4. Look around and groups
      5. Summary
    12. 5. Performance of Regular Expressions
      1. Benchmarking regular expressions with Python
      2. The RegexBuddy tool
      3. Understanding the Python regex engine
        1. Backtracking
      4. Optimization recommendations
        1. Reuse compiled patterns
        2. Extract common parts in alternation
        3. Shortcut to alternation
        4. Use non-capturing groups when appropriate
        5. Be specific
        6. Don't be greedy
      5. Summary
    13. Index

Product information

  • Title: Mastering Python Regular Expressions
  • Author(s): Félix López, Víctor Romero
  • Release date: February 2014
  • Publisher(s): Packt Publishing
  • ISBN: 9781783283156