Programming Computer Vision with Python

Book description

If you want a basic understanding of computer vision’s underlying theory and algorithms, this hands-on introduction is the ideal place to start. You’ll learn techniques for object recognition, 3D reconstruction, stereo imaging, augmented reality, and other computer vision applications as you follow clear examples written in Python.

Programming Computer Vision with Python explains computer vision in broad terms that won’t bog you down in theory. You get complete code samples with explanations of how to reproduce and build upon each example, along with exercises to help you apply what you’ve learned. This book is ideal for students, researchers, and enthusiasts with basic programming and standard mathematical skills.

  • Learn techniques used in robot navigation, medical image analysis, and other computer vision applications
  • Work with image mappings and transforms, such as texture warping and panorama creation
  • Compute 3D reconstructions from several images of the same scene
  • Organize images based on similarity or content, using clustering methods
  • Build efficient image retrieval techniques to search for images based on visual content
  • Use algorithms to classify image content and recognize objects
  • Access the popular OpenCV library through a Python interface

Table of contents

  1. Programming Computer Vision with Python
  2. SPECIAL OFFER: Upgrade this ebook with O’Reilly
  3. Preface
    1. Prerequisites and Overview
      1. What You Need to Know
      2. What You Will Learn
      3. Chapter Overview
    2. Introduction to Computer Vision
    3. Python and NumPy
    4. Notation and Conventions
    5. Using Code Examples
    6. How to Contact Us
    7. Safari® Books Online
    8. Acknowledgments
  4. 1. Basic Image Handling and Processing
    1. 1.1 PIL—The Python Imaging Library
      1. Convert Images to Another Format
      2. Create Thumbnails
      3. Copy and Paste Regions
      4. Resize and Rotate
    2. 1.2 Matplotlib
      1. Plotting Images, Points, and Lines
      2. Image Contours and Histograms
      3. Interactive Annotation
    3. 1.3 NumPy
      1. Array Image Representation
      2. Graylevel Transforms
      3. Image Resizing
      4. Histogram Equalization
      5. Averaging Images
      6. PCA of Images
      7. Using the Pickle Module
    4. 1.4 SciPy
      1. Blurring Images
      2. Image Derivatives
      3. Morphology—Counting Objects
      4. Useful SciPy Modules
        1. Reading and writing .mat files
        2. Saving arrays as images
    5. 1.5 Advanced Example: Image De-Noising
    6. Exercises
    7. Conventions for the Code Examples
  5. 2. Local Image Descriptors
    1. 2.1 Harris Corner Detector
      1. Finding Corresponding Points Between Images
    2. 2.2 SIFT—Scale-Invariant Feature Transform
      1. Interest Points
      2. Descriptor
      3. Detecting Interest Points
      4. Matching Descriptors
    3. 2.3 Matching Geotagged Images
      1. Downloading Geotagged Images from Panoramio
      2. Matching Using Local Descriptors
      3. Visualizing Connected Images
    4. Exercises
  6. 3. Image to Image Mappings
    1. 3.1 Homographies
      1. The Direct Linear Transformation Algorithm
      2. Affine Transformations
    2. 3.2 Warping Images
      1. Image in Image
      2. Piecewise Affine Warping
      3. Registering Images
    3. 3.3 Creating Panoramas
      1. RANSAC
      2. Robust Homography Estimation
      3. Stitching the Images Together
    4. Exercises
  7. 4. Camera Models and Augmented Reality
    1. 4.1 The Pin-Hole Camera Model
      1. The Camera Matrix
      2. Projecting 3D Points
      3. Factoring the Camera Matrix
      4. Computing the Camera Center
    2. 4.2 Camera Calibration
      1. A Simple Calibration Method
    3. 4.3 Pose Estimation from Planes and Markers
    4. 4.4 Augmented Reality
      1. PyGame and PyOpenGL
      2. From Camera Matrix to OpenGL Format
      3. Placing Virtual Objects in the Image
      4. Tying It All Together
      5. Loading Models
    5. Exercises
  8. 5. Multiple View Geometry
    1. 5.1 Epipolar Geometry
      1. A Sample Data Set
      2. Plotting 3D Data with Matplotlib
      3. Computing F—The Eight Point Algorithm
      4. The Epipole and Epipolar Lines
    2. 5.2 Computing with Cameras and 3D Structure
      1. Triangulation
      2. Computing the Camera Matrix from 3D Points
      3. Computing the Camera Matrix from a Fundamental Matrix
        1. The uncalibrated case—projective reconstruction
        2. The calibrated case—metric reconstruction
    3. 5.3 Multiple View Reconstruction
      1. Robust Fundamental Matrix Estimation
      2. 3D Reconstruction Example
      3. Extensions and More Than Two Views
        1. More views
        2. Bundle adjustment
        3. Self-calibration
    4. 5.4 Stereo Images
      1. Computing Disparity Maps
    5. Exercises
  9. 6. Clustering Images
    1. 6.1 K-Means Clustering
      1. The SciPy Clustering Package
      2. Clustering Images
      3. Visualizing the Images on Principal Components
      4. Clustering Pixels
    2. 6.2 Hierarchical Clustering
      1. Clustering Images
    3. 6.3 Spectral Clustering
    4. Exercises
  10. 7. Searching Images
    1. 7.1 Content-Based Image Retrieval
      1. Inspiration from Text Mining—The Vector Space Model
    2. 7.2 Visual Words
      1. Creating a Vocabulary
    3. 7.3 Indexing Images
      1. Setting Up the Database
      2. Adding Images
    4. 7.4 Searching the Database for Images
      1. Using the Index to Get Candidates
      2. Querying with an Image
      3. Benchmarking and Plotting the Results
    5. 7.5 Ranking Results Using Geometry
    6. 7.6 Building Demos and Web Applications
      1. Creating Web Applications with CherryPy
      2. Image Search Demo
    7. Exercises
  11. 8. Classifying Image Content
    1. 8.1 K-Nearest Neighbors
      1. A Simple 2D Example
      2. Dense SIFT as Image Feature
      3. Classifying Images—Hand Gesture Recognition
    2. 8.2 Bayes Classifier
      1. Using PCA to Reduce Dimensions
    3. 8.3 Support Vector Machines
      1. Using LibSVM
      2. Hand Gesture Recognition Again
    4. 8.4 Optical Character Recognition
      1. Training a Classifier
      2. Selecting Features
      3. Multi-Class SVM
      4. Extracting Cells and Recognizing Characters
      5. Rectifying Images
    5. Exercises
  12. 9. Image Segmentation
    1. 9.1 Graph Cuts
      1. Graphs from Images
      2. Segmentation with User Input
    2. 9.2 Segmentation Using Clustering
    3. 9.3 Variational Methods
    4. Exercises
  13. 10. OpenCV
    1. 10.1 The OpenCV Python Interface
    2. 10.2 OpenCV Basics
      1. Reading and Writing Images
      2. Color Spaces
      3. Displaying Images and Results
    3. 10.3 Processing Video
      1. Video Input
      2. Reading Video to NumPy Arrays
    4. 10.4 Tracking
      1. Optical Flow
      2. The Lucas-Kanade Algorithm
        1. Using the tracker
        2. Using generators
    5. 10.5 More Examples
      1. Inpainting
      2. Segmentation with the Watershed Transform
      3. Line Detection with a Hough Transform
    6. Exercises
  14. A. Installing Packages
    1. A.1 NumPy and SciPy
      1. Windows
      2. Mac OS X
      3. Linux
    2. A.2 Matplotlib
    3. A.3 PIL
    4. A.4 LibSVM
    5. A.5 OpenCV
      1. Windows and Unix
      2. Mac OS X
      3. Linux
    6. A.6 VLFeat
    7. A.7 PyGame
    8. A.8 PyOpenGL
    9. A.9 Pydot
    10. A.10 Python-graph
    11. A.11 Simplejson
    12. A.12 PySQLite
    13. A.13 CherryPy
  15. B. Image Datasets
    1. B.1 Flickr
    2. B.2 Panoramio
    3. B.3 Oxford Visual Geometry Group
    4. B.4 University of Kentucky Recognition Benchmark Images
    5. B.5 Other
      1. Prague Texture Segmentation Datagenerator and Benchmark
      2. MSR Cambridge Grab Cut Dataset
      3. Caltech 101
      4. Static Hand Posture Database
      5. Middlebury Stereo Datasets
  16. C. Image Credits
    1. C.1 Images from Flickr
    2. C.2 Other Images
    3. C.3 Illustrations
  17. D. References
  18. E. About the Author
  19. Index
  20. About the Author
  21. Colophon
  22. SPECIAL OFFER: Upgrade this ebook with O’Reilly
  23. Copyright

Product information

  • Title: Programming Computer Vision with Python
  • Author(s): Jan Erik Solem
  • Release date: June 2012
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781449316549