CPSC 525: Course Outline and Reading List

Instructor: Jim Little
January-April 2013

Course home page: http://www.cs.ubc.ca/~little/525

Textbook: While most of the course is based on original research papers, we will also consult the following textbook by Richard Szeliski. It is available for free on-line, or can be purchased in printed form.

Computer Vision: Algorithms and Applications by Richard Szeliski


The following is a tentative list of topics and readings for the course. It will be changed and updated as the course proceeds.

Introduction

The first class will provide an overview of the computer vision field and its applications.
  • Read Chapter 1 of Szeliski's book for an introduction to computer vision and a brief history of the field.

Stereo vision

Topics: Epipolar geometry and rectification. Correlation and feature matching. Discussion of the first assignment. Belief propagation.
  • Pascal Fua, "A parallel stereo algorithm that produces dense depth maps and preserves image features," Machine Vision and Applications, 6 (1993), pp. 35-49. [PDF]
  • D. Scharstein and R. Szeliski. "A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms," International Journal of Computer Vision, 47 (2002), pp. 7-42. [Web site with data and source code] [PDF]

  • Pedro Felzenszwalb and Daniel Huttenlocher, "Efficient Belief Propagation for Early Vision," Conference on Computer Vision and Pattern Recognition (CVPR), 2004. [Web site with source code] [PDF]

  • Examples of commercial systems: Point Grey Research; Tyzx; CogniTens

Image matching and recognition with invariant local features

Interest points. Rotation, scale, and illumination invariance. Image region descriptors. RANSAC. The Hough transform.
  • Section 4.1, Feature Detection and Matching, from Szeliski's book

  • David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110. [PDF]

  • M. Calonder, V. Lepetit, C. Strecha, and P. Fua, "BRIEF: Binary Robust Independent Elementary Features," European Conference on Computer Vision (ECCV), 2010. [PDF]

  • Articles from Wikipedia: RANSAC; The Hough Transform
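
As a rough illustration of the RANSAC idea listed in this unit, the following minimal sketch fits a line to 2D points that include gross outliers. It is not taken from the course materials or the readings: the function name `ransac_line`, the threshold, the iteration count, and the use of vertical (rather than perpendicular) residuals are all assumptions made for the example.

```python
import random

def ransac_line(points, threshold=0.5, iterations=100, seed=0):
    """Fit y = m*x + c to 2D points with RANSAC; returns (m, c, inliers).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best = (0.0, 0.0, [])
    for _ in range(iterations):
        # 1. Draw a minimal sample: two points define a line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical line cannot be written as y = m*x + c
        m = (y2 - y1) / (x2 - x1)
        c = y1 - m * x1
        # 2. Count points whose vertical residual is within the threshold.
        inliers = [(x, y) for x, y in points
                   if abs(y - (m * x + c)) <= threshold]
        # 3. Keep the hypothesis with the largest consensus set.
        if len(inliers) > len(best[2]):
            best = (m, c, inliers)
    return best

# Ten points on y = 2x + 1 plus three gross outliers.
data = [(x, 2 * x + 1) for x in range(10)] + [(2, 30), (5, -20), (8, 50)]
slope, intercept, consensus = ransac_line(data)
print(slope, intercept, len(consensus))
```

In practice the winning model is usually refined by a least-squares fit to its consensus set; that step is omitted here to keep the sketch short.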

Image registration and 3D reconstruction

Non-linear least-squares with Gauss-Newton. Levenberg-Marquardt. Robust solutions. Solving for 3D structure and camera pose. Dense surface reconstruction.
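
For orientation, the Gauss-Newton iteration named above can be sketched on a toy curve-fitting problem. This is only an illustrative example, not course code: the model y = a * exp(b * x), the function name `gauss_newton_exp_fit`, the starting values, and the undamped update are assumptions. Levenberg-Marquardt would add a damping term to the normal equations, which this sketch leaves out.

```python
import math

def gauss_newton_exp_fit(xs, ys, a=1.0, b=0.0, iterations=20):
    """Fit y = a * exp(b * x) by undamped Gauss-Newton (illustrative only)."""
    for _ in range(iterations):
        # Accumulate the 2x2 normal equations (J^T J) delta = -J^T r.
        h00 = h01 = h11 = 0.0
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = a * e - y       # residual of the current model
            j0 = e              # d r / d a
            j1 = a * x * e      # d r / d b
            h00 += j0 * j0
            h01 += j0 * j1
            h11 += j1 * j1
            g0 += j0 * r
            g1 += j1 * r
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break  # normal equations are (near) singular; stop
        # Solve the 2x2 system in closed form and take the full step.
        a += -(h11 * g0 - h01 * g1) / det
        b += -(h00 * g1 - h01 * g0) / det
    return a, b

# Noise-free samples of y = 2 * exp(0.5 * x) on [0, 1].
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
print(gauss_newton_exp_fit(xs, ys))
```

Because the update is undamped, convergence depends on a reasonable starting point; a poor initial guess is exactly the case that motivates Levenberg-Marquardt and the robust methods covered in this unit.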

Matching and recognition in large datasets

Scaling recognition to large image collections. K-means clustering algorithm. K-d trees. Approximate nearest-neighbour matching in high-dimensional spaces. FLANN.
  • David Nister and Henrik Stewenius, "Scalable recognition with a vocabulary tree," Conference on Computer Vision and Pattern Recognition, 2006. [PDF]

  • Marius Muja and David G. Lowe, "Fast approximate nearest neighbors with automatic algorithm configuration," International Conference on Computer Vision Theory and Applications (VISAPP), 2009. [PDF] [Source code]

  • Articles from Wikipedia: K-means clustering; K-d trees
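
As a small illustration of the k-means clustering step that underlies visual vocabularies, the following sketch runs Lloyd's iteration on a handful of 2D points. It is a toy example under stated assumptions, not code from the readings: the function name `kmeans`, the caller-supplied initial centroids, and the fixed iteration cap are all choices made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm; returns (centroids, labels). Illustrative only."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda i: sq_dist(p, centroids[i]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                updated.append(tuple(sum(v) / len(members)
                                     for v in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster where it was
        if updated == centroids:
            break  # assignments can no longer change
        centroids = updated
    return centroids, labels

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centres, labels = kmeans(pts, [(0, 0), (10, 10)])
print(centres, labels)
```

Each assignment step here is a brute-force nearest-neighbour search; at the scale of real image collections that search is what k-d trees and approximate methods such as FLANN are meant to speed up.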

Learning to recognize object categories

Face detection. The AdaBoost algorithm. Learning generative and discriminative models. The bag-of-features approach versus learned geometry. Object segmentation from recognition.
  • Paul Viola and Michael Jones, "Rapid object detection using a boosted cascade of simple features," Conference on Computer Vision and Pattern Recognition, 2001, pp. 511-518. [PDF]
    For background on AdaBoost, read Freund and Schapire, "A short introduction to boosting," Journal of Japanese Society for Artificial Intelligence, 1999. [PDF]

  • Chapter 14, Recognition, from Szeliski's book

  • Li Fei-Fei, Rob Fergus, Antonio Torralba, "ICCV 2009 Short Course: Recognizing and Learning Object Categories." [Course page, including Matlab code]

  • Optional: Bastian Leibe, Edgar Seemann, and Bernt Schiele, "Pedestrian detection in crowded scenes," CVPR 2005, San Diego (June 2005). [PDF]

Motion tracking and interpretation

Measuring optical flow. Structure from motion. Kalman filter and estimation theory. Color histograms. Tracking with particle filters. Action recognition.
  • Andrew J. Davison, Ian Reid, Nicholas Molton and Olivier Stasse, "MonoSLAM: Real-Time Single Camera SLAM," IEEE PAMI (June 2007). [PDF] [Davison's web site]
  • P. Pérez, C. Hue, J. Vermaak and M. Gangnet, "Color-based probabilistic tracking," European Conference on Computer Vision, ECCV 2002, Copenhagen, Denmark (June 2002). [PDF]

  • Alexei A. Efros, Alexander C. Berg, Greg Mori and Jitendra Malik, "Recognizing Action at a Distance," International Conference on Computer Vision, Nice, France (2003). [PDF]

  • Demos showing analysis and synthesis of human motion, from Nick Troje's lab at Queen's University.

Neurophysiology of vision

Structure of the visual cortex. Higher-level neurophysiology of vision. "What" vs. "where" pathways in the brain. Models of recognition in the brain.
  • Simon A.J. Winder, "A brief survey of central mechanisms in primate visual perception," (2002). [PDF]

  • R. Quiroga, et al., "Invariant visual representation by single neurons in the human brain," Nature (2005). [PDF]

  • T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber and T. Poggio, "Object recognition with cortex-like mechanisms," IEEE PAMI (2007). [PDF]

Scene and texture perception

Recognition of scenes. Texture perception. The gist. Recognition of low-resolution images.
  • S. Lazebnik, C. Schmid, and J. Ponce, "Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories," IEEE Conference on Computer Vision and Pattern Recognition, New York (June 2006). [PDF]

  • A. Torralba, R. Fergus, W. T. Freeman, "80 million tiny images: a large dataset for non-parametric object and scene recognition," PAMI, 30, 11 (2008). [PDF]

Colour vision

Colour spaces. Colour constancy. The use of colour for recognition.
  • Section 2.3.2, Color, from Szeliski's book

  • Brian Funt, Kobus Barnard and Lindsay Martin, "Is colour constancy good enough?" European Conference on Computer Vision, (1998), pp. 445-459. [PDF]

Recognition using shape and contours

Chamfer matching. Shape features. Combining contour and patch features.
  • Jamie Shotton, Andrew Blake, and Roberto Cipolla, "Contour-Based Learning for Object Detection," ICCV (2005). [PDF]

  • V. Ferrari, L. Fevrier, F. Jurie, and C. Schmid, "Groups of Adjacent Contour Segments for Object Detection," INRIA Technical Report, Grenoble (September 2006). [PDF]

Project presentations

The final few classes will consist of project presentations.


This page is maintained by David Lowe.