CPSC 525: Course Outline and Reading List
Instructor: David Lowe
January-April 2012
Course home page: http://www.cs.ubc.ca/~lowe/525
Textbook: While most of the course is based on original research
papers, we will also consult the following textbook by Richard
Szeliski. It is available for free on-line, or can be purchased in
printed form.
Computer Vision: Algorithms and Applications by Richard Szeliski
The following is a tentative list of topics and readings for the
course. It will be changed and updated as the course proceeds.
Introduction
The first class will provide an overview of the computer vision field and
its applications.
Read Chapter 1 of Szeliski's book
for an introduction to computer vision and a brief history of the field.
Image matching and recognition with invariant local features
Interest points. Rotation, scale, and illumination invariance. Image region
descriptors. RANSAC. The Hough transform.
Section 4.1, Feature Detection and Matching, from
Szeliski's book
David G. Lowe,
"Distinctive image features from scale-invariant keypoints,"
International Journal of Computer Vision,
60, 2 (2004), pp. 91-110.
[PDF]
M. Calonder, V. Lepetit, C. Strecha, and P. Fua,
"BRIEF: Binary Robust Independent Elementary Features,"
European Conference on Computer Vision (ECCV), 2010.
[PDF]
Articles from Wikipedia:
RANSAC;
The Hough Transform
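As a warm-up for the RANSAC reading, the following is a minimal sketch of RANSAC applied to 2D line fitting. It is not taken from any of the readings above: the function names, the inlier threshold, and the iteration count are illustrative assumptions, and practical matching pipelines fit richer models (such as homographies) to feature correspondences.

```python
import random

def fit_line(p, q):
    """Line a*x + b*y + c = 0 through p and q, scaled so a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None  # degenerate sample: the two points coincide
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def ransac_line(points, n_iters=200, threshold=0.1, seed=0):
    """Return the line with the most inliers found in n_iters random trials.

    threshold is the maximum point-to-line distance for an inlier
    (an assumed value; it should reflect the expected noise level).
    """
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(n_iters):
        p, q = rng.sample(points, 2)  # minimal sample: 2 points define a line
        line = fit_line(p, q)
        if line is None:
            continue
        a, b, c = line
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = line, inliers
    return best_line, best_inliers

# Twenty points on y = 2x + 1 plus five gross outliers (made-up data).
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
points += [(3.0, 40.0), (7.0, -15.0), (12.0, 90.0), (15.0, -30.0), (18.0, 70.0)]
line, inliers = ransac_line(points)
```

A common refinement is to re-estimate the model by least squares over the final inlier set, since the winning two-point sample is only a rough fit when the inliers are noisy.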
Image registration and 3D reconstruction
Non-linear least-squares with Gauss-Newton.
Levenberg-Marquardt. Robust solutions. Solving for 3D structure and
camera pose. Dense surface reconstruction.
Section 6.1, Feature-based alignment, from
Szeliski's book
Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo tourism: Exploring
photo collections in 3D," ACM Transactions on Graphics (SIGGRAPH),
25(3), 2006, 835-846.
[PDF]
Optional:
Michael Goesele, Noah Snavely, Brian Curless, Hugues Hoppe, Steven M. Seitz,
"Multi-View Stereo for Community Photo Collections,"
ICCV (2007).
[PDF]
[Project web site]
Background reading:
Linear Least Squares;
Gauss-Newton Algorithm;
Levenberg-Marquardt
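To make the Gauss-Newton topic concrete, here is a small sketch that fits the two-parameter model y = a * exp(b * x) by repeatedly solving the linearized normal equations. The model, the starting guess, and the function name are assumptions chosen for illustration; camera pose and structure problems have many more parameters but follow the same pattern.

```python
import math

def gauss_newton_exp(xs, ys, a, b, n_iters=20):
    """Fit y = a * exp(b * x) by Gauss-Newton, starting from the guess (a, b)."""
    for _ in range(n_iters):
        # Accumulate J^T J (symmetric 2x2) and J^T r, where r = y - model
        # and J holds the partial derivatives of the model w.r.t. (a, b).
        h00 = h01 = h11 = 0.0
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e
            ja = e            # d(model)/da
            jb = a * x * e    # d(model)/db
            h00 += ja * ja
            h01 += ja * jb
            h11 += jb * jb
            g0 += ja * r
            g1 += jb * r
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break  # normal equations are (near) singular; stop updating
        # Solve the 2x2 system (J^T J) delta = J^T r in closed form.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
    return a, b

# Noise-free synthetic data generated with a = 2.0, b = 0.5 (made-up values).
xs = [0.25 * i for i in range(9)]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_fit, b_fit = gauss_newton_exp(xs, ys, a=1.5, b=0.4)
```

Levenberg-Marquardt modifies this update by adding a damping term to the diagonal of J^T J, which makes the step more conservative when the starting guess is far from the solution.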
Matching and recognition in large datasets
Scaling recognition to large image collections.
K-means clustering algorithm. K-d trees.
Approximate nearest-neighbour matching in high-dimensional spaces.
FLANN.
David Nister and Henrik Stewenius,
"Scalable recognition with a vocabulary tree,"
Conference on Computer Vision and Pattern Recognition, 2006.
[PDF]
Marius Muja and David G. Lowe,
"Fast approximate nearest neighbors with automatic algorithm configuration,"
International Conference on Computer
Vision Theory and Applications (VISAPP), 2009.
[PDF];
[Source code]
Articles from Wikipedia:
K-means clustering;
K-d trees
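For orientation before the vocabulary tree paper, the sketch below shows the basic k-means iteration (Lloyd's algorithm) on 2D points. The initial centers are passed in explicitly and the data are made up; both are assumptions for illustration, and real visual vocabularies cluster high-dimensional descriptors with far more centers.

```python
def nearest(p, centers):
    """Index of the center closest to 2D point p (squared Euclidean distance)."""
    dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
    return dists.index(min(dists))

def kmeans(points, centers, n_iters=20):
    """Lloyd's algorithm: alternate assignment and mean-update steps."""
    centers = list(centers)
    for _ in range(n_iters):
        # Assignment step: label each point with its nearest center.
        labels = [nearest(p, centers) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                cx = sum(m[0] for m in members) / len(members)
                cy = sum(m[1] for m in members) / len(members)
                new_centers.append((cx, cy))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    return centers, labels

# Two well-separated groups of three points each (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

The assignment step is a nearest-neighbour search over the centers, which is where k-d trees and approximate methods become relevant as the number of centers and dimensions grows.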
Learning to recognize object categories
Face detection. The AdaBoost algorithm.
Learning generative and discriminative models. The bag-of-features approach
versus learned geometry. Object segmentation from recognition.
Paul Viola and Michael Jones,
"Rapid object detection using a boosted cascade of simple features,"
Conference on Computer Vision and Pattern Recognition, 2001,
pp. 511-518.
[PDF]
For background on AdaBoost, read Freund and Schapire,
"A short introduction to boosting," Journal of Japanese Society for
Artificial Intelligence, 1999.
[PDF]
Chapter 14, Recognition, from
Szeliski's book
Li Fei-Fei, Rob Fergus, Antonio Torralba,
"ICCV 2009 Short Course: Recognizing and Learning Object Categories."
[Course page, including Matlab code]
Optional:
Bastian Leibe, Edgar Seemann, and Bernt Schiele,
"Pedestrian detection in crowded scenes,"
CVPR 2005, San Diego (June 2005).
[PDF]
Motion tracking and interpretation
Measuring optical flow. Structure from motion. Kalman filter and
estimation theory. Colour histograms. Tracking with particle filters.
Action recognition.
Andrew J. Davison, Ian Reid, Nicholas Molton and Olivier Stasse,
"MonoSLAM: Real-Time Single Camera SLAM,"
IEEE PAMI (June 2007).
[PDF]
[Davison's web site]
P. Pérez, C. Hue, J. Vermaak and M. Gangnet,
"Color-based probabilistic tracking,"
European Conference on Computer Vision, ECCV 2002,
Copenhagen, Denmark (June 2002).
[PDF]
Alexei A. Efros, Alexander C. Berg, Greg Mori and Jitendra Malik,
"Recognizing Action at a Distance,"
International Conference on Computer Vision, Nice, France (2003).
[PDF]
Demos
showing analysis and synthesis of human motion, from
Nick Troje's lab at Queen's University.
Neurophysiology of vision
Structure of the visual cortex. Higher-level neurophysiology of
vision. "What" vs. "where" pathways in the brain. Models of
recognition in the brain.
Simon A.J. Winder, "A brief survey of central mechanisms in primate
visual perception," (2002).
[PDF]
R. Quiroga, et al., "Invariant visual representation by single neurons in
the human brain," Nature (2005).
[PDF]
T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber and T. Poggio,
"Object recognition with cortex-like mechanisms,"
IEEE PAMI (2007).
[PDF]
Scene and texture perception
Recognition of scenes. Texture perception. The gist. Recognition of
low-resolution images.
S. Lazebnik, C. Schmid, and J. Ponce,
"Beyond Bags of Features: Spatial Pyramid Matching for Recognizing
Natural Scene Categories,"
IEEE Conference on Computer Vision and Pattern Recognition,
New York (June 2006).
[PDF]
A. Torralba, R. Fergus, W. T. Freeman,
"80 million tiny images: a large dataset for non-parametric object and
scene recognition,"
PAMI, 30, 11 (2008).
[PDF]
Colour vision
Colour spaces. Colour constancy. The use of colour for recognition.
Section 2.3.2, Color, from
Szeliski's book
Brian Funt, Kobus Barnard and Lindsay Martin, "Is colour constancy
good enough?" European Conference on Computer Vision,
(1998), pp. 445-459.
[PDF]
Recognition using shape and contours
Chamfer matching. Shape features. Combining contour and patch features.
Jamie Shotton, Andrew Blake, and Roberto Cipolla,
"Contour-Based Learning for Object Detection,"
ICCV (2005).
[PDF]
V. Ferrari, L. Fevrier, F. Jurie, and C. Schmid,
"Groups of Adjacent Contour Segments for Object Detection,"
INRIA Technical Report, Grenoble (September 2006).
[PDF]
Project presentations
The final few classes will consist of project presentations.