CVPR 2010 Tutorial on Learning multi-subspaces in computer vision
Description
Subspace methods are important for solving numerous problems in
machine learning, pattern recognition, and computer vision. While
subspace methods have been studied intensively (from PCA and beyond),
new results are emerging from the areas of sparse coding, matrix
factorization, and matrix completion. These results, in contrast with
more classical subspace learning approaches, deal with multiple
subspaces, of varying dimensions, automatically learned from the data.
This makes the model extremely non-linear and rich. Applications of
such methods range from computer vision (motion estimation, scene
analysis, etc) to data mining (collaborative filtering, gene
expression, etc).
The goal of this tutorial is to present the audience with a unifying
perspective of this problem, introducing the basic concepts and
connecting multi-subspace methods with sparse modeling, matrix
factorization, and dimensionality reduction. The presentation of the
theoretical foundations will be complemented with applications in
computer vision including motion segmentation, face recognition,
clustering, and registration.
Organizers
Yi Ma
Guillermo Sapiro
René Vidal
Keynote Speaker
Emmanuel Candés is a Professor of Mathematics, a Professor of
Statistics, and a member of the Institute of Computational and
Mathematical Engineering at Stanford University. He is also the Ronald
and Maxine Linde Professor of Applied and Computational Mathematics at
the California Institute of Technology (on leave). He received the
Ph.D. degree in statistics from Stanford University in 1998. His
research interests are in computational harmonic analysis, multiscale
analysis, mathematical optimization, statistical estimation and
detection with applications to the imaging sciences, signal
processing, scientific computing, and inverse problems.
Professor Candés received numerous awards, most notably the 2006
Alan T. Waterman Medal, which is the highest honor bestowed by the National
Science Foundation and which
recognizes the achievements of scientists who are no older than 35, or
not more than seven years beyond their doctorate. Other awards include
the 2008 Information Theory Society Paper Award, the 2005 James
H. Wilkinson Prize in Numerical Analysis and Scientific Computing
awarded by the Society of Industrial and Applied Mathematics (SIAM)
and the 2010 George Polya Prize awarded by SIAM. He has given over
40 plenary lectures at major international conferences.
Syllabus
-
Introduction (René Vidal - 5 minutes)
-
Robust Principal Component Analysis: Theory and Algorithms (Emanuel Candés - 45 minutes)
This part of this tutorial is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the L1 norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithmic approach for solving this optimization problem emphasizing the suitability of this approach for large scale problems.
-
Robust Principal Component Analysis: Applications (Yi Ma - 45
minutes)
This part of the tutorial will present several applications of Robust
PCA to various problems in image/video processing, computer vision,
and webdata analysis. In the area of video surveillance for example,
our methodology allows for the detection of objects in a cluttered
background, and in the area of face recognition, it offers a
principled way of removing shadows and specularities in images of
faces. Moreover, we will show how a special extension of the RPCA can
readily handle simultaneously transformation in domain and corruptions
in intensity, for various purposes such image alignment and
rectification.
-
Multi-Subspace Learning and Clustering via Sparse Representation (René Vidal - 45 minutes)
- Recovery of Block-Sparse Signals via Sparse Representation
- Clustering of Multiple Subspaces via Sparse Representation
- Applications to Motion and Video Segmentation
-
Collaborative Dictionary and Multi-Subspace Learning (Guillermo Sapiro - 45 minutes)
This part of the tutorial will show the need to work with learned
multiple subspaces and connect it with dictionary learning. After
presenting the underlying structure of the problem, we will primarily
concentrate on two examples illustrating both the need and the power
of multiple subspaces. The first one shows how to learn multiple PCAs
(or GMM) to obtain state-of-the-art image enhancement results at a
fraction of the computation cost of previous approaches. The second
one presents a hierarchical model where the multiple sub spaces are
exploited in a collaborative and hierarchical fashion, with
applications ranging from classification and clustering to source
separation.
- Supervised and Unsupervised Clustering via Subspaces and Learned Dictionaries
- Collaborative Hierarchical Sparse Models for Clasification and Reconstruction
- Applications to Object Recognition
- Discussion (15 minutes)
Prerequisites
The course is at the intermediate level which requires the audience to have some
basic knowledge in images, linear algebra, and optimization. Some knowledge
of introductory-level computer vision, image processing, and machine learning may increase
the appreciation of some of the theory and applications, but is not crucial.
The course will be sufficiently self-contained, making even beginning graduate students
appreciate the value of the framework.
Intended Audience
The intended audience are academicians, graduate students, and industrial researchers
who are interested in the state-of-the-art data modeling and machine learning techniques
for the modeling and segmenting data that are considered to be mixed, multi-modal, inhomogeneous,
heterogeneous, or hybrid. Audience with mathematical and theoretical inclination will
enjoy the course as much as the audience with practical tendency.
Tutorial History
This is the first time such a tutorial is delivered, since most of the topics of this tutorial
are being developed in the last year or so. Related tutorials include GPCA at
CVPR '07 and '08 (Ma and Vidal), sparse modeling at CVPR '09 (Ma) and sparse modeling at
ICCV '09 (Sapiro). Those tutorials, that had audience at the level of 50-100, include
some overlapping background material, but quite minimal. Participants from those tutorials will
receive over 80% of new material in this new tutorial.
The speakers have created semester long courses in the topics of this tutorial at their
respective institutions.
Funding
Limited funding is available from Microsoft Research Asia for students to partially cover their expenses for attending the course. Interested students should have their advisors send an e-mail to rvidal@jhu.edu by May 15th in order to apply for financial support.
Sponsors
The organizers acknowledge the financial support of Microsoft Research.