Principal Component Analysis (PCA) refers to the problem of fitting a linear subspace S of R^D of unknown dimension d < D to N sample points {x_j}, j = 1, ..., N, drawn from S. This problem shows up in a variety of applications in many fields, e.g., pattern recognition, data compression, regression, and image processing, and can be solved in a remarkably simple way from the singular value decomposition (SVD) of the (mean-subtracted) data matrix [x_1, x_2, ..., x_N] in R^(D x N).
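As a quick illustration, here is a minimal numpy sketch of PCA via the SVD of the mean-subtracted data matrix; the function name and the toy data are our own, not part of any reference implementation.

import numpy as np

# A minimal sketch of PCA via the SVD. X holds one sample per column (D x N).
def pca_basis(X, d):
    """Return an orthonormal basis for the best-fit d-dimensional subspace."""
    Xc = X - X.mean(axis=1, keepdims=True)        # mean-subtract the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :d]                               # top d left singular vectors

# Toy data: noisy samples near a line (d = 1) in R^3.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.outer([1.0, 2.0, -1.0], t) + 0.01 * rng.normal(size=(3, 200))
print(pca_basis(X, 1).ravel())  # approximately proportional to [1, 2, -1]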
Generalized Principal Component Analysis (GPCA) is an extension of PCA in which the data X = {x_j}, j = 1, ..., N, is assumed to be drawn from a collection of n >= 1 different linear subspaces S_i of R^D, i = 1, ..., n, of dimensions d_i = dim(S_i), 0 < d_i < D. An example with n = 2 and D = 2 is shown in the figure below.
Most existing iterative subspace clustering algorithms try to find a solution by alternating between two steps: given an estimated clustering of the points, fit the best subspace to each group; given the subspaces, reassign the data to the nearest subspace. Unfortunately, this iterative procedure often converges to a local minimum, so the initialization is critical for finding a good solution.
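For concreteness, a hedged sketch of this two-step alternation (in the spirit of "K-subspaces"; all names and defaults here are our own illustration, not a published implementation) might look as follows.

import numpy as np

def k_subspaces(X, n, d, iters=50, seed=0):
    """X: D x N data; n subspaces of dimension d. Returns labels, bases."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(n, size=X.shape[1])     # random initialization
    bases = []
    for _ in range(iters):
        # Step 1: given the clustering, fit a subspace to each group via PCA.
        bases = []
        for i in range(n):
            Xi = X[:, labels == i]
            if Xi.shape[1] < d:                   # guard against empty groups
                Xi = X[:, rng.integers(X.shape[1], size=d)]
            U, _, _ = np.linalg.svd(Xi, full_matrices=False)
            bases.append(U[:, :d])
        # Step 2: given the subspaces, reassign points by projection residual.
        resid = np.stack([np.linalg.norm(X - B @ (B.T @ X), axis=0)
                          for B in bases])
        new_labels = resid.argmin(axis=0)
        if np.array_equal(new_labels, labels):    # converged (to a local min!)
            break
        labels = new_labels
    return labels, bases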
GPCA is an algebraic technique that avoids this problem and bypasses the iteration by fitting a single global model (a polynomial) representing the union of all subspaces, i.e., a model that is satisfied by every point belonging to any one of the subspaces. This global model is then factored into individual models, yielding a basis for each of the subspaces.
In other words, GPCA is an algebraic subspace clustering technique based on polynomial fitting and differentiation. The key idea is to represent the number of subspaces as the degree of a polynomial and to encode the subspaces themselves as factors of that polynomial.
For the sake of simplicity, assume that the data is drawn from a collection of two hyperplanes, each of which can be represented by an equation of the form x^T b_i = 0. By multiplying the two equations for the two subspaces, we obtain

p(x) = (x^T b_1)(x^T b_2) = 0,

which holds exactly when x belongs to at least one of the two hyperplanes.
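A quick numeric check of this product construction, with made-up normals b1 and b2:

import numpy as np

b1 = np.array([1.0, -1.0, 0.0])           # normal of hyperplane 1
b2 = np.array([0.0, 1.0, 2.0])            # normal of hyperplane 2
p = lambda x: (x @ b1) * (x @ b2)         # the product polynomial

x_on_plane1 = np.array([2.0, 2.0, 5.0])   # satisfies x^T b1 = 0
x_on_plane2 = np.array([3.0, -2.0, 1.0])  # satisfies x^T b2 = 0
print(p(x_on_plane1), p(x_on_plane2))     # both are 0.0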
As with any polynomial, p(x) may be expanded linearly in terms of some coefficients with respect to a basis of monomials. Therefore, given enough data points from the subspaces, the coefficients of p(x) can be estimated linearly from the data by applying a non-linear embedding (the Veronese map) to each point. Building upon our previous example, assume that x lives in R^3, say x = [x, y, z]^T. Then p(x) can be expanded as

p(x) = c_1 x^2 + c_2 xy + c_3 xz + c_4 y^2 + c_5 yz + c_6 z^2 = 0,

where the monomials [x^2, xy, xz, y^2, yz, z^2]^T are the entries of the degree-2 Veronese map of x.
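As an illustration, here is a minimal numpy sketch of this estimation step for the R^3 example: embed the data with the degree-2 Veronese map and recover the coefficient vector from the (approximate) null space of the embedded data matrix. The sampling procedure, normals, and names are our own assumptions.

import numpy as np

def veronese2(X):
    """Degree-2 Veronese map of 3-D points. X is 3 x N; returns 6 x N."""
    x, y, z = X
    return np.stack([x*x, x*y, x*z, y*y, y*z, z*z])

rng = np.random.default_rng(1)
b1, b2 = np.array([1.0, -1.0, 0.0]), np.array([0.0, 1.0, 2.0])

def sample_plane(b, n):
    P = rng.normal(size=(3, n))
    b = b / np.linalg.norm(b)
    return P - np.outer(b, b @ P)             # project samples onto the plane

X = np.hstack([sample_plane(b1, 100), sample_plane(b2, 100)])

V = veronese2(X)                              # 6 x 200 embedded data matrix
_, _, Vt = np.linalg.svd(V.T, full_matrices=False)
c = Vt[-1]                                    # null-space coefficient vector
print(np.abs(c @ veronese2(X)).max())         # ~0: p(x) = 0 on all samples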
Now, the set of points x such that p(x) = 0 can be thought of as a surface (in our previous example, p(x) = 0 is the equation of a conic in the projective plane P^2). Moreover, the gradient of p(x), ∇p(x), must be orthogonal to this surface. Since the surface is a union of subspaces, the gradient at a point must be orthogonal to the subspace passing through that point. As a result, we may compute the normal vector to the subspace containing a given point as the gradient of p(x) at that point, as illustrated in the figure below.
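Continuing the previous snippet (reusing the estimated coefficients c and the data X from it), the gradient of p can be evaluated through the Jacobian of the Veronese monomials; this is a sketch using our own monomial ordering [x^2, xy, xz, y^2, yz, z^2].

import numpy as np

def grad_p(c, x):
    """Gradient of p(x) = c . veronese2(x) at a single point x in R^3."""
    x1, x2, x3 = x
    # Rows are the partial derivatives of the six monomials w.r.t. x1, x2, x3.
    J = np.array([[2*x1,  x2,  x3,    0,   0,    0],
                  [0,     x1,   0, 2*x2,  x3,    0],
                  [0,      0,  x1,    0,  x2, 2*x3]])
    return J @ c

n1 = grad_p(c, X[:, 0])                       # a point from hyperplane 1
print(n1 / np.linalg.norm(n1))                # proportional to b1 (up to sign)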
Since all points in the same cluster should have the same (or similar) normal vectors, we can cluster the data by clustering these normal vectors. In fact, the angles between the normal vectors provide us with a similarity measure that can be used together with spectral clustering.
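A minimal sketch of this step, assuming normals is a D x N array of unit normal vectors (e.g., normalized gradients of p, one column per point) and using scikit-learn's SpectralClustering with a precomputed affinity for brevity:

import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_by_normals(normals, n_clusters):
    S = np.abs(normals.T @ normals)       # |cos(angle)| as pairwise similarity
    sc = SpectralClustering(n_clusters=n_clusters, affinity='precomputed')
    return sc.fit_predict(S)              # one label per data point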
In the case of hyperplanes, a single polynomial is enough, but for subspaces of different dimensions it is necessary to fit several polynomials. The gradient of each of these polynomials at a point x may give a different normal vector, and together these normal vectors span the orthogonal complement of the subspace passing through x. The principal angles between these orthogonal complements may now be used to define a similarity measure, and clustering proceeds as before. More details can be found in the references.
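The principal angles themselves can be computed with the standard SVD-based construction (a sketch; A and B are assumed to have orthonormal columns spanning the two subspaces):

import numpy as np

def principal_angles(A, B):
    s = np.linalg.svd(A.T @ B, compute_uv=False)  # cosines of the angles
    return np.arccos(np.clip(s, 0.0, 1.0))        # angles in radians

A similarity between two points can then be defined, for instance, from the product of the cosines of the principal angles between their normal subspaces.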
GPCA is an algebraic technique that provides an exact solution in closed form when the data is noise-free. With a moderate amount of noise, GPCA still operates correctly, since the coefficients of the polynomials can be estimated in a least-squares sense. However, in the presence of large amounts of noise or outliers, the coefficients may be estimated incorrectly. For this reason, several variations of the main algorithm have been developed; some of these employ spectral clustering, voting schemes, multivariate trimming, and influence functions to deal with imperfect data. Please refer to the page from UIUC for more details and code samples on this topic.
An implementation of GPCA in Matlab (with applications to motion segmentation) is also available from our group. See the Code page.