Learning Theory II: Modeling and Segmentation of Multivariate Mixed Data (BME 580.692, CS 600.462)
Instructor: Rene Vidal

Phone: 410-516-7306

Class Hours: T-Th 4:30pm-6:00pm, Hodson 301

Office Hours: Mondays 5-6, 308B Clark Hall

Course Description
The aim of this two-semester course is to study the foundations of computational methods for the statistical and dynamical modeling of multivariate data. Learning Theory I emphasizes the use of probability theory to build models of data in the framework of regression, classification, and data reduction. Learning Theory II emphasizes methods from algebraic geometry, probability theory, and dynamical systems theory to build models of data in the framework of linear and polynomial algebra and of dynamical systems. Topics will include linear and nonlinear dimensionality reduction (PCA, LLE, Isomap), unsupervised learning (central clustering, subspace clustering, Generalized PCA), and estimation and identification of dynamical systems (Kalman filtering, subspace identification, hybrid system identification). We will apply these tools to model data from computer vision, biomedical imaging, neuroscience, and computational biology.
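As a flavor of the tools covered in the course, PCA reduces dimensionality by projecting centered data onto the directions of largest variance, which can be computed from the SVD. A minimal MATLAB sketch (the data and variable names are illustrative, not from any assignment):

```matlab
% Minimal PCA sketch: project zero-mean data onto its top d
% principal directions, obtained from the SVD of the centered data.
X  = randn(100, 5) * randn(5, 20);           % 100 samples of 20 variables, rank <= 5
Xc = X - repmat(mean(X, 1), size(X, 1), 1);  % subtract the sample mean from each row
[U, S, V] = svd(Xc, 'econ');                 % columns of V are the principal directions
d = 5;                                       % number of components to keep
Y = Xc * V(:, 1:d);                          % d-dimensional representation of the data
```

The singular values on the diagonal of S indicate how much variance each direction captures, which is the basis for the model-selection questions discussed in class.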
Announcements
The first class will meet on Thursday 09/07. We can discuss a change of meeting time during the first class.
Course Syllabus
  1. Introduction (Chapter 1)
    • 09/07 Course overview
  2. Linear and Nonlinear Dimensionality Reduction (Chapter 2)
    • 09/12 Principal Component Analysis (PCA)
    • 09/14 Model Selection and Robust PCA
    • 09/19 Nonlinear and Kernel PCA
    • 09/21 Locally Linear Embedding
  3. Iterative Methods for Unsupervised Learning (Chapter 3)
    • 09/26-28 Central Clustering: K-means, Expectation Maximization (EM)
    • 10/03-05 Subspace Clustering: K-subspaces, EM for Mixtures of PCAs
  4. Algebraic Methods for Unsupervised Learning (Chapter 4)
    • 10/10-12 Line, plane, and hyperplane clustering
    • 10/17-19 Subspace Clustering: Generalized Principal Component Analysis (GPCA)
  5. Applications in Computer Vision
    • 10/24-26 3-D Motion Segmentation (Chapter 8)
    • 10/31-11/02 Spatial and Temporal Video Segmentation (Chapter 9)
  6. 11/7 Midterm
  7. Estimation and Segmentation of Hybrid Dynamical Models (Chapters 10-11)
    • 11/9 Linear systems: input/output (ARX) and state space (ARMA) representations
    • 11/14-16 State estimation: observability, observer design, Kalman filter
    • 11/21-28 Identification: linear parameter identification, subspace identification, recursive identification
    • 11/30 Identification of hybrid systems
  8. 12/5-7: Presentation of projects
References
R. Vidal, Y. Ma, and S. Sastry. Generalized Principal Component Analysis.
Grading Policy

Homework (30%): Homework problems will include both analytical exercises and programming assignments in MATLAB.

Midterm (30%): There will be one midterm on November 7th.

Project (40%): There will be a final project in which each student will either apply techniques from the course to solve a real problem or solve an open research problem. Each student will submit a 1-page project description by October 19th (5%), a 3-page progress report by November 16th (5%), a 6-page final report by December 7th (15%), and give a 20-minute presentation on December 5th or 7th (15%).

Honor System
Homework, the midterm, and the project must be completed individually. The strength of the university depends on academic and personal integrity. In this course, you must be honest and truthful. Ethical violations include cheating on exams, plagiarism, reuse of assignments, improper use of the Internet and electronic devices, unauthorized collaboration, alteration of graded assignments, forgery and falsification, lying, facilitating academic dishonesty, and unfair competition. All of these will be severely penalized.
Handouts
Homeworks
Please submit your homework code at Submit HW

  1. Homework 1: Linear Algebra. Due Thursday September 21st, 2006, beginning of class.
  2. Homework 2: Principal Component Analysis (PCA). Due Thursday September 28th, 2006, beginning of class.
    • Dataset (images and MATLAB functions).
      Errata
      • The prototype of the function vector2image is [img]=vector2image(vectorimg,sz) instead of [vectorimg,sz]=vector2image(img).
      • The description of the function reconstruct has been improved for clarity.
      • In the second part (Experiments), question (f), "Set B" must be substituted with "Set A, Validation Set".
      Thank you for reporting these errors.
  3. Homework 3: Nonlinear Dimensionality Reduction (KPCA, LLE and Isomap). Due Thursday October 5th, 2006, beginning of class.
      Dataset (images and MATLAB functions).
      Clarifications
      • The code given for MATLAB function handles was tested in MATLAB 7.1. If your version is 6.5 or older, check the help, as you may have to use feval: y=feval(kernel,x,var).
      • When discussing classification rates, please give the percentage of correct classification in addition to any plots you may have.
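      To illustrate the feval note above, here is a minimal sketch; the Gaussian kernel and variable names are illustrative and not taken from the assignment:

```matlab
% A kernel passed around as a function handle.
% Anonymous functions require MATLAB 7.0 or newer; in MATLAB 6.5
% or older, put the kernel in its own kernel.m file and pass @kernel.
kernel = @(x, var) exp(-x.^2 / (2*var));  % illustrative Gaussian kernel

x = 2; var = 1;
y1 = kernel(x, var);         % direct call through the handle (MATLAB 7.x)
y2 = feval(kernel, x, var);  % equivalent feval call for older versions
```

      Both calls evaluate the same handle and return the same value; feval simply makes the evaluation explicit for older MATLAB releases.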
  4. Homework 4: Central and Subspace Clustering (K-means, EM and K-subspaces). Due Thursday October 12th, 2006, beginning of class.
  5. Homework 5: EM, MPPCA, and Polysegment. Due Thursday October 19th, 2006, beginning of class.
  6. Homework 6: Generalized PCA. Due Thursday October 26th, 2006, beginning of class.
  7. Homework 7: Applications of GPCA. Due Thursday November 2nd, 2006, beginning of class.
  8. Homework 8: Linear and Hybrid Systems. Due Thursday November 30th, 2006, beginning of class.
Midterms