ELEN E6886, Fall 2012
Sparse Representation and High-Dimensional Geometry

Tuesdays 7-9:30 PM
Mudd 253

Prof. John Wright
Email: johnwright@ee.columbia.edu
Office: 716 CEPSR
Office hours: Thursday 3-4 PM

The past few years have seen exciting developments in theory and algorithms for estimation in high-dimensional spaces. Beautiful theoretical results show that structured signals, such as sparse vectors and low-rank matrices, can be recovered from relatively small sets of linear observations. These results raise intriguing possibilities for addressing engineering problems in signal and image processing, and beyond.

The goal of this course is to provide students with the theoretical understanding, algorithmic tools, and implementation experience needed to apply these results to problems in their own area of interest, or even to begin doing novel work in this area.

Tentative syllabus

Student Project Presentations:

    Session I - Sampling, Communication and Optimization
   7 PM Monday Dec. 17, 707 CEPSR

Pablo Martinez-Nuevo, Sampling Sparse Bandlimited Signals at the Rate of Innovation 
Tugce Yazicigil, Compressed Sensing Spectrum Scanners
Tanbir Haque, A power efficient front-end employing a deterministic measurement matrix for compressed sensing spectrum scanners
Scott Newton, Effects of Quantization on Signal Recovery in Compressed Sensing Receivers
Alden Goldstein, Compressed Sensing Algorithms for Channel Estimation
Carlos Abad, L1 Edge and Trend Filtering with FISTA
Cun Mu, Solving max norm related optimization problems via ADMM
Wen-Hsiang Shaw, Coreset of Dictionary Learning

    Session II - Vision, Audio, and Biological data
   7 PM-? Tuesday Dec. 18, Mudd 253

Abdulkadir Elmas, Reconstruction of novel target genes of the regulatory proteins via sparse biclustering on microarray expression data
Cheng-Heng Yeh, Exploring Sparse Structure in Neural Connectivity of Fruit Fly Olfactory System
Dawen Liang, Nonparametric Bayesian Dictionary Learning for Machine Listening
Colin Raffel, Towards a perceptually-informed sparse coding of audio signals
Zhuo Chen, Exploration of the low rank and sparse structure in audio with the application of phoneme detection
Yaqing Mao, Texture Classification with Sparse Recovery
Mingyang Sun, Structured Sparsity and Occlusion
Jiawei Chen, Photometric Reconstruction with RPCA
Yan Wang, Propagating labels from ImageNet to 3D point clouds
Juan Liu, Low-Rank Estimation in 3D Urban Dataset

Readings and Lecture Notes:

        Lecture 1 - September 4 -- What is it all about? Motivating application examples
                                                     Underdetermined systems, sparsity, L0 minimization

        Introductory material:
            Donoho, Elad and Bruckstein - From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images, SIAM Review 2009
            Davenport, Duarte, Eldar and Kutyniok - Introduction to Compressed Sensing, 2011
            Wright, Ma, Sapiro, Mairal, Huang, Yan - Sparse Representation for Computer Vision and Pattern Recognition, Signal Processing Magazine 2010

        Hardness results for sparse recovery:
            Natarajan - Sparse Approximate Solutions to Linear Systems, SIAM Journal on Computing 1995
                                    (You can access this through the Columbia Library -- log in using your uni and password).
            Amaldi and Kann - On the Approximability of Minimizing Nonzero Variables or Unsatisfied Relations in Linear Systems
                                    Theoretical Computer Science, 1997

        For a review of convexity, see Chapters 1 and 2 of Boyd and Vandenberghe's book.

        The material on spark and uniqueness of sparse solutions comes from 
            Donoho and Elad - Optimally Sparse Representation in General (nonorthogonal) Dictionaries via L1 Minimization, PNAS 2003
            See also, Gorodnitsky and Rao - Sparse Signal Reconstruction from Limited Data using FOCUSS - A Reweighted Minimum Norm Algorithm, IEEE TSP 1997

        Lecture notes are available on "New Courseworks". Log in using your uni, go to the ELEN 6886 tab, and go to "Files and Resources".

        Lecture 2 - September 11

        The material on coherence and L1 recovery comes from 
            Donoho and Elad - Optimally Sparse Representation in General (nonorthogonal) Dictionaries via L1 Minimization, PNAS 2003
            Gribonval and Nielsen - Sparse Representations in Unions of Bases, IEEE IT 2003
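
As a small illustration of the coherence-based guarantee in these papers, the sketch below computes the mutual coherence mu of a dictionary and the bound (1 + 1/mu)/2: vectors with fewer nonzeros than this bound are the unique sparsest solution and are recovered by L1 minimization. The function names and the tiny 2-D dictionary (identity plus a normalized Hadamard basis) are illustrative choices, not taken from the lecture notes.

```python
import math

def mutual_coherence(atoms):
    """Largest absolute normalized inner product between distinct atoms (columns)."""
    norms = [math.sqrt(sum(a * a for a in atom)) for atom in atoms]
    mu = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            inner = sum(a * b for a, b in zip(atoms[i], atoms[j]))
            mu = max(mu, abs(inner) / (norms[i] * norms[j]))
    return mu

def coherence_sparsity_bound(mu):
    """Sparsity levels strictly below (1 + 1/mu) / 2 are covered by the guarantee."""
    return 0.5 * (1.0 + 1.0 / mu)

# Hypothetical example dictionary: two orthonormal bases of R^2, stored as columns.
s = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0], [0.0, 1.0], [s, s], [s, -s]]
mu = mutual_coherence(atoms)
bound = coherence_sparsity_bound(mu)
print("coherence:", mu, "sparsity bound:", bound)
```

For this dictionary the bound lies between 1 and 2, so only 1-sparse representations are covered, which illustrates how conservative coherence-based guarantees can be.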

        The proof we saw in class comes from
            Fuchs - On Sparse Representations in Arbitrary Redundant Bases, IEEE IT 2004

        Some discussion of compressive sensing in seismic imaging:
            Herrmann, Friedlander and Yilmaz - Fighting the Curse of Dimensionality: Compressive Sensing in Exploration Seismology, 2012

        Lecture notes are available on "New Courseworks". Log in using your uni, go to the ELEN 6886 tab, and go to "Files and Resources".

        Lecture 3 - September 18

                  For some general discussion on the noisy case, two very influential papers -- one from statistics and one from signal processing -- are
                        Tibshirani - Regression shrinkage and selection via the Lasso, JRSS B 1996
                        Chen, Saunders and Donoho - Atomic Decomposition by Basis Pursuit, SIAM Rev. 1998

                  The restricted isometry property is discussed in more detail in
                        Candes and Tao - Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE IT 2006
                        Candes and Tao - Decoding by Linear Programming, IEEE IT 2005

                  For somewhat simpler proofs of the results we saw in class, see
                        Candes - The Restricted Isometry Property and Its Implications for Compressed Sensing, 2008

                  Lecture notes are available on "New Courseworks". Log in using your uni, go to the ELEN 6886 tab, and go to "Files and Resources".

        Lecture 4 - September 25

                  The Johnson-Lindenstrauss lemma is from
                        Johnson and Lindenstrauss - Extensions of Lipschitz mappings into Hilbert Space, Contemporary Mathematics, 1984
                            I haven't been able to locate the original article online... the discussion in class is fleshed out in the lecture notes. 
                            Dasgupta and Gupta - An Elementary Proof of a Theorem of Johnson and Lindenstrauss, RSA 2003

                        The JL property is useful in many situations, for example in finding approximate nearest neighbors:
                            Ailon and Chazelle - Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform, STOC 2006.

                   The material on geometric interpretations of sparse recovery comes from
                            Donoho & Tanner - Counting Faces of Randomly Projected Polytopes when Projection Radically Lowers Dimension, 2008

                  The original papers on principal component analysis:
                        Pearson - On lines and planes of closest fit to systems of points in space, Philosophical Magazine, 1901
                        Hotelling - Analysis of a complex of statistical variables into principal components, Journal of Educational Psychology, 1933

                    A few applications of low-rank recovery in ...
                        Indexing articles:
                            Deerwester, Dumais, Furnas, Landauer, Harshman, Indexing by Latent Semantic Analysis, JASIS 1990
                        Photometric stereo:
                            Wu, Ganesh, Shi, Matsushita, Wang and Ma, Robust Photometric Stereo via Low-Rank Matrix Completion and Recovery, ACCV 2010
                        System identification:
                            See Fazel's thesis, Matrix Rank Minimization with Applications, 2002

        Lecture 5 - October 2

                    The nuclear norm heuristic:
                        Fazel, Hindi and Boyd - A Rank Minimization Heuristic with Application to Minimum-Order System Design, ACC 2001

                    Rank-RIP and recovery results:
                        Recht, Fazel and Parrilo - Guaranteed Minimum Rank Solutions to Linear Matrix Equations via Nuclear Norm Minimization, SIAM Review 2010

                    Correctness of the nuclear norm for matrix completion:
                        Candes and Recht - Exact Matrix Completion via Convex Optimization, FOCM 2009
                        Gross - Recovering Low-rank Matrices from Few Coefficients in Any Basis, IEEE IT 2010

                    Pauli matrices and the RIP:
                         Liu - Universal Low-Rank Recovery from Pauli Measurements, 2011

                    Low-rank recovery with gross errors:
                        Candes, Li, Ma, Wright - Robust Principal Component Analysis? JACM 2011
                        Chandrasekaran, Sanghavi, Parrilo and Willsky - Rank-Sparsity Incoherence for Matrix Decomposition, SIAM JO 2011

                    Gaussian graphical model selection:
                        Chandrasekaran, Parrilo and Willsky - Latent Variable Graphical Model Selection via Convex Optimization, 2010

            October 9 - NO CLASS. We will make up this lecture in late October (details forthcoming).

            Lecture 6 - October 16 ... Algorithms I

                    Fast homotopy methods for L1:
                        Efron, Hastie, Johnstone and Tibshirani  - Least Angle Regression  AOS 2003
                        Donoho and Tsaig - Fast Solution of L1 Minimization Problems When the Solution May be Sparse 2006

                    Interior point methods
                        Kim, Koh, Lustig, Boyd and Gorinevsky - An Interior Point Method for Large-Scale L1-Regularized Least Squares JSTSP 2007

                    Iterative soft thresholding, proximal gradient methods and an optimal first order method:
                        Beck and Teboulle - A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems - SJIS 2009
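
For readers who want to experiment before reading the paper, here is a minimal sketch of plain (unaccelerated) iterative soft thresholding for the Lasso objective 0.5*||Ax - y||^2 + lam*||x||_1. It is not the FISTA variant from Beck and Teboulle. All names, the problem sizes, and the choice of the squared Frobenius norm as a conservative step-size constant are assumptions made for this illustration; matrices are stored as plain lists of rows to avoid dependencies.

```python
import random

def soft_threshold(v, t):
    """Entrywise soft thresholding, the proximal operator of t * ||.||_1."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]

def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A^T r for a matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def lasso_objective(A, y, x, lam):
    """0.5 * ||A x - y||_2^2 + lam * ||x||_1."""
    r = [ai - yi for ai, yi in zip(matvec(A, x), y)]
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xi) for xi in x)

def ista(A, y, lam, n_iter=200):
    """Proximal gradient iterations with fixed step 1/L, starting from zero."""
    # Assumption: the squared Frobenius norm is used as L. It upper-bounds the
    # true Lipschitz constant ||A||_2^2, so the step is safe but conservative.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        grad = rmatvec(A, [ai - yi for ai, yi in zip(matvec(A, x), y)])
        x = soft_threshold([xi - gi / L for xi, gi in zip(x, grad)], lam / L)
    return x

# Hypothetical test problem: a 2-sparse vector observed through a Gaussian matrix.
random.seed(0)
m, n = 10, 30
A = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
x_true[3], x_true[17] = 1.5, -2.0
y = matvec(A, x_true)
x_hat = ista(A, y, lam=0.1)
print("objective at zero:", lasso_objective(A, y, [0.0] * n, 0.1))
print("objective at ISTA output:", lasso_objective(A, y, x_hat, 0.1))
```

With a step of 1/L the objective cannot increase from one iteration to the next, so the second printed value is no larger than the first. A tighter estimate of ||A||_2^2 or the accelerated scheme from the paper would converge considerably faster.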

                    A unifying look at different optimal gradient methods:
                        Tseng - On Accelerated Proximal Gradient Methods for Convex-Concave Optimization (2008)

                    Background on optimal methods and black box complexity:
                        Nemirovski - Efficient Methods in Convex Programming (lecture notes)
                        Nesterov - A Method of Solving a Convex Programming Problem with Convergence Rate O(1/k^2) - Soviet Math. Dokl 1983           

        Lecture 7 - October 23 ... Algorithms II  + starting structured sparsity

                    Recap on proximal methods (see October 16).

                    Augmented Lagrangian methods for L1 (aka Bregman iterative methods):
                        Yin, Osher, Goldfarb and Darbon - Bregman Iterative Algorithms for L1-Minimization with Application to Compressed Sensing - SJIS 2008

                    A survey on the Alternating Direction Method of Multipliers (ADMM):
                        Boyd, Parikh, Chu, Peleato and Eckstein - Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers - 2010

                    Greedy algorithms:
                        Mallat and Zhang - Matching Pursuits with Time-Frequency Dictionaries - TSP 1993
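
The greedy idea can be sketched in a few lines. The code below is a minimal version of plain matching pursuit (not the orthogonal variant, which re-solves a least squares problem at each step): at every iteration it picks the atom most correlated with the residual and subtracts its contribution. The function names and the toy dictionary are illustrative assumptions, and the atoms are assumed to have unit L2 norm.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(atoms, y, n_iter):
    """Greedy sparse approximation of y over unit-norm atoms (list of columns)."""
    coeffs = [0.0] * len(atoms)
    residual = list(y)
    for _ in range(n_iter):
        # Correlate every atom with the current residual.
        correlations = [dot(atom, residual) for atom in atoms]
        # Select the atom with the largest absolute correlation.
        k = max(range(len(atoms)), key=lambda j: abs(correlations[j]))
        # Accumulate its coefficient and remove its contribution.
        coeffs[k] += correlations[k]
        residual = [r - correlations[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual

# Hypothetical toy example: the standard basis of R^3 as the dictionary.
identity_atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs, residual = matching_pursuit(identity_atoms, [0.0, 2.0, -1.0], n_iter=2)
print(coeffs, residual)
```

With an orthonormal dictionary each iteration removes one coefficient exactly; with a redundant dictionary the same atom may be selected more than once, which is why coefficients are accumulated rather than overwritten.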

                    Non-uniform recovery by OMP:
                        Tropp and Gilbert - Signal Recovery from Random Measurements via Orthogonal Matching Pursuit - IT 2007

                    The Group Lasso:
                        Yuan and Lin - Model selection and estimation in regression with grouped variables, JRSS 2006

                    Incoherence and RIP results for group sparse signals:
                        Eldar and Mishali - Robust Recovery of Signals from a Structured Union of Subspaces, 2009

                    The multiple measurement vector problem:
                        Tropp - Algorithms for Simultaneous Sparse Approximation, 2005

                    Simple solutions when X is ideal-sparse, full rank:
                        Schmidt - Multiple Emitter Location and Signal Parameter Estimation, TAP 1986
                    Support recovery with multiple measurement vectors:
                        Obozinski, Wainwright, Jordan - Support Union Recovery in High-Dimensional  Multivariate Regression, AOS 2011

                    Group Lasso with overlapping groups:
                        Jenatton, Audibert and Bach - Structured Variable Selection with Sparsity Inducing Norms, JMLR 2011

                    From submodular set functions to structured sparsity:
                        Bach - Structured Sparsity-Inducing Norms Through Submodular Functions, 2011

        October 30 - NO CLASS. CU has canceled all classes and events due to Hurricane Sandy. Please stay safe and dry!

Administrative information:

Texts: There are no required texts. Students may find the following useful and enjoyable reading:
    Miki Elad: Sparse and Redundant Representations: From Theory to Applications in Image Processing  
                      Elad's book is available digitally for Columbia Students:  http://clio.cul.columbia.edu:7018/vwebv/holdingsInfo?bibId=8579104.
                        (Thanks, Chung-Heng for the link.)
    Jiri Matousek: Lectures on Discrete Geometry (especially the last 4 chapters)   
    Stephen Boyd and Lieven Vandenberghe: Convex Optimization

Grades: Grades will be given based on course participation and a final project. The project should explore in more depth some area not covered in class, and ideally should involve some novel work. The topic could be theory, application, or a mix. The project is open-ended: be creative and show you are thinking about the material! If you have any questions about the suitability of a potential project topic, please contact the instructor.  

We will set aside class time in October for project proposals. At the end of the semester, all students will be required to give a 20-minute presentation and submit a final project report.

Additional resources:
        The Rice compressed sensing repository
        Nuit blanche (a blog on all things compressed sensing)
        Matrix recovery and face recognition by convex optimization