John Paisley


  Office: Mudd 422
  Phone: (212) 854-8024

  Mail Address:
  Columbia University
  500 W. 120th St., Suite 1300
  New York, NY 10027

Variational methods for approximate posterior inference
Integral-free (a.k.a. black-box) variational inference
I developed a stochastic gradient approach to variational inference that allows one to optimize the objective function without having to calculate integrals. Later called "black-box variational inference," this method forms unbiased approximations to the true gradient. To reduce variance I proposed using control variates, which are particularly well-suited to the variational inference problem. This can lead to easy, automatic variational inference in a wide variety of non-conjugate (or conjugate) models.
J. Paisley, D. Blei and M.I. Jordan. Variational Bayesian inference with stochastic search, International Conference on Machine Learning (ICML), 2012. [PDF]
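As a toy illustration of the idea (my own sketch, not code from the paper), the following maximizes a one-dimensional ELBO with a score-function gradient estimator plus a control variate; the model, step size, and sample counts are all illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_joint(x):
    # Toy unnormalized log posterior: a Gaussian centered at 3.
    return -0.5 * (x - 3.0) ** 2

def elbo_grad(mu, n_samples=200):
    # q(x) = N(mu, 1); with the variance fixed, the entropy term is constant,
    # so the ELBO gradient w.r.t. mu is the gradient of E_q[log_joint].
    x = rng.normal(mu, 1.0, size=n_samples)
    score = x - mu                    # d/dmu log q(x)
    f = log_joint(x)                  # evaluated by sampling, not integration
    # Control variate: the score has expectation zero, so subtracting
    # a * score keeps the estimator unbiased while reducing its variance.
    a = np.cov(f * score, score)[0, 1] / (np.var(score) + 1e-12)
    return np.mean((f - a) * score)

mu = 0.0
for _ in range(300):
    mu += 0.1 * elbo_grad(mu)         # stochastic gradient ascent

print(round(mu, 2))                   # settles near the posterior mean, 3.0
```

No integral is ever evaluated: the expectation is replaced by Monte Carlo samples from q, which is what makes the approach "black-box."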
Scalable inference for Big Data
Another focus of my research is on scalable model inference. These methods build on a technique called "stochastic variational inference" (SVI), a general framework for scalable Bayesian modeling in which the algorithm converges quickly to an approximate posterior distribution by processing small, random subsets of the data at each iteration rather than the full data set.

Massive data sets make it possible to learn richer structure via more complex models. SVI allows for this increase in data and model size without a comparable increase in computational burden. With my collaborators, I have exploited this fact to learn richer structure from data using the mixed-membership modeling framework.

I have studied scalable inference for a variety of model structures, for example tree-structured models, graph-based models, dictionary learning models and matrix factorization. I have applied these techniques to topic modeling, image processing, automatic tagging and time-evolving data.
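To make the SVI update concrete, here is a minimal sketch on a toy conjugate model, a Gaussian mean with known variance; the model and all constants are my own illustrative choices, not taken from the papers:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy conjugate model: x_i ~ N(theta, 1) with prior theta ~ N(0, 1).
# The exact posterior of theta is N(sum(x) / (N + 1), 1 / (N + 1)).
N = 100_000
data = rng.normal(2.0, 1.0, size=N)

# Natural parameters of q(theta) = N(mu, var): lam = [mu/var, -1/(2 var)].
lam = np.array([0.0, -0.5])          # initialize at the prior
batch_size = 100

for t in range(1, 2001):
    batch = rng.choice(data, size=batch_size)   # uniform subsample
    # "Intermediate" parameters: treat the minibatch as if it were the full
    # data set by scaling its sufficient statistics by N / batch_size.
    lam_hat = np.array([(N / batch_size) * batch.sum(),
                        -0.5 - N / 2.0])
    rho = (t + 10.0) ** -0.6          # Robbins-Monro step-size schedule
    lam = (1 - rho) * lam + rho * lam_hat   # noisy natural-gradient step

post_mean = -lam[0] / (2.0 * lam[1])        # recover the mean of q(theta)
print(round(post_mean, 2))                  # close to the exact posterior mean
```

Each iteration touches only 100 of the 100,000 points, yet the decreasing step sizes drive the global parameters to the same fixed point that batch coordinate ascent would reach.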
Representative publications

M. Hoffman, D. Blei, C. Wang and J. Paisley. Stochastic variational inference, Journal of Machine Learning Research, vol. 14, pp. 1303-1347, 2013. [PDF]

A. Zhang, S. Gultekin and J. Paisley. Stochastic variational inference for the HDP-HMM, International Conference on Artificial Intelligence and Statistics (AISTATS), 2016. [PDF]

S. Sertoglu and J. Paisley. Scalable Bayesian nonparametric dictionary learning, European Signal Processing Conference (EUSIPCO), 2015. [PDF] (invited session paper)

S. Gultekin and J. Paisley. A collaborative Kalman filter for time-evolving dyadic processes, IEEE International Conference on Data Mining (ICDM), 2014. [PDF]

D. Liang, J. Paisley and D. Ellis. Codebook-based scalable music tagging with Poisson matrix factorization, International Society for Music Information Retrieval Conference, 2014. [PDF]

[Figures: batch inference; stochastic inference]

Bayesian models for text and images

A major focus of my research is on developing Bayesian models for various problems involving text and images. For example, I developed beta process factor analysis (BPFA), which can be thought of as a Bayesian nonparametric version of the popular K-SVD dictionary learning algorithm. BPFA has achieved state-of-the-art performance on image processing problems such as denoising and compressed sensing for MRI (toy example in the figure below).

Topic models are another area of focus. My recent work includes developing structured models for large scale text data based on Dirichlet processes and Markov processes. I have also recently worked on applying Gaussian processes to manifold learning for both unsupervised and supervised problems.
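A generative sketch of the finite (truncated) beta-Bernoulli approximation that BPFA builds on may help fix ideas; the hyperparameters and dimensions below are illustrative choices of mine:

```python
import numpy as np

rng = np.random.default_rng(2)

# Truncated sketch of beta process factor analysis (illustrative settings).
P, K, n = 64, 100, 500        # data dim, truncation level, number of signals
a, b = 1.0, 1.0               # beta process mass parameters

pi = rng.beta(a / K, b * (K - 1) / K, size=K)        # atom usage probabilities
D = rng.normal(0.0, 1.0 / np.sqrt(P), size=(P, K))   # dictionary atoms
Z = rng.random((n, K)) < pi                          # binary atom assignments
W = rng.normal(0.0, 1.0, size=(n, K))                # real-valued weights
X = (Z * W) @ D.T + rng.normal(0.0, 0.1, size=(n, P))  # noisy observations

# The beta prior concentrates most pi_k near zero, so each signal activates
# only a small subset of the K available dictionary atoms.
print(round(Z.sum(axis=1).mean(), 2))   # average active atoms per signal
```

The binary matrix Z is what distinguishes this from ordinary factor analysis: sparsity in atom usage is induced by the prior rather than by a tuned penalty, and the effective dictionary size is inferred from the data.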


[Figures: total variation (best), BPFA (reconstructed part), BPFA (denoised part)]

Representative publications

J. Paisley and L. Carin. Nonparametric factor analysis with beta process priors, International Conference on Machine Learning (ICML), 2009. [PDF]

A. Zhang and J. Paisley. Markov mixed membership models, International Conference on Machine Learning (ICML), 2015. [PDF]

D. Liang and J. Paisley. Landmarking manifolds with Gaussian processes, International Conference on Machine Learning (ICML), 2015. [PDF]

J. Paisley, C. Wang, D. Blei and M. Jordan. Nested hierarchical Dirichlet processes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 2, pp. 256-270, 2015. [PDF]

Y. Huang, J. Paisley, Q. Lin, X. Ding, X. Fu and X.P. Zhang. Bayesian nonparametric dictionary learning for compressed sensing MRI, IEEE Transactions on Image Processing, vol. 23, no. 12, pp. 5007-5019, 2014. [PDF]

Stochastic processes and Bayesian nonparametric theory

I am also interested in more theoretical questions around stochastic processes and Bayesian nonparametrics. I developed a stick-breaking construction for the beta process, which gives a theoretically correct way to generate a size-biased sample of this infinite jump process, and connected it to Poisson process theory.

Theoretical connections that I've made between the stick-breaking construction of the Dirichlet process and the hierarchical Dirichlet process have allowed for easy stochastic variational inference of Bayesian nonparametric topic models.
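The Dirichlet process stick-breaking construction referred to here is easy to sample from under truncation (this sketch covers only the DP case; the beta-process construction is similar in spirit but more involved):

```python
import numpy as np

rng = np.random.default_rng(3)

def dp_stick_breaking(alpha, truncation):
    # Break a unit-length stick: v_k ~ Beta(1, alpha) is the fraction broken
    # off of what remains, so pi_k = v_k * prod_{j<k} (1 - v_j).
    v = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    return v * remaining

pi = dp_stick_breaking(alpha=5.0, truncation=1000)
print(round(pi.sum(), 6))   # approaches 1 as the truncation level grows
```

Pairing each weight pi_k with an atom drawn from a base measure yields a draw from the DP; representations like this are what make stochastic variational inference tractable for Bayesian nonparametric topic models.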

I am also fascinated by Poisson processes and enjoy teaching about them and their connection to Bayesian nonparametrics.
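For the curious, a homogeneous Poisson process on an interval can be simulated in two steps, by first drawing the Poisson-distributed number of points and then scattering them uniformly (a toy sketch; the rate and horizon are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

def poisson_process(rate, t_max):
    # Homogeneous case: the number of points in [0, t_max] is
    # Poisson(rate * t_max), and given that count the points fall
    # i.i.d. uniformly on the interval.
    n = rng.poisson(rate * t_max)
    return np.sort(rng.uniform(0.0, t_max, size=n))

arrivals = poisson_process(rate=3.0, t_max=100.0)
print(len(arrivals))   # about rate * t_max = 300 on average
```

Replacing the constant rate with a measure over a product space gives the completely random measures (the beta and gamma processes among them) that underlie much of Bayesian nonparametrics.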

Representative publications, preprints, course notes

J. Paisley. Course notes for Advanced Probabilistic Machine Learning, Columbia University, 2014. [PDF]

J. Paisley and M. Jordan. A constructive definition of the beta process, arXiv:1604.0068, 2016. [PDF]

J. Paisley, D. Blei and M.I. Jordan. Stick-breaking beta processes and the Poisson process, International Conference on Artificial Intelligence and Statistics (AISTATS), 2012. [PDF]

C. Wang, J. Paisley and D. Blei. Online variational inference for the hierarchical Dirichlet process, International Conference on Artificial Intelligence and Statistics (AISTATS), 2011. [PDF]

[Figures: dots in spaces making measures... sitting at tables... and breaking sticks!]