Selected Research Projects

Large amounts of multidimensional data in the form of multilinear arrays, or tensors, arise routinely in modern applications from such diverse fields as chemometrics, genomics, physics, psychology, and signal processing, among many others. There is a clear need to develop novel statistical methods, efficient computational algorithms, and fundamental mathematical theory to analyze and exploit information in these types of data.
Tensor Completion
Tensor Regression
Other Aspects


We have been developing statistical and computational tools to address difficulties in robust quantification and, more importantly, reproducibility in microscopic imaging. In particular, we have focused on colocalization analysis, a powerful technique for scientists who want to take full advantage of what optical microscopy has to offer. A software package implementing a general framework for colocalization analysis is available.


Variable selection is a classical problem in statistics. It is an essential tool for statistical model building because it yields models that are more interpretable and therefore more useful in practice. Variable selection has been studied and used extensively.
Structured Variable Selection
In practice, predictors are often related. Taking their relationships into account in variable selection is essential to constructing meaningful models. Notable examples include group variable selection and variable selection with both main effects and interactions.
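To make the idea of group variable selection concrete, the following is a minimal sketch of group soft-thresholding, the block-wise shrinkage step commonly used in group-lasso-type methods. It is an illustration only, not necessarily the procedure used in the work described here; the function name, the `groups` layout, and the penalty level `lam` are assumptions introduced for the example.

```python
import math


def group_soft_threshold(beta, groups, lam):
    """Shrink each group of coefficients toward zero as a block.

    beta   : list of floats, the current coefficient vector
    groups : list of index lists, one per group (hypothetical layout)
    lam    : non-negative penalty level (hypothetical tuning parameter)
    """
    out = list(beta)
    for idx in groups:
        # Euclidean norm of the coefficients in this group
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        # A group whose norm is at most lam is removed entirely;
        # otherwise all of its members are scaled by the same factor.
        scale = 0.0 if norm <= lam else 1.0 - lam / norm
        for j in idx:
            out[j] = scale * beta[j]
    return out


# Two groups of two predictors each: one strong, one weak.
beta = [3.0, 4.0, 0.3, -0.4]
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(beta, groups, lam=1.0)
print(shrunk)
```

Because the shrinkage factor is shared within a group, related predictors enter or leave the model together, which is the behavior that distinguishes group selection from selecting each variable on its own.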
Other Aspects of Variable Selection


One of the classical problems in multivariate statistics is to estimate the covariance matrix or its inverse. Given the large number of parameters involved, exploiting the sparse nature of the problem becomes critical.
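As a simple illustration of exploiting sparsity, the sketch below computes a sample covariance matrix and soft-thresholds its off-diagonal entries, so that small estimated covariances are set exactly to zero. This is a generic textbook device shown under stated assumptions, not necessarily the estimator developed in this project; the function names and the threshold `lam` are hypothetical.

```python
def sample_covariance(rows):
    """Unbiased sample covariance of observations given as a list of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [
            sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (n - 1)
            for k in range(p)
        ]
        for j in range(p)
    ]


def soft_threshold_offdiag(cov, lam):
    """Shrink off-diagonal entries toward zero by lam; keep the diagonal."""
    p = len(cov)
    out = [row[:] for row in cov]
    for j in range(p):
        for k in range(p):
            if j != k:
                v = cov[j][k]
                mag = max(abs(v) - lam, 0.0)  # entries below lam become 0
                out[j][k] = mag if v >= 0 else -mag
    return out


# A small covariance matrix with one weak and one strong off-diagonal entry.
cov = [[1.0, 0.05, -0.6],
       [0.05, 2.0, 0.0],
       [-0.6, 0.0, 1.5]]
sparse_cov = soft_threshold_offdiag(cov, lam=0.1)
print(sparse_cov)
```

Note that a thresholded matrix of this kind is not guaranteed to be positive definite, which is one reason more careful estimators are studied.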
Sparse Inverse Covariance and Gaussian Graphical Models


Sparse Covariance Matrices




Time Course Gene Expression Data Analysis
Among the first microarray experiments were those measuring expression over time, and time course experiments remain common. We have developed statistical approaches for analyzing time course gene expression data using hidden Markov models and state space models. The approaches were implemented in the EBarrays package in Bioconductor.
Other Aspects of Gene Expression Data Analysis



