Statistical Analysis of DNA Methylation

DNA methylation, an important epigenetic modification, is the addition of a methyl group to the carbon-5 position of cytosine, typically in the context of a CpG dinucleotide, and is crucial for normal development. We are developing statistical and computational methods to identify sites that are differentially methylated between cancer patients and normal subjects, and to classify tumor types using DNA methylation data.

Integration of Omics Data

Because any biological mechanism builds on multiple molecular phenomena, only by understanding the interplay within and between the different layers of genomic structure can we fully understand phenotypic traits. It is therefore important to develop powerful integrative analysis tools for multi-omics data to better understand biological processes.

Disease Mapping with Next-Generation Sequencing Data

Next-generation sequencing technology has enabled a paradigm shift in genetic association studies from the common disease/common variant hypothesis to the common disease/rare variant hypothesis. We are developing new statistical and computational methods to map disease main effects and gene-gene interactions with sequence data in the post-GWAS era, and we are studying how to design optimal sequence-based studies.

Disease Mapping with Genome-wide Association Studies (GWAS)

We study design and analysis issues in GWAS. We have developed an autozygosity mapping algorithm for GWAS data and an optimal two-stage design for GWAS. We continue to work on GWAS-related problems.

Gene-Gene and Gene-Environment Interactions

Most human traits are likely under the control of several genetic and environmental factors that interact with one another. We develop statistical and computational methods to detect gene-gene and gene-environment interactions under different definitions of interaction and different study designs.


1. Frederica P. Perera, Columbia University, Department of Environmental Health Sciences

2. Regina M. Santella, Columbia University, Department of Environmental Health Sciences

3. Jurg Ott, Rockefeller University

4. Itsik Pe'er, Columbia University, Department of Computer Science

5. Suzanne M. Leal, Baylor College of Medicine, Department of Molecular and Human Genetics

6. Zhaoxia Yu, University of California, Irvine, Department of Statistics

7. Yufeng Shen, Columbia University, Department of Biomedical Informatics

8. Xingguang Luo, Yale University, Department of Psychiatry