Sigma-P Method for Rare-Variant Analysis

Sigma-P is a rare-variant method for detecting disease associations in case-control sequencing studies. The Sigma-P statistic aggregates the effects of multiple variant sites by computing a weighted sum of the per-site log p-values. Each site is weighted by the inverse of the expected standard deviation (denoted by sigma) of the number of variants in controls. The method is robust against the noise introduced by a large number of neutral variants and remains effective when variants have effects in opposite directions.
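
As a rough illustration of the weighting idea described above, the following minimal Python sketch computes a sigma-weighted sum of log p-values. It is not the published implementation: the binomial model used for sigma, the use of negative logs (so that larger values mean stronger evidence), the function names, and all input values are assumptions made for this example. Per-site p-values are taken as given.

```python
import math

def sigma_weight(n_controls, variant_freq):
    """Return 1 / sigma for one site.

    Assumption of this sketch: the variant count in controls is binomial
    over 2 * n_controls chromosomes with probability variant_freq.
    """
    if not 0.0 < variant_freq < 1.0:
        raise ValueError("variant_freq must lie strictly between 0 and 1")
    sigma = math.sqrt(2 * n_controls * variant_freq * (1.0 - variant_freq))
    return 1.0 / sigma

def weighted_log_p_sum(p_values, weights):
    """Aggregate per-site evidence as sum_i w_i * (-log p_i)."""
    return sum(w * -math.log(p) for p, w in zip(p_values, weights))

# Hypothetical inputs: three sites with made-up p-values and frequencies.
site_p_values = [0.04, 0.50, 0.01]
site_freqs = [0.005, 0.010, 0.002]
n_controls = 500

weights = [sigma_weight(n_controls, f) for f in site_freqs]
statistic = weighted_log_p_sum(site_p_values, weights)
print(statistic)
```

Under this weighting, rarer sites have a smaller sigma and therefore a larger weight. How significance of the aggregated statistic is assessed is described in the reference below and is not covered by this sketch.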

Reference: Cheung YH, Wang G, Leal SM, Wang S (2012) "A Fast and Noise-Resilient Approach to Detect Rare-Variant Associations with Deep Sequencing Data for Complex Disorders" Genetic Epidemiology, 36:675-685

Download: R Scripts and Sample Data

Penalized Conditional/Unconditional Logistic Regression - pclogit R package

pclogit is an R package for penalized conditional/unconditional logistic regression using a network-based penalty for matched/unmatched case-control data with grouped or graph-constrained variables. The algorithm efficiently fits the regularization path and provides selection probabilities for each predictor in the analysis of high-dimensional matched/unmatched case-control data. It uses cyclical coordinate descent in a pathwise fashion.
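
To give a sense of what a network-based penalty can look like, the Python sketch below evaluates one common form: an L1 term plus a graph-smoothness term that shrinks degree-scaled coefficients of linked predictors toward each other. This exact form, the function name network_penalty, and the parameters lam and alpha are assumptions for illustration; the penalty actually used by pclogit is defined in the references and manual.

```python
import math

def network_penalty(beta, edges, lam, alpha):
    """Evaluate lam * (alpha * L1 term + (1 - alpha) / 2 * smoothness term).

    beta  : list of coefficients, one per predictor
    edges : list of (u, v) index pairs linking predictors in the network
    lam   : overall penalty strength (hypothetical parameter name)
    alpha : mixing weight between sparsity (1) and smoothness (0)
    """
    # Node degrees, used to scale coefficients in the smoothness term.
    degree = [0] * len(beta)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    l1 = sum(abs(b) for b in beta)
    smooth = sum(
        (beta[u] / math.sqrt(degree[u]) - beta[v] / math.sqrt(degree[v])) ** 2
        for u, v in edges
    )
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * smooth)

# Hypothetical example: four predictors, the last one isolated in the graph.
beta = [0.5, 0.5, 0.0, -0.2]
edges = [(0, 1), (1, 2)]
print(network_penalty(beta, edges, lam=0.1, alpha=0.5))
```

With alpha set to 1 the sketch reduces to a lasso penalty; with alpha set to 0 only the graph term remains, so linked predictors with similar scaled coefficients incur no penalty.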

Reference: Sun H, Wang S (2012) "Penalized Logistic Regression for High-dimensional DNA Methylation Data with Case-Control Studies" Bioinformatics, 28:1368-1375

Reference: Sun H, Wang S (2013) "Network-based regularization for matched case-control analysis of high-dimensional DNA methylation data" Statistics in Medicine, 32:2127-2139

Downloads: Manual, pclogit.tar.gz (for Linux/Unix only)

Rare variants selection - rvsel R package

rvsel is an R package for rare-variant selection with sequence data. The most outcome-related rare variants are selected within a gene or a genetic region. The selection procedure is based on the power set of the subset of the rare variants.
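
The idea of searching over a power set can be sketched in Python as follows. This is only an illustration, not the rvsel procedure: the scoring function (absolute difference in carrier fraction between cases and controls), the function names, and the toy genotype data are all hypothetical, and the statistical selection criterion of the published method is not reproduced here.

```python
from itertools import combinations

def nonempty_subsets(items):
    """Yield every non-empty subset of items (the power set minus the empty set)."""
    items = list(items)
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            yield subset

def carrier_rate_difference(genotypes, is_case, subset):
    """Hypothetical score: |carrier fraction in cases - carrier fraction in controls|.

    A subject is a carrier if it has a minor allele at any site in subset.
    """
    case_carriers = control_carriers = n_cases = n_controls = 0
    for row, case in zip(genotypes, is_case):
        carrier = any(row[j] > 0 for j in subset)
        if case:
            n_cases += 1
            case_carriers += carrier
        else:
            n_controls += 1
            control_carriers += carrier
    return abs(case_carriers / n_cases - control_carriers / n_controls)

def best_subset(genotypes, is_case):
    """Return the subset of site indices with the highest score."""
    n_sites = len(genotypes[0])
    return max(
        nonempty_subsets(range(n_sites)),
        key=lambda s: carrier_rate_difference(genotypes, is_case, s),
    )

# Toy data: rows are subjects, columns are rare-variant sites (minor-allele counts).
genotypes = [
    [1, 0, 0], [0, 1, 0], [0, 0, 0],  # cases
    [0, 0, 1], [0, 0, 0], [0, 0, 0],  # controls
]
is_case = [True, True, True, False, False, False]
print(best_subset(genotypes, is_case))
```

Because n sites have 2^n - 1 non-empty subsets, exhaustive enumeration of this kind is only practical for a small number of variants, such as those within a single gene or region.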

Reference: Sun H, Wang S (2014) "A Power Set Based Statistical Selection Procedure to Locate Susceptible Rare Variants Associated with Complex Traits with Sequencing Data" Bioinformatics, 30:2317-2323

Downloads: Manual, rvsel_0.1.tar.gz (for Linux/Unix only)

NEpiC: a Network-assisted algorithm for Epigenetic studies using mean and variance Combined signals

We present a network-assisted algorithm, NEpiC, that combines both mean and variance signals in searching for differentially methylated sub-networks using the protein-protein interaction (PPI) network.

Reference: Ruan PF, Shen J, Santella RM, Zhou SG, Wang S (2016) "NEpiC: a Network-assisted algorithm for Epigenetic studies using mean and variance Combined signals" Nucleic Acids Research, in press

Download: R Scripts and Sample Data

Differentially methylated regions (DMR) detection algorithm with combined mean and variance signals

Most existing methods for identifying differentially methylated loci (DML) use mean signals only, and only a few use both mean and variance signals. All existing methods for detecting differentially methylated regions (DMRs), however, focus on mean signals only. This R code implements the new DMR detection algorithm we proposed, which uses mean and variance combined signals.

Reference: Wang Y, Wang S (2016) "Detection of differentially methylated regions with mean and variance combined signals" submitted

Download: R Scripts and Sample Data