Overlapping Group Logistic Regression with Applications to Genetic Pathway Selection
Journal article   Open access   Peer reviewed

Yaohui Zeng and Patrick Breheny
Cancer Informatics, Vol. 15, pp. 179-187
09/15/2016
DOI: 10.4137/CIN.S40043
PMCID: PMC4992117
PMID: 27679461
Published (Version of record), CC BY-NC 4.0, Open Access
https://doi.org/10.4137/CIN.S40043
Published (Version of record): Cancer Informatics 2016;15:179–187.

Abstract

Discovering important genes that account for the phenotype of interest has long been a challenge in genome-wide expression analysis. Analyses such as gene set enrichment analysis (GSEA) that incorporate pathway information have become widespread in hypothesis testing, but pathway-based approaches have been largely absent from regression methods due to the challenges of dealing with overlapping pathways and the resulting lack of available software. The R package grpreg is widely used to fit group lasso and other group-penalized regression models; in this study, we develop an extension, grpregOverlap, to allow for overlapping group structure using a latent variable approach. We compare this approach to the ordinary lasso and to GSEA using both simulated and real data. We find that incorporation of prior pathway information can substantially improve the accuracy of gene expression classifiers, and we shed light on several ways in which hypothesis-testing approaches such as GSEA differ from regression approaches with respect to the analysis of pathway data.
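
The abstract does not spell out how the latent variable approach is carried out. One common construction for overlapping groups, assumed here for illustration and not taken from the article or from the grpregOverlap source, is to give each pathway its own copy of every gene it contains, so that the groups no longer overlap in the expanded design matrix, and then to sum the latent coefficients back onto the original genes. The function names, toy data, and coefficient values below are hypothetical, and no penalized fit is performed; the sketch shows only the bookkeeping.

```python
import numpy as np

def expand_overlapping_groups(X, groups):
    """Duplicate columns of X so that every group owns its own copy of its features.

    X      : (n_samples, n_features) array
    groups : list of lists of column indices; an index may appear in several groups
    Returns the expanded matrix, the group label of each expanded column,
    and the original column index of each expanded column.
    """
    cols = [j for g in groups for j in g]
    group_ids = [k for k, g in enumerate(groups) for _ in g]
    return X[:, cols], np.array(group_ids), np.array(cols)

def collapse_coefficients(gamma, orig_index, n_features):
    """Sum latent (per-copy) coefficients back onto the original features."""
    beta = np.zeros(n_features)
    np.add.at(beta, orig_index, gamma)  # accumulates correctly for repeated indices
    return beta

# Toy data (hypothetical): 4 genes, gene 1 belongs to both pathways.
X = np.arange(12, dtype=float).reshape(3, 4)
pathways = [[0, 1], [1, 2, 3]]

X_latent, group_ids, orig_index = expand_overlapping_groups(X, pathways)

# Made-up latent coefficients, one per expanded column (not a fitted model).
gamma = np.array([0.5, 1.0, -0.25, 0.0, 2.0])
beta = collapse_coefficients(gamma, orig_index, X.shape[1])
```

Under this construction the linear predictor is unchanged, since `X_latent @ gamma` equals `X @ beta`, so a standard non-overlapping group penalty can be applied to the expanded columns grouped by `group_ids`.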

Keywords: Biostatistics; Diabetes; OAfund; overlapping group lasso; penalized logistic regression; gene set enrichment analysis; pathway selection
