Journal article
Variable selection in the accelerated failure time model via the bridge method
Lifetime data analysis, Vol.16(2), pp.176-195
04/2010
DOI: 10.1007/s10985-009-9144-2
PMCID: PMC2989175
PMID: 20013308
Abstract
In high-throughput genomic studies, an important goal is to identify a small number of genomic markers that are associated with the development and progression of diseases. A representative example is microarray prognostic studies, where the goal is to identify genes whose expressions are associated with disease-free or overall survival. Because of the high dimensionality of gene expression data, standard survival analysis techniques cannot be directly applied. In addition, among the thousands of genes surveyed, only a subset is disease-associated, so gene selection is needed along with estimation. In this article, we model the relationship between gene expressions and survival using the accelerated failure time (AFT) model. We use bridge penalization for regularized estimation and gene selection. An efficient iterative computational algorithm is proposed. Tuning parameters are selected using V-fold cross-validation. We use a resampling method to evaluate the prediction performance of the bridge estimator and the relative stability of the identified genes. We show that the proposed bridge estimator is selection consistent under appropriate conditions. Analysis of two lymphoma prognostic studies suggests that the bridge estimator can identify a small number of genes and can have better prediction performance than the Lasso.
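For readers unfamiliar with the term, the bridge penalization named in the abstract can be sketched in its generic form. The notation here is an assumption for illustration and is not taken from this record: $\beta = (\beta_1, \dots, \beta_d)$ denotes the regression coefficients, $L_n(\beta)$ stands for whatever loss the AFT model is fitted with, $\lambda$ is the tuning parameter, and $\gamma$ is the bridge index.

```latex
\hat{\beta} \;=\; \arg\min_{\beta} \Big\{ L_n(\beta) \;+\; \lambda \sum_{j=1}^{d} |\beta_j|^{\gamma} \Big\},
\qquad \lambda \ge 0, \quad \gamma > 0 .
```

With $\gamma = 1$ the penalty reduces to the Lasso and with $\gamma = 2$ to ridge regression. Values $0 < \gamma < 1$ give a non-convex penalty, which is the range usually associated with variable selection by bridge estimators. The abstract does not state the specific loss or the value of $\gamma$ used in the article.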
Details
- Title
- Variable selection in the accelerated failure time model via the bridge method
- Creators
- Jian Huang - Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA, 52242, USA. jian-huang@uiowa.edu
- Shuangge Ma
- Resource Type
- Journal article
- Publication Details
- Lifetime data analysis, Vol.16(2), pp.176-195
- DOI
- 10.1007/s10985-009-9144-2
- PMID
- 20013308
- PMCID
- PMC2989175
- NLM abbreviation
- Lifetime Data Anal
- ISSN
- 1572-9249
- eISSN
- 1572-9249
- Publisher
- United States
- Grant note
- R01 CA120988-03 / NCI NIH HHS; R01 CA120988 / NCI NIH HHS
- Language
- English
- Date published
- 04/2010
- Academic Unit
- Statistics and Actuarial Science
- Record Identifier
- 9983986092102771