Journal article
Promoting similarity of model sparsity structures in integrative analysis of cancer genetic data
Statistics in medicine, Vol.36(3), pp.509-559
02/10/2017
DOI: 10.1002/sim.7138
PMCID: PMC5209260
PMID: 27667129
Abstract
In profiling studies, the analysis of a single dataset often leads to unsatisfactory results because of the small sample size. Multi-dataset analysis utilizes information from multiple independent datasets and outperforms single-dataset analysis. Among the available multi-dataset analysis methods, integrative analysis methods aggregate and analyze raw data and outperform meta-analysis methods, which analyze multiple datasets separately and then pool summary statistics. In this study, we conduct integrative analysis and marker selection under the heterogeneity structure, which allows different datasets to have overlapping but not necessarily identical sets of markers. Under certain scenarios, it is reasonable to expect some similarity of identified marker sets - or equivalently, similarity of model sparsity structures - across multiple datasets. However, the existing methods do not have a mechanism to explicitly promote such similarity. To tackle this problem, we develop a sparse boosting method. This method uses a BIC/HDBIC criterion to select weak learners in boosting and encourages sparsity. A new penalty is introduced to promote the similarity of model sparsity structures across datasets. The proposed method has an intuitive formulation and is broadly applicable and computationally affordable. In numerical studies, we analyze right-censored survival data under the accelerated failure time model. Simulation shows that the proposed method outperforms alternative boosting and penalization methods with more accurate marker identification. The analysis of three breast cancer prognosis datasets shows that the proposed method can identify marker sets with increased similarity across datasets and improved prediction performance. Copyright (c) 2016 John Wiley & Sons, Ltd.
Details
- Title
- Promoting similarity of model sparsity structures in integrative analysis of cancer genetic data
- Creators
- Yuan Huang - Yale University; Jin Liu - Duke-NUS Medical School; Huangdi Yi - Yale University; Ben-Chang Shia - Taipei Medical University; Shuangge Ma - Yale University
- Resource Type
- Journal article
- Publication Details
- Statistics in medicine, Vol.36(3), pp.509-559
- Publisher
- Wiley
- DOI
- 10.1002/sim.7138
- PMID
- 27667129
- PMCID
- PMC5209260
- ISSN
- 0277-6715
- eISSN
- 1097-0258
- Number of pages
- 51
- Grant note
- 71471152; 71201139; 71301162 / National Natural Science Foundation of China; National Natural Science Foundation of China (NSFC)
- WBS: R-913-200-098-263 / Duke-NUS Graduate Medical School; National University of Singapore
- 13ZD148; 13CTJ001 / National Social Science Foundation of China
- CA142774; CA016359 / NIH; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA
- VA Cooperative Studies Program of the Department of Veterans Affairs, Office of Research and Development
- R01CA142774 / NATIONAL CANCER INSTITUTE; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA; NIH National Cancer Institute (NCI)
- UL1TR001863 / NATIONAL CENTER FOR ADVANCING TRANSLATIONAL SCIENCES; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA; NIH National Center for Advancing Translational Sciences (NCATS)
- Language
- English
- Date published
- 02/10/2017
- Academic Unit
- Biostatistics
- Record Identifier
- 9984363600702771