Feature screening rules and algorithms for efficient optimization of sparse regression models
Details
- Title: Subtitle
- Feature screening rules and algorithms for efficient optimization of sparse regression models
- Creators
- Chuyi Wang
- Contributors
- Patrick Breheny (Advisor); Kung-Sik Chan (Committee Member); Jian Huang (Committee Member); Luke Tierney (Committee Member); Tianbao Yang (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Statistics
- Date degree season
- Summer 2021
- DOI
- 10.17077/etd.005893
- Publisher
- University of Iowa
- Number of pages
- ix, 99 pages
- Copyright
- Copyright 2021 Chuyi Wang
- Language
- English
- Description illustrations
- color illustrations
- Description bibliographic
- Includes bibliographical references (pages 97-99).
- Public Abstract (ETD)
Ultra-high-dimensional data (data that record a very large number of features, often millions, about each subject) have become a major focus in fields such as genetic studies, image recognition, and natural language processing as such data sets become more readily available. Sparse penalized regression models are a powerful way to analyze this type of data because they can identify the important features. Efficient algorithms for fitting these models have therefore become valuable. Screening methods identify features that will not be in the model and eliminate them before the model is solved, which greatly reduces time and memory costs. This thesis focuses on developing more efficient screening methods and extending screening methods to more models.
First, we propose an adaptive hybrid screening framework that performs screening adaptively along the path of tuning parameter values, reducing the computational cost of screening with little impact on its ability to eliminate features. Second, we derive a screening rule for lasso-penalized Cox regression models, a powerful technique for identifying important features in predicting patient survival. Third, we derive a screening rule for the elastic net, which performs well when features are correlated. Finally, all the proposed methods are implemented and tested in the publicly available biglasso R package, where they show significant improvements in efficiency compared with existing methods.
- Academic Unit
- Statistics and Actuarial Science
- Record Identifier
- 9984124571802771
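
The public abstract describes screening as discarding features before the model is solved. As a rough illustration of that idea only, the sketch below applies the basic strong rule for the lasso (a well-known heuristic from the screening literature, not the adaptive hybrid rule proposed in this dissertation and not code from the biglasso package). It assumes the lasso objective (1/(2n))||y - Xb||^2 + lam*||b||_1 with centered y and standardized columns of X; the function name `basic_strong_rule` and the toy data are hypothetical.

```python
def basic_strong_rule(X, y, lam):
    """Return (kept feature indices, lam_max) under the basic strong rule.

    X is a list of n rows with p entries each; y is a list of n responses.
    Assumes y is centered and the columns of X are standardized.
    """
    n = len(y)
    p = len(X[0])
    # c[j] = |x_j^T y| / n, the scaled correlation of feature j with y
    c = [abs(sum(X[i][j] * y[i] for i in range(n))) / n for j in range(p)]
    # Smallest penalty at which every lasso coefficient is zero
    lam_max = max(c)
    # Basic strong rule: discard feature j when c[j] < 2*lam - lam_max
    threshold = 2 * lam - lam_max
    keep = [j for j in range(p) if c[j] >= threshold]
    return keep, lam_max


# Toy data (hypothetical): 4 observations, 3 standardized features
X = [
    [1, 1, 1],
    [-1, 1, -1],
    [1, -1, -1],
    [-1, -1, 1],
]
y = [2, -1, 0, -1]

keep_high, lam_max = basic_strong_rule(X, y, lam=0.8)
keep_low, _ = basic_strong_rule(X, y, lam=0.7)
print(lam_max, keep_high, keep_low)
```

Because the strong rule is a heuristic rather than a safe rule, it can occasionally discard a feature that belongs in the model, so a solver that uses it would normally verify the optimality (KKT) conditions on the discarded features after fitting and re-add any that violate them.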