Projection-based inference and model selection for penalized regression
Details
- Title
- Projection-based inference and model selection for penalized regression
- Creators
- Biyue Dai
- Contributors
- Patrick Breheny (Advisor); Yuan Huang (Committee Member); Mike Jones (Committee Member); Brian Smith (Committee Member); Dale Zimmerman (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Biostatistics
- Date degree season
- Autumn 2019
- DOI
- 10.17077/etd.005250
- Publisher
- University of Iowa
- Number of pages
- xii, 138 pages
- Copyright
- Copyright 2019 Biyue Dai
- Language
- English
- Description illustrations
- illustrations (some color)
- Description bibliographic
- Includes bibliographical references (pages 135-138)
- Public Abstract (ETD)
Researchers can now collect and access data with large numbers of variables. Data sets that have a large number of features and relatively few observations are referred to as high-dimensional data. Building statistical models and making statistical inferences from high-dimensional data is beyond the scope of well-established classical methods such as ordinary least squares. Penalized regression models are among the most popular methods in this field. This thesis proposes novel approaches that improve the predictive performance of penalized models and the inference drawn from them.
The first paper develops a novel testing procedure, Projection Inference for Penalized Regression Estimator (PIPE). Starting from the estimates of an initial penalized linear regression model, PIPE provides a computationally efficient way to compute test statistics that can be used for false discovery rate control. In the second paper, I extend the PIPE procedure to accommodate binary outcomes with penalized logistic regression. For both the linear and the binary case, the validity of the proposed PIPE procedure is studied carefully through its theoretical properties and empirical performance.
In Chapter 4, two novel cross-validation approaches, the cross-validated linear predictor and cross-validated deviance residuals, are developed for Cox regression, where cross-validation is inherently challenging because the models are built on a partial likelihood. Both approaches can be used for model selection in penalized Cox regression models. I assess these methods and compare them with two existing approaches in a comprehensive set of simulations. The cross-validated linear predictor approach has the best overall performance.
For all methods developed in this thesis, I illustrate their use with real high-dimensional data sets.
- Academic Unit
- Biostatistics
- Record Identifier
- 9983779899302771
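
Illustrative note on false discovery rate control. The public abstract above says that PIPE produces test statistics that can be used for false discovery rate control, but this record does not say which control procedure the thesis uses. As a rough illustration of the general idea only, the sketch below applies the standard Benjamini-Hochberg step-up rule to a list of p-values. It is not the PIPE procedure, and the function name `benjamini_hochberg` and the p-values are hypothetical, chosen purely for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the (0-based) indices rejected by the Benjamini-Hochberg rule.

    Generic textbook procedure, shown for illustration; it is an assumption
    that something like this would be applied downstream of per-feature tests.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Sort feature indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff = rank
    # Reject the hypotheses holding the `cutoff` smallest p-values.
    return sorted(order[:cutoff])


# Hypothetical per-feature p-values (made up for this example).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p, alpha=0.05)
print(rejected)
```

With these made-up inputs only the two smallest p-values fall below their rank-scaled thresholds (0.005 and 0.010), so the first two features are selected.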