Penalized regression methods are becoming increasingly popular in genome-wide association studies (GWAS) for identifying genetic markers associated with disease. However, standard penalized methods such as the LASSO do not take into account the possible linkage disequilibrium (LD) between adjacent markers. We propose a novel penalized approach for GWAS using a dense set of single nucleotide polymorphisms (SNPs). The proposed method uses the minimax concave penalty (MCP) for marker selection and incorporates LD information by penalizing the difference of the genetic effects at adjacent SNPs with high correlation. We refer to the proposed penalty function as the smoothed MCP (SMCP) and to the proposed approach as the SMCP method. A coordinate descent algorithm is derived to implement the method; it is efficient and stable when dealing with a large number of SNPs. A multi-split method is used to calculate p-values for the selected SNPs in order to assess their significance. The SMCP method is compared with a LASSO approach in simulation studies, which demonstrate that the proposed method is more accurate in selecting associated SNPs. Its applicability to real data is illustrated using data from a GWAS on rheumatoid arthritis.

Based on the idea of SMCP, we propose a new penalized method for group variable selection in GWAS that accounts for the correlation between adjacent groups. The proposed method uses the group LASSO to encourage group sparsity and a quadratic difference penalty to smooth effects across adjacent groups. We call it the smoothed group LASSO, or SGL for short. Canonical correlations between adjacent groups of SNPs are used as the weights in the quadratic difference penalty. Principal components are used to reduce dimensionality locally within groups. We derive a group coordinate descent algorithm for computing the solution path of the SGL. Simulation studies are used to evaluate the finite sample performance of the SGL and the group LASSO. We also demonstrate the applicability of the SGL on rheumatoid arthritis data.
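As a rough illustration of the MCP-based selection and coordinate descent mentioned in the abstract, the sketch below implements the standard MCP penalty, its univariate thresholding rule, and a plain coordinate descent loop for MCP-penalized least squares. It is not the SMCP algorithm of the dissertation: the LD smoothing term between adjacent SNPs is omitted, the columns of `X` are assumed to be standardized so that each has mean square 1, and all function names and the default `gamma=3.0` are assumptions made for this example.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def mcp_penalty(t, lam, gamma):
    """MCP value: lam*|t| - t^2/(2*gamma) up to |t| = gamma*lam, constant after."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def mcp_threshold(z, lam, gamma):
    """Univariate MCP solution for a standardized predictor (requires gamma > 1)."""
    if gamma <= 1:
        raise ValueError("gamma must exceed 1")
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z  # large effects are left unshrunk


def mcp_coordinate_descent(X, y, lam, gamma=3.0, max_iter=200, tol=1e-10):
    """Minimize (1/2n)||y - X b||^2 + sum_j MCP(b_j) by cyclic coordinate descent.

    Assumption: every column of X satisfies sum_i X[i][j]^2 / n == 1.
    """
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    resid = [float(v) for v in y]  # residual y - X beta, with beta = 0 initially
    for _ in range(max_iter):
        max_change = 0.0
        for j in range(p):
            # Least-squares coefficient of the partial residual on predictor j.
            z = sum(X[i][j] * resid[i] for i in range(n)) / n + beta[j]
            new = mcp_threshold(z, lam, gamma)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta
```

With two orthogonal standardized predictors and a response dominated by the first, this update keeps the large effect essentially unshrunk and sets the small one to zero, which is the behavior that distinguishes the MCP from the LASSO.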
Dissertation
Penalized methods in genome-wide association studies
University of Iowa
Doctor of Philosophy (PhD), University of Iowa
Summer 2011
DOI: 10.17077/etd.196wvd2m
Free to read and download, Open Access
Details
- Title: Penalized methods in genome-wide association studies
- Creators: Jin Liu - University of Iowa
- Contributors: Jian Huang (Advisor); Kai Wang (Advisor); Kung-Sik Chan (Committee Member); Aixin Tan (Committee Member); Dale Zimmerman (Committee Member)
- Resource Type: Dissertation
- Degree Awarded: Doctor of Philosophy (PhD), University of Iowa
- Degree in: Statistics
- Date degree season: Summer 2011
- Publisher: University of Iowa
- DOI: 10.17077/etd.196wvd2m
- Number of pages: ix, 150 pages
- Copyright: Copyright 2011 Jin Liu
- Language: English
- Description bibliographic: Includes bibliographical references (pages 92-95).
- Academic Unit: Statistics and Actuarial Science
- Record Identifier: 9983777078402771