Bayesian subgroup analysis in regression using mixture models
Details
- Title: Subtitle
- Bayesian subgroup analysis in regression using mixture models
- Creators
- Yunju Im
- Contributors
- Aixin Tan (Advisor); Jian Huang (Advisor); Joyee Ghosh (Committee Member); Brian J Smith (Committee Member); Luke Tierney (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Statistics
- Date degree season
- Spring 2020
- DOI
- 10.17077/etd.005354
- Publisher
- University of Iowa
- Number of pages
- xi, 76 pages
- Copyright
- Copyright 2020 Yunju Im
- Language
- English
- Description illustrations
- illustrations (chiefly color)
- Description bibliographic
- Includes bibliographical references (pages 74-76).
- Public Abstract (ETD)
Regression has long been used to study the association between an outcome and variables of interest (e.g., medical treatments). In many regression problems, individuals respond differently to those variables because they come from different latent subgroups. Identifying such latent subgroups is important, especially in medicine, because it allows us to better estimate and understand group-specific treatment effects. However, recovering latent subgroups is not easy. First, we do not know how many subgroups exist among the individuals, so the number of subgroups must be estimated. Second, even after the number of subgroups is estimated, it is not easy to determine which individual belongs to which subgroup.
To address these challenges, our work adopts a Bayesian model based on a mixture of finite mixtures (MFM), in which the number of subgroups need not be specified a priori and is instead modeled as a random variable. That is, our model lets the data tell us how many subgroups are present in the observed sample. We further study prior specification, which is critical in any Bayesian modeling problem. We use the Bayes factor criterion to compare different priors and develop an algorithm to search efficiently for the optimal one. Using simulated and real data, we demonstrate the advantages of the proposed model and its computation over existing methods.
- Academic Unit
- Statistics and Actuarial Science
- Record Identifier
- 9983949694402771