Preprint
Bayesian Variable Selection in Multivariate Regression Under Collinearity in the Design Matrix
ArXiv.org
Cornell University
07/23/2025
DOI: 10.48550/arxiv.2507.17975
Abstract
We consider the problem of variable selection in Bayesian multivariate linear regression models, involving multiple response and predictor variables, under multivariate normal errors. In the absence of a known covariance structure, specifying a model with a non-diagonal covariance matrix is appealing. Modeling dependency in the random errors through a non-diagonal covariance matrix is generally expected to lead to improved estimation of the regression coefficients. In this article, we highlight an interesting exception: modeling the dependency in errors can significantly worsen both estimation and prediction. We demonstrate that Bayesian multi-outcome regression models using several popular variable selection priors can suffer from poor estimation properties in low-information settings--such as scenarios with weak signals, high correlation among predictors and responses, and small sample sizes. In such cases, the simultaneous estimation of all unknown parameters in the model becomes difficult when using a non-diagonal covariance matrix. Through simulation studies and a dataset with measurements from NIR spectroscopy, we illustrate that a two-step procedure--estimating the mean and the covariance matrix separately--can provide more accurate estimates in such cases. Thus, a potential solution to avoid the problem altogether is to routinely perform an additional analysis with a diagonal covariance matrix, even if the errors are expected to be correlated.
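The two-step idea in the abstract (estimate the mean first, then the covariance matrix) can be illustrated with a minimal sketch. This is not the authors' procedure: ordinary least squares stands in for the Bayesian variable selection step, and the function name `two_step_fit`, the simulated data, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np


def two_step_fit(X, Y):
    """Hypothetical two-step estimate for Y = X B + E.

    Step 1 fits the coefficient matrix B without modeling error correlation
    (each response column is fitted on its own, as under a diagonal covariance).
    Step 2 estimates the full error covariance from the step-1 residuals.
    """
    n, p = X.shape
    # Step 1: least-squares coefficients, one column of B per response.
    # lstsq is used because it tolerates a nearly collinear design matrix.
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Step 2: residual-based estimate of the (non-diagonal) error covariance.
    resid = Y - X @ B_hat
    Sigma_hat = resid.T @ resid / (n - p)
    return B_hat, Sigma_hat


# Small simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, p, q = 50, 5, 3
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)  # two highly correlated predictors
B_true = np.zeros((p, q))
B_true[0, :] = 0.5  # a single weak signal shared by all responses
Sigma_true = 0.8 * np.ones((q, q)) + 0.2 * np.eye(q)  # strongly correlated errors
E = rng.multivariate_normal(np.zeros(q), Sigma_true, size=n)
Y = X @ B_true + E

B_hat, Sigma_hat = two_step_fit(X, Y)
```

In the setting the abstract describes, step 1 would instead be a Bayesian fit with a variable selection prior and a diagonal covariance matrix; the sketch only shows how the mean and covariance estimates are decoupled.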
Details
- Title: Bayesian Variable Selection in Multivariate Regression Under Collinearity in the Design Matrix
- Creators: Joyee Ghosh; Xun Li
- Resource Type: Preprint
- Publication Details: ArXiv.org
- DOI: 10.48550/arxiv.2507.17975
- ISSN: 2331-8422
- Publisher: Cornell University; Ithaca, New York
- Language: English
- Date posted: 07/23/2025
- Academic Unit: Statistics and Actuarial Science
- Record Identifier: 9984865433702771