Discrepancy-Based Model Selection Criteria Using Cross-Validation
Book chapter


Joseph E Cavanaugh, Simon L Davies and Andrew A Neath
Statistical Models and Methods for Biomedical and Technical Systems, pp.473-486
Statistics for Industry and Technology, Birkhäuser Boston
2008
DOI: 10.1007/978-0-8176-4619-6_33


Abstract

A model selection criterion is often formulated by constructing an approximately unbiased estimator of an expected discrepancy, a measure that gauges the separation between the true model and a fitted approximating model. The expected discrepancy reflects how well, on average, the fitted approximating model predicts “new” data generated under the true model. A related measure, the estimated discrepancy, reflects how well the fitted approximating model predicts the data at hand. In general, a model selection criterion consists of a goodness-of-fit term and a penalty term. The natural estimator of the expected discrepancy, the estimated discrepancy, corresponds to the goodness-of-fit term of the criterion. However, the estimated discrepancy yields an overly optimistic assessment of how effectively the fitted model predicts new data. It therefore serves as a negatively biased estimator of the expected discrepancy. Correcting for this bias leads to the penalty term. Cross-validation provides a technique for developing an estimator of an expected discrepancy which need not be adjusted for bias. The basic idea is to construct an empirical discrepancy that evaluates an approximating model by assessing how accurately each case-deleted fitted model predicts the deleted case. The preceding approach is illustrated in the linear regression framework by formulating estimators of the expected discrepancy based on Kullback’s I-divergence and the Gauss (error sum of squares) discrepancy. The traditional criteria that arise by augmenting the estimated discrepancy with a bias adjustment term are the Akaike information criterion and Mallows’ conceptual predictive statistic. A simulation study is presented.
Keywords: AIC, Mallows’ Cp, PRESS
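
To make the cross-validation idea in the abstract concrete, the following is a minimal sketch for the Gauss (error sum of squares) discrepancy in linear regression. It is not the authors' code. The simulated data, the candidate design matrix, and the function names `sse_and_press` and `press_by_refitting` are assumptions chosen for illustration. The estimated discrepancy is the ordinary error sum of squares (SSE); its cross-validated counterpart, PRESS, sums the squared errors made when each case is predicted from the fit that omits it.

```python
import numpy as np


def sse_and_press(X, y):
    """Return (SSE, PRESS) for the least-squares fit of y on X.

    Assumes X has full column rank and no leverage equals 1.
    """
    # Leverages h_ii are the squared row norms of Q from a reduced QR of X.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Estimated Gauss discrepancy: error sum of squares on the data at hand.
    sse = float(resid @ resid)
    # Case-deleted prediction errors equal e_i / (1 - h_ii) in linear regression.
    press = float(np.sum((resid / (1.0 - h)) ** 2))
    return sse, press


def press_by_refitting(X, y):
    """Compute PRESS by explicitly refitting with each case deleted."""
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        total += (y[i] - X[i] @ beta_i) ** 2
    return float(total)


# Hypothetical simulated data: the true model is linear in x, while the
# candidate model also includes an unnecessary quadratic term.
rng = np.random.default_rng(0)
n = 40
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x, x**2])
y = 1.0 + 2.0 * x + rng.normal(size=n)

sse, press = sse_and_press(X, y)
press_loop = press_by_refitting(X, y)
print(f"SSE = {sse:.3f}, PRESS = {press:.3f}, PRESS (refit) = {press_loop:.3f}")
```

Because every leverage lies strictly between 0 and 1 under the stated assumptions, PRESS is never smaller than SSE, which mirrors the abstract's point that the estimated discrepancy is optimistic about predicting new data. The explicit refitting loop is included only to check the leverage shortcut.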
