Discrepancy-Based Model Selection Criteria Using Cross-Validation

Joseph E Cavanaugh; Simon L Davies; Andrew A Neath

doi:10.1007/978-0-8176-4619-6_33

Book chapter

Statistical Models and Methods for Biomedical and Technical Systems, pp.473-486

Statistics for Industry and Technology, Birkhäuser Boston

2008

Abstract

A model selection criterion is often formulated by constructing an approximately unbiased estimator of an expected discrepancy, a measure that gauges the separation between the true model and a fitted approximating model. The expected discrepancy reflects how well, on average, the fitted approximating model predicts “new” data generated under the true model. A related measure, the estimated discrepancy, reflects how well the fitted approximating model predicts the data at hand. In general, a model selection criterion consists of a goodness-of-fit term and a penalty term. The natural estimator of the expected discrepancy, the estimated discrepancy, corresponds to the goodness-of-fit term of the criterion. However, the estimated discrepancy yields an overly optimistic assessment of how effectively the fitted model predicts new data. It therefore serves as a negatively biased estimator of the expected discrepancy. Correcting for this bias leads to the penalty term. Cross-validation provides a technique for developing an estimator of an expected discrepancy which need not be adjusted for bias. The basic idea is to construct an empirical discrepancy that evaluates an approximating model by assessing how accurately each case-deleted fitted model predicts the deleted case. The preceding approach is illustrated in the linear regression framework by formulating estimators of the expected discrepancy based on Kullback’s I-divergence and the Gauss (error sum of squares) discrepancy. The traditional criteria that arise by augmenting the estimated discrepancy with a bias adjustment term are the Akaike information criterion and Mallows’ conceptual predictive statistic. A simulation study is presented.
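The case-deletion idea described in the abstract can be sketched for simple linear regression under the Gauss (error sum of squares) discrepancy. This is only an illustrative sketch, not the chapter's implementation: the helper names `fit_line`, `sse`, and `press` and the toy data are assumptions introduced here. The estimated discrepancy is taken to be the ordinary error sum of squares, and the cross-validated version sums the squared errors made when each case is predicted by the model fitted without it, the quantity commonly called PRESS.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def sse(xs, ys):
    """Error sum of squares: the model is fitted to and evaluated on the same data."""
    a, b = fit_line(xs, ys)
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def press(xs, ys):
    """Sum of squared deleted residuals: each case is predicted
    by the model fitted with that case left out."""
    total = 0.0
    for i in range(len(xs)):
        xs_del = xs[:i] + xs[i + 1:]   # data with case i deleted
        ys_del = ys[:i] + ys[i + 1:]
        a, b = fit_line(xs_del, ys_del)
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total


# Hypothetical toy data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.3]
print("SSE  :", sse(xs, ys))
print("PRESS:", press(xs, ys))
```

For least squares fits, the cross-validated sum is never smaller than the error sum of squares, which mirrors the abstract's point that the estimated discrepancy is overly optimistic about predicting new data.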

AIC

Mallows’ Cp

PRESS

Details

Title: Discrepancy-Based Model Selection Criteria Using Cross-Validation
Creators: Joseph E Cavanaugh - Department of Biostatistics, The University of Iowa, Iowa City, USA
Simon L Davies - Pfizer Global Research and Development, Pfizer, Inc., New York, USA
Andrew A Neath - Department of Mathematics and Statistics, Southern Illinois University, Edwardsville, USA
Resource Type: Book chapter
Publication Details: Statistical Models and Methods for Biomedical and Technical Systems, pp.473-486
Series: Statistics for Industry and Technology
DOI: 10.1007/978-0-8176-4619-6_33
Publisher: Birkhäuser Boston; Boston, MA
Language: English
Date published: 2008
Academic Unit: Statistics and Actuarial Science; Biostatistics; Injury Prevention Research Center
Record Identifier: 9984214688602771