Journal article
The Influence of Rater Effects in Training Sets on the Psychometric Quality of Automated Scoring for Writing Assessments
International Journal of Testing, Vol. 18(1), pp. 27-49
01/02/2018
DOI: 10.1080/15305058.2017.1361426
Abstract
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the quality of the human ratings used to train the AESEs is rarely examined. As a result, the impact of various rater effects (e.g., severity and centrality) on the quality of AESE-assigned scores is not known. In this study, we use data from a large-scale rater-mediated writing assessment to examine the impact of rater effects on the quality of AESE-assigned scores. Overall, the results suggest that if rater effects are present in the ratings used to train an AESE, the AESE scores may replicate these effects. Implications are discussed in terms of research and practice related to automated scoring.
Details
- Title
- The Influence of Rater Effects in Training Sets on the Psychometric Quality of Automated Scoring for Writing Assessments
- Creators
- Stefanie A. Wind (University of Alabama)
- Edward W. Wolfe (Educational Testing Service)
- George Engelhard (University of Georgia)
- Peter Foltz (Pearson)
- Mark Rosenstein (Pearson)
- Resource Type
- Journal article
- Publication Details
- International Journal of Testing, Vol. 18(1), pp. 27-49
- DOI
- 10.1080/15305058.2017.1361426
- ISSN
- 1530-5058
- eISSN
- 1532-7574
- Publisher
- Routledge
- Number of pages
- 23
- Language
- English
- Date published
- 01/02/2018
- Academic Unit
- Psychological and Quantitative Foundations
- Record Identifier
- 9985123698002771