Effects of Using Double Ratings as Item Scores on IRT Proficiency Estimation

Yoon Ah Song; Won-Chan Lee

doi:10.1080/08957347.2022.2067543

Journal article

Peer reviewed

Applied measurement in education, Vol.35(2), pp.95-115

04/03/2022

Abstract

This article examines how item response theory (IRT) models perform when double ratings, rather than single ratings, are used as item scores in the presence of rater effects. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation under the generalized partial credit model (GPCM). Study 2 compared the accuracy of proficiency estimation of two IRT models, the GPCM and the hierarchical rater model (HRM), for double ratings. The main findings were as follows: (a) rater effects substantially reduced the accuracy of IRT proficiency estimation; (b) double ratings mitigated the negative impact of rater effects on proficiency estimation and improved accuracy relative to single ratings; (c) IRT estimators showed different patterns of conditional accuracy; (d) the accuracy of proficiency estimation improved as more items and more score categories were used; and (e) the HRM consistently outperformed the GPCM.

Details

Title: Effects of Using Double Ratings as Item Scores on IRT Proficiency Estimation
Creators: Yoon Ah Song - Center for Applied Linguistics; Won-Chan Lee - University of Iowa
Resource Type: Journal article
Publication Details: Applied measurement in education, Vol.35(2), pp.95-115
Publisher: Routledge
DOI: 10.1080/08957347.2022.2067543
ISSN: 0895-7347
eISSN: 1532-4818
Language: English
Date published: 04/03/2022
Academic Unit: Psychological and Quantitative Foundations
Record Identifier: 9984371299902771
