Classification consistency and accuracy indices for simple structure multidimensional item response theory model

Huan Liu

doi:10.25820/etd.007510

Dissertation

Open access

University of Iowa

Doctor of Philosophy (PhD), University of Iowa

Spring 2024

DOI: 10.25820/etd.007510

Files and links (1)

Classification Consistency and Accuracy Indices for Simple Structure Multidimensional Item Response Theory Model (PDF, 2.46 MB)

Free to read and download, Open Access

Abstract

In many large-scale testing programs, examinees are frequently categorized into different performance levels. These classifications are then used to make high-stakes decisions about examinees in contexts such as licensure, certification, and educational assessment. Numerous approaches to estimating the consistency and accuracy of this classification process have been developed under the classical test theory (CTT) and unidimensional item response theory (UIRT) frameworks. However, the multidimensional framework, particularly on the composite theta score metric, remains less explored. This dissertation was designed to explore the estimation of classification consistency and accuracy indices for composite summed and theta scores within the simple structure multidimensional item response theory (SS-MIRT) framework. To achieve this goal, five prevalent approaches from the UIRT framework were extended to the SS-MIRT context: the Lee, Rudner, Guo, Bayesian expected a posteriori (EAP), and Bayesian Markov chain Monte Carlo (MCMC) approaches. These adapted approaches were then applied to two real data sets under various conditions. Further, a simulation study was conducted to evaluate the performance of the first four approaches under diverse testing scenarios, considering factors such as dimensionality, test length, and cut score location.
The principal findings of this investigation include: (1) All five adapted approaches demonstrated commendable performance, with the estimation of classification indices exhibiting significant consistency in magnitude and pattern across different conditions; (2) Approaches using the maximum likelihood estimator (MLE) generally showed higher absolute bias (ABIAS) and root mean squared error (RMSE), but lower standard error (SE), than those employing the EAP estimator; (3) Approaches applied to the composite summed score metric typically resulted in higher ABIAS, SE, and RMSE than those for the composite theta score metric; (4) The D and M methods, assuming a multivariate standard normal distribution, yielded nearly identical outcomes, whereas the P method showed variations; (5) Increases in the correlation between dimensions and in test length generally led to decreases in ABIAS, SE, and RMSE.
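
To make the notion of classification accuracy and consistency on a composite theta metric more concrete, the following is a minimal Python sketch in the spirit of a Rudner-style normal approximation. It is not the dissertation's implementation. The function names, weights, cut score, and example values are hypothetical, and two simplifications are assumed: estimation errors are independent across dimensions, and consistency is taken as the sum of squared category probabilities (the chance that two independent administrations agree).

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def category_probs(est, se, cuts):
    """Probability of each performance category under N(est, se^2)."""
    bounds = [-math.inf, *cuts, math.inf]
    cdf = [normal_cdf(b, est, se) for b in bounds]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

def composite(thetas, ses, weights):
    """Weighted composite theta and its SE.

    Assumption: errors are independent across dimensions, so the
    composite error variance is the sum of squared weighted SEs.
    """
    est = sum(w * t for w, t in zip(weights, thetas))
    se = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, ses)))
    return est, se

def classify(est, cuts):
    """Index of the category containing est (cuts sorted ascending)."""
    return sum(est >= c for c in cuts)

def rudner_indices(examinees, weights, cuts):
    """Average accuracy and consistency over examinees.

    examinees: iterable of (theta estimates, standard errors) pairs.
    """
    acc = con = 0.0
    for thetas, ses in examinees:
        est, se = composite(thetas, ses, weights)
        p = category_probs(est, se, cuts)
        acc += p[classify(est, cuts)]   # mass in the assigned category
        con += sum(q * q for q in p)    # agreement of two independent draws
    n = len(examinees)
    return acc / n, con / n

# Hypothetical two-dimensional example with equal weights and one cut score.
sample = [((0.8, 0.4), (0.35, 0.40)), ((-0.2, 0.1), (0.30, 0.45))]
accuracy, consistency = rudner_indices(sample, weights=(0.5, 0.5), cuts=[0.0])
print(f"accuracy={accuracy:.3f}, consistency={consistency:.3f}")
```

In this sketch an examinee whose composite estimate sits exactly on a cut score contributes 0.5 to both indices, which illustrates why cut score location is one of the factors varied in the simulation study.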

Details

Title: Classification consistency and accuracy indices for simple structure multidimensional item response theory model
Creators: Huan Liu
Contributors: Won-Chan Lee (Advisor)
Jonathan Templin (Committee Member)
Ariel Aloe (Committee Member)
Stella Kim (Committee Member)
Resource Type: Dissertation
Degree Awarded: Doctor of Philosophy (PhD), University of Iowa
Degree in: Psychological and Quantitative Foundations (Educational Measurement and Statistics)
Date degree season: Spring 2024
Publisher: University of Iowa
DOI: 10.25820/etd.007510
Number of pages: xiii, 148 pages
Language: English
Date submitted: 04/21/2024
Description illustrations: Illustrations, tables, graphs, charts
Description bibliographic: Includes bibliographical references (pages 124-128).
Public Abstract (ETD): In many large-scale testing programs, examinees are frequently categorized into different performance levels. These classifications are then used to make high-stakes decisions about examinees in contexts such as licensure, certification, and educational assessment. Numerous approaches to estimating the consistency and accuracy of this classification process have been developed under the classical test theory (CTT) and unidimensional item response theory (UIRT) frameworks. However, the multidimensional framework, particularly on the composite theta score metric, remains less explored.

This dissertation was designed to explore the estimation of classification consistency and accuracy indices for composite summed and theta scores within the simple structure multidimensional item response theory (SS-MIRT) framework. To achieve this goal, five prevalent approaches from the UIRT framework were extended to the SS-MIRT context: the Lee, Rudner, Guo, Bayesian expected a posteriori (EAP), and Bayesian Markov chain Monte Carlo (MCMC) approaches.

Results showed that all five adapted approaches demonstrated commendable performance, with the estimation of classification indices exhibiting significant consistency in magnitude and pattern across different conditions. Approaches using the maximum likelihood estimator (MLE) generally showed higher absolute bias (ABIAS) and root mean squared error (RMSE), but lower standard error (SE), than those employing the EAP estimator. Regarding the score metric, approaches applied to the composite summed score metric typically resulted in higher ABIAS, SE, and RMSE than those for the composite theta score metric. Among the methods for obtaining marginal indices, the D and M methods yielded nearly identical outcomes, whereas the P method showed variations. Finally, increases in the correlation between dimensions and in test length generally led to decreases in ABIAS, SE, and RMSE.
Academic Unit: Psychological and Quantitative Foundations
Record Identifier: 9984647455802771
