The impact of scoring later on mixed format adaptive testing

Jing Ma

doi:10.25820/etd.007616

Dissertation

Open access

University of Iowa

Doctor of Philosophy (PhD), University of Iowa

Summer 2024

Files and links (1)

JM_dissertation (pdf, 1.73 MB)

Free to read and download, Open Access

Abstract

This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Using the shadow test approach, a simulation study was conducted across test designs, test lengths, and numbers and locations of polytomous items. Results showed that while scoring polytomous items later had a statistically significant impact, the magnitude was generally small. Indices of measurement error and of test security (such as item exposure and test overlap rates) were slightly higher when polytomous items were scored after the adaptive test. Classification accuracy indices were minimally affected. These findings suggest that the practical significance of the impact may be limited in operational settings. However, test developers should still consider factors such as test design, test length, and item configuration when implementing mixed-format adaptive testing.

Details

Title: The impact of scoring later on mixed format adaptive testing
Creators: Jing Ma
Contributors: Stephen B Dunbar (Advisor)
Anthony D Fina (Advisor)
Catherine J Welch (Committee Member)
Kathy L Schuh (Committee Member)
Resource Type: Dissertation
Degree Awarded: Doctor of Philosophy (PhD), University of Iowa
Degree in: Psychological and Quantitative Foundations (Educational Measurement and Statistics)
Date degree season: Summer 2024
Publisher: University of Iowa
DOI: 10.25820/etd.007616
Number of pages: xiii, 127 pages
Language: English
Date submitted: 07/15/2024
Description illustrations: illustrations, tables, graphs
Description bibliographic: Includes bibliographical references (pages 108-121).
Public Abstract (ETD): Adaptive testing has become increasingly popular in recent years, allowing for more efficient and precise assessments of examinees' abilities. However, the incorporation of polytomous items, such as constructed-response (CR) questions, presents challenges for adaptive testing, particularly when these items cannot be scored immediately. This study investigates the impact of scoring CR items later on various aspects of mixed-format adaptive testing. Using the shadow test approach, this study explored the effects of scoring CR items later on measurement precision, classification accuracy, and test security across different adaptive test designs, test lengths, and numbers and locations of CR items. The findings suggest that while scoring CR items later has a negative impact on these outcomes, the magnitude of the impact is generally small. For measurement precision, scoring CR items later led to slightly higher measurement error in ability estimation, particularly for shorter tests with more polytomous items. Classification accuracy was minimally affected: classification rates when CR items were scored during the test and when they were scored after the test rarely differed by more than 1%. Test security indices, such as maximum item exposure rate and pairwise test overlap rate, were also slightly higher under the scoring later scenario, but the practical significance of these differences may be limited. The results have important implications for operational testing programs considering the use of mixed-format adaptive testing. While the impact of scoring later on measurement precision, classification accuracy, and test security appears to be small, test developers should still carefully consider factors such as test length, the number and location of CR items, and the choice of adaptive test design when implementing these assessments.
By understanding the potential impact of scoring later and taking action to lessen that impact, testing programs can improve the validity, reliability, and fairness of their assessments. This study contributes to the expanding body of research on mixed-format adaptive testing and offers valuable insights for both practitioners and researchers. As the use of Artificial Intelligence (AI) scoring on CR items continues to grow, it is crucial to understand the implications of scoring CR items later and to develop strategies for optimizing mixed-format adaptive assessments in real-world contexts.
Academic Unit: Psychological and Quantitative Foundations
Record Identifier: 9984698152402771
