The impact of item parameter drift on the precision of online calibration in computer adaptive test
Details
- Title
- The impact of item parameter drift on the precision of online calibration in computer adaptive test
- Creators
- Tianpeng Ye
- Contributors
- Brandon LeBeau (Advisor); Deborah Harris (Committee Member); Catherine Welch (Committee Member); Stephen Dunbar (Committee Member); Knute Carter (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Psychological and Quantitative Foundations (Educational Measurement and Statistics)
- Degree date
- Autumn 2022
- DOI
- 10.25820/etd.006668
- Publisher
- University of Iowa
- Number of pages
- ix, 108 pages
- Copyright
- Copyright 2022 Tianpeng Ye
- Language
- English
- Illustrations
- illustrations, graphs, tables
- Bibliography
- Includes bibliographical references (pages 84-96).
- Public Abstract (ETD)
With the rapid advancement of technology, the traditional paper-and-pencil testing format has gradually been replaced by computerized adaptive testing in K-12 and continuing education. In computerized adaptive testing, each examinee is tested and scored on a different test form tailored to that examinee. To release test scores to examinees immediately, the psychometric statistics of each question must be obtained before the test is administered. In addition, newly developed questions that need piloting may be embedded in the adaptive test so that their psychometric statistics can be obtained and used in future tests. Potential changes, or drifts, in the statistics of existing questions could therefore lead to inaccurate test scores as well as inaccurate statistics for new questions.
This study investigated the impact of such drift on the accuracy of the statistics of new items across five factors. In particular, it compared two methods for obtaining the statistics of new questions: the random method and the D-Tp method. The results showed that each method has its own advantage depending on which type of drift occurred in the statistics of old questions. If the drift affected the most difficult questions, the D-Tp method was preferred; under the other studied conditions, the random method was preferred. A testing program is therefore advised to stay aware of what types of drift the statistics of its existing questions may be undergoing and to select the method for calibrating new items accordingly.
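As a minimal sketch of the mechanism the abstract describes (not the study's actual simulation design), the example below assumes a two-parameter logistic (2PL) item response model and shows how drift in an item's difficulty parameter changes the probability of a correct response that the scoring engine assumes. The parameter values here are hypothetical; the point is only that scoring with pre-drift statistics misstates the true response probability, which is how drift biases both scores and the calibration of new items.

```python
import math

def p_correct(theta, a, b):
    """2PL IRT model: probability that an examinee with ability theta
    answers an item with discrimination a and difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical operational item: the scoring engine uses its stored,
# pre-drift parameters (a = 1.2, b = 0.5).
a, b = 1.2, 0.5
theta = 0.5  # examinee ability on the same scale as b

p_stored = p_correct(theta, a, b)         # probability the scoring engine assumes
p_drifted = p_correct(theta, a, b + 0.4)  # item has actually drifted harder

print(f"assumed probability: {p_stored:.3f}")
print(f"actual probability:  {p_drifted:.3f}")
```

Because the item has become harder than its stored difficulty indicates, the actual probability of a correct response is lower than the one used for scoring, so examinee ability estimates, and in turn the statistics calibrated for new items seeded into the test, are pulled away from their true values.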
- Academic Unit
- Psychological and Quantitative Foundations
- Record Identifier
- 9984362958302771