Likelihood-based Statistics for Continuous and Discrete Responses With a Structure for the Item and Person Parameters (RR 06-06)
by Cees A. W. Glas and Wim J. van der Linden, University of Twente, Enschede, The Netherlands
Executive Summary
Likelihood-based statistical tests such as Lagrange multiplier (LM) tests are useful for testing the validity of a model against alternative models. In the current project, we used LM tests to assess the validity of a hierarchical model for speed and accuracy on test items (questions). More specifically, the model was extended to allow for statistical tests of the assumption of subpopulation invariance. When this assumption is not met, test takers of equal ability from different subpopulations (e.g., male and female test takers) do not have equal probabilities of answering an item correctly; this is sometimes called differential item functioning (DIF).
The novel aspect of the two statistical tests developed in this project is that they allow us to check the subpopulation invariance assumption using both the test takers' responses to the items and their response times (RTs). This feature matters because a test item may show no differential functioning in its probability of a correct response yet require different amounts of time from test takers in different subpopulations, or vice versa.
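To make the two kinds of invariance concrete, the following is a minimal sketch of one way group-specific shifts could enter a joint model for responses and RTs. The parameterization and every symbol in it are illustrative assumptions, not the specification used in the report: θ_i and τ_i denote the ability and speed of test taker i, a_j and b_j the discrimination and difficulty of item j, β_j and α_j its time intensity and RT precision, and y_i a group indicator.

```latex
% Illustrative parameterization only; all symbols are assumptions, not taken from the report.
% y_i = 1 if test taker i belongs to the focal subpopulation, 0 otherwise.
\begin{align*}
  \Pr(U_{ij} = 1 \mid \theta_i) &= \Phi\bigl(a_j(\theta_i - b_j - \delta_j y_i)\bigr), \\
  \ln T_{ij} \mid \tau_i &\sim \mathcal{N}\bigl(\beta_j + \gamma_j y_i - \tau_i,\ \alpha_j^{-2}\bigr), \\
  H_0 &: \delta_j = 0 \ \text{(responses)}, \qquad H_0 : \gamma_j = 0 \ \text{(RTs)}.
\end{align*}
```

Under this sketch, an item with δ_j = 0 but γ_j ≠ 0 is equally difficult for both subpopulations but takes one of them longer, which is the situation a response-only DIF check would miss.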
The two statistical tests were evaluated in a computer simulation study. For test lengths in the range of 10–40 items, both tests were shown to realize their nominal Type I (i.e., false positive) error rates. In addition, they appeared to have good power (i.e., true positive rates) to detect cases of DIF. We also studied two simplified versions of the tests that were easier to calculate. However, these versions had inflated Type I error rates and, as a result, showed less power to detect DIF.
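The logic of such a simulation study can be sketched in a few lines of Python. This is a hedged illustration, not the study's design: a simple two-sample z test on simulated log response times stands in for the LM tests, and the function names, sample sizes, shift of 0.5, and number of replications are all hypothetical choices. The point is only how empirical Type I error rates and power are estimated as rejection rates over many replications.

```python
import math
import random
import statistics


def two_sample_z_pvalue(x, y):
    """Two-sided p-value of a large-sample z test for equal means."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    z = (statistics.fmean(x) - statistics.fmean(y)) / se
    # Two-sided standard normal tail probability via the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


def rejection_rate(shift, n_per_group=100, n_reps=1000, alpha=0.05, seed=1):
    """Share of replications in which the stand-in test rejects at level alpha.

    shift = 0 simulates data without DIF, so the result estimates the
    Type I error rate; shift != 0 simulates DIF, so it estimates power.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        # Hypothetical log response times on one item for two subpopulations.
        reference = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        focal = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        if two_sample_z_pvalue(reference, focal) < alpha:
            rejections += 1
    return rejections / n_reps


type_i = rejection_rate(shift=0.0)  # should land near the nominal alpha
power = rejection_rate(shift=0.5)   # rejection rate when DIF is present
print(f"empirical Type I error rate: {type_i:.3f}")
print(f"empirical power at shift 0.5: {power:.3f}")
```

A test realizes its nominal Type I error rate when the first estimate stays close to alpha. A rate clearly above alpha signals an inflated Type I error rate, so rejections by that test are less trustworthy.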