Comparison Study of Item Preknowledge Detectors (RR 14-01)
Item preknowledge occurs when a test taker has prior knowledge of an administered test question (item); such a test taker is called aberrant, and the item is called compromised. Item preknowledge harms the testing program and its test score users (universities, companies, government organizations) because the scores produced for aberrant test takers are invalid. The performance of eight statistics for detecting item preknowledge (five existing, two modified, and one new) was studied via computer simulations. Three major factors that could influence the performance of the statistics were considered: (a) the type of test (adaptive, in which the next administered item is selected based on the test taker’s responses to previously administered items; or nonadaptive, in which all test takers are administered the same group of items); (b) the distribution of the aberrant population (normal or uniform); and (c) noise in the information about compromised items (since different groups of aberrant test takers may have prior knowledge of different groups of items). The last factor had the strongest negative impact on the performance of all of the statistics: the greater the noise, the lower the detection of item preknowledge. Several methods to address this problem are discussed.
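The effect of factor (c) can be illustrated with a minimal sketch. It is not one of the eight statistics studied in the report. It assumes a Rasch (one-parameter logistic) response model, equal item difficulties, and a simple "score gap" (expected proportion correct on the suspected items minus that on the remaining items). The names `rasch_p`, `expected_gap`, `noisy_suspected`, and `p_known` are hypothetical and chosen only for this illustration.

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def expected_gap(theta, difficulties, compromised, suspected, p_known=0.95):
    """Expected proportion correct on suspected items minus that on the rest.

    Assumption: the test taker answers items in `compromised` correctly with
    probability p_known and all other items according to the Rasch model.
    """
    def p(i):
        return p_known if i in compromised else rasch_p(theta, difficulties[i])
    items = range(len(difficulties))
    inside = [p(i) for i in items if i in suspected]
    outside = [p(i) for i in items if i not in suspected]
    return sum(inside) / len(inside) - sum(outside) / len(outside)

def noisy_suspected(compromised, n_items, n_wrong):
    """Replace n_wrong truly compromised items in the suspected set with
    uncompromised ones (chosen deterministically, for reproducibility)."""
    comp = sorted(compromised)
    clean = [i for i in range(n_items) if i not in compromised]
    return set(comp[n_wrong:]) | set(clean[:n_wrong])

n_items = 40
difficulties = [0.0] * n_items   # assumption: all items equally difficult
compromised = set(range(10))     # items the aberrant test taker actually knows
theta = -1.0                     # assumption: a low-ability aberrant test taker

# Gap for the aberrant test taker when 0, 5, or 10 of the 10 suspected
# items are wrongly identified.
gaps = [
    expected_gap(theta, difficulties, compromised,
                 noisy_suspected(compromised, n_items, k))
    for k in (0, 5, 10)
]

# Gap for a non-aberrant test taker of the same ability (knows no items).
honest_gap = expected_gap(theta, difficulties, set(), compromised)
```

Under these assumptions the gap for the aberrant test taker shrinks as more of the suspected items are wrong, moving toward (and past) the zero gap of a non-aberrant test taker, which is one simple way to see why noisier information about compromised items would make preknowledge harder to detect.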