IRT Modeling

The Effect of Item and Person Misfit on Selection Decisions: An Empirical Study (RR 15-05)

by Rob R. Meijer and Jorge N. Tendeiro, University of Groningen, Groningen, the Netherlands

Item response theory (IRT) is a mathematical modeling framework that is often applied in the development and analysis of educational and psychological assessments. Various IRT models exist, and practitioners must choose the model that is most appropriate for their particular assessment. Even when the most appropriate model is applied, the fit of the assessment data to the model is rarely perfect in practice. How serious, then, is model misfit for practical decision-making? In this study we analyzed two empirical datasets to investigate how removing misfitting items and misfitting item score patterns affects the rank order of test takers according to their proficiency scores. Results for two different IRT models were compared. We found that the impact of removing misfitting items and item score patterns varied depending on the IRT model applied. The impact was more serious when a small to moderate percentage of test takers was selected from the group. When a larger percentage was selected, misfit was not important.
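To make the idea of a selection decision concrete, the following minimal Python sketch compares who would be selected under two sets of proficiency estimates at several selection ratios. It is an illustration only, not the procedure used in the report: the function names (top_selected, selection_agreement), the agreement measure, and the score values are all hypothetical and chosen for this example.

```python
def top_selected(scores, ratio):
    """Return the indices of the test takers selected at the given selection ratio."""
    n_select = max(1, round(len(scores) * ratio))
    # Rank test takers from highest to lowest score and keep the top n_select.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_select])


def selection_agreement(scores_a, scores_b, ratio):
    """Share of test takers selected under scores_a who are also selected under scores_b."""
    selected_a = top_selected(scores_a, ratio)
    selected_b = top_selected(scores_b, ratio)
    return len(selected_a & selected_b) / len(selected_a)


# Hypothetical proficiency estimates for ten test takers (not data from the report):
# one set from the full test, one after removing some items.
full_test = [1.9, 1.4, 1.1, 0.8, 0.5, 0.2, -0.1, -0.6, -1.0, -1.7]
reduced_test = [1.2, 1.6, 1.5, 0.4, 0.7, 0.1, -0.3, -0.4, -1.2, -1.5]

for ratio in (0.2, 0.5, 0.8):
    print(ratio, selection_agreement(full_test, reduced_test, ratio))
```

With these made-up values, the two score sets disagree about one of the two test takers selected at a ratio of 0.2 but select the same people at ratios of 0.5 and 0.8, because small changes in rank order matter most near a strict cutoff.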


Additional reports in this collection

How Serious Is IRT Misfit for Practical Decision-Making?...

Item response theory (IRT) is a mathematical modeling framework used to support the development, analysis, and scoring of tests and questionnaires. For example, IRT allows for the description of item (i.e., question) characteristics, such as difficulty, as well as the proficiency level of test takers. Various IRT models are available, and choosing the most appropriate model for a particular test is essential. Because the fit of the test data to the chosen model is never perfect, measuring that fit is imperative.
