The Use of Person-Fit Scores in High-Stakes Educational Testing: How to Use Them and What They Tell Us (RR 14-03)
Several statistics used to detect inconsistent patterns of correct/incorrect answers to test questions (items) were evaluated using data from one Analytical Reasoning (AR) section and one Logical Reasoning (LR) section of the Law School Admission Test. Item score patterns were also compared across gender and racial/ethnic subgroups. Test takers who were consistently flagged by all of the statistics evaluated, on both the AR and the LR sections, had relatively low scores, which may have been the result of extensive guessing. Gender group comparisons showed no differences in inconsistent test-taking behavior between male and female test takers. However, we did find significant differences in item score patterns for one racial/ethnic subgroup compared to the other subgroups; this subgroup has a large proportion of test takers whose first language is not English. We conclude that the indices evaluated provide useful information that may be used to routinely monitor test-taking behavior and to enhance the interpretation of test scores.
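This summary does not name the specific statistics evaluated. Purely as an illustration of what an "inconsistent pattern of correct/incorrect answers" can mean, the sketch below counts Guttman errors, a simple nonparametric person-fit index: the number of item pairs in which a test taker misses the easier item but answers the harder one correctly. The function names and the toy data are hypothetical and are not taken from the report, and the report's own indices may differ.

```python
from typing import List


def item_proportions_correct(data: List[List[int]]) -> List[float]:
    """Proportion of test takers answering each item correctly (item easiness)."""
    n_persons = len(data)
    n_items = len(data[0])
    return [sum(row[j] for row in data) / n_persons for j in range(n_items)]


def guttman_errors(pattern: List[int], p_values: List[float]) -> int:
    """Count item pairs where the easier item is wrong and the harder one is right.

    pattern  : 0/1 item scores for one test taker
    p_values : proportion correct per item, in the same item order as pattern
    """
    # Order items from easiest (highest proportion correct) to hardest.
    order = sorted(range(len(pattern)), key=lambda j: -p_values[j])
    ordered = [pattern[j] for j in order]

    errors = 0
    wrong_so_far = 0  # easier items answered incorrectly so far
    for score in ordered:
        if score == 1:
            # Each earlier (easier) miss forms one Guttman error with this hit.
            errors += wrong_so_far
        else:
            wrong_so_far += 1
    return errors


if __name__ == "__main__":
    # Hypothetical item easiness values, easiest item first.
    p = [0.9, 0.7, 0.5, 0.3]
    print(guttman_errors([1, 1, 0, 0], p))  # consistent pattern
    print(guttman_errors([0, 0, 1, 1], p))  # maximally inconsistent pattern
```

A high count relative to other test takers with the same total score suggests an aberrant pattern, such as one produced by extensive guessing. In practice a flagging threshold would need to be chosen and justified separately.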