Robust Text Similarity and Its Applications for the LSAT (RR 13-04)

Text similarity measurement provides a rich source of information and is increasingly used in the development of new educational and psychological applications. However, because educational and psychological testing is high-stakes, a text similarity measure must be stable (or robust) in the face of uncertainty in the data. This requirement motivated the present research. First, multiple sources of uncertainty that may affect the computation of semantic similarity between two texts are enumerated. Second, a method for obtaining a robust text similarity measure is proposed and then evaluated on data from the Law School Admission Test (LSAT). While further evaluation of the proposed method is warranted, the preliminary results were promising.

