The Positive Matching Index: A new similarity measure with optimal characteristics
Authors: Daniel Andrés Dos Santos, Reena Deutsch
Keywords: Binary data, Association coefficient, Jaccard index, Dice index, Similarity
Abstract:
Despite the many coefficients that quantify the resemblance between pairs of objects based on presence/absence data, no single measure shows optimal characteristics. In this work the Positive Matching Index (PMI) is proposed as a new measure of similarity between lists of attributes. PMI fulfills Tulloss's theoretical prerequisites for similarity coefficients, is easy to calculate, and has an intrinsic meaning that can be expressed in natural language. PMI is bounded between 0 and 1 and represents the mean proportion of positive matches relative to the size of the attribute lists, with this cardinality ranging continuously from the size of the smaller list to that of the larger one. PMI behaves correctly where alternative indices either fail or only approximate the desirable properties of a similarity index. Empirical examples from biomedical research are provided to demonstrate the performance of PMI relative to standard indices such as the Jaccard and Dice coefficients.
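The verbal definition in the abstract can be turned into a small calculation. The sketch below is one reading of that definition, not a formula quoted from the paper: with a positive matches and list sizes m <= n, averaging a/x as x runs continuously from m to n gives a * ln(n/m) / (n - m), which reduces to a/m when the two lists have the same size. The function names and the handling of empty lists are assumptions made for illustration; Jaccard and Dice are included for comparison.

```python
import math


def jaccard(x, y):
    """Jaccard index: shared attributes over the union of attributes."""
    x, y = set(x), set(y)
    union = len(x | y)
    return len(x & y) / union if union else 0.0


def dice(x, y):
    """Dice index: twice the shared attributes over the summed list sizes."""
    x, y = set(x), set(y)
    total = len(x) + len(y)
    return 2 * len(x & y) / total if total else 0.0


def pmi(x, y):
    """Positive Matching Index, read from the abstract's verbal definition.

    Mean of a / size as size ranges continuously from the smaller list
    size to the larger one, where a is the number of positive matches.
    """
    x, y = set(x), set(y)
    a = len(x & y)
    small, large = sorted((len(x), len(y)))
    if small == 0:
        # Assumption: an empty list has no positive matches, so return 0.
        return 0.0
    if small == large:
        # Equal sizes: the averaging interval collapses to a single point.
        return a / small
    # Mean of a/size over [small, large] = a * ln(large/small) / (large - small).
    return a * math.log(large / small) / (large - small)


if __name__ == "__main__":
    first = {"a", "b"}
    second = {"a", "b", "c", "d"}
    print(jaccard(first, second), dice(first, second), pmi(first, second))
```

Under this reading, PMI always falls between the proportion of matches in the larger list and the proportion in the smaller list, which is why it can separate cases of nested lists that Jaccard and Dice score by a single fixed denominator.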
An index for people who don’t think a single measure for identity (URIs) is enough, say those in the natural sciences?