The Positive Matching Index: A new similarity measure with optimal characteristics

Thursday, November 18th, 2010

The Positive Matching Index: A new similarity measure with optimal characteristics Authors: Daniel Andrés Dos Santos, Reena Deutsch Keywords: Binary data, Association coefficient, Jaccard index, Dice index, Similarity


Despite the many coefficients accounting for the resemblance between pairs of objects based on presence/absence data, no single measure shows optimal characteristics. In this work the Positive Matching Index (PMI) is proposed as a new measure of similarity between lists of attributes. PMI fulfills Tulloss’ theoretical prerequisites for similarity coefficients, is easy to calculate, and has an intrinsic meaning expressible in natural language. PMI is bounded between 0 and 1 and represents the mean proportion of positive matches relative to the size of the attribute lists, with this cardinality ranging continuously from the smaller list to the larger one. PMI behaves correctly where alternative indices either fail or only approximate the desirable properties of a similarity index. Empirical examples from biomedical research are provided to show the performance of PMI relative to standard indices such as the Jaccard and Dice coefficients.
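
To make the comparison concrete, here is a minimal Python sketch of the three indices for two attribute lists. The Jaccard and Dice formulas are the standard ones. The `pmi` function is only one reading of the abstract's description: it takes the mean of a/x as the list size x ranges continuously from the smaller list to the larger, which works out to a · ln(large/small) / (large − small). It has not been checked against the paper's own formula. The function names, and the choice to return 0 when the lists share nothing, are assumptions made for illustration.

```python
import math


def contingency(x, y):
    """Return (a, b, c): shared attributes, those only in x, those only in y."""
    x, y = set(x), set(y)
    return len(x & y), len(x - y), len(y - x)


def jaccard(x, y):
    """Standard Jaccard index: a / (a + b + c)."""
    a, b, c = contingency(x, y)
    total = a + b + c
    return a / total if total else 0.0  # assumption: two empty lists score 0


def dice(x, y):
    """Standard Dice index: 2a / (2a + b + c)."""
    a, b, c = contingency(x, y)
    total = 2 * a + b + c
    return 2 * a / total if total else 0.0  # assumption: two empty lists score 0


def pmi(x, y):
    """Mean of a/size as size runs continuously from the smaller list to the larger.

    This is an interpretation of the abstract's wording, not a verified
    transcription of the paper's formula.
    """
    a, b, c = contingency(x, y)
    if a == 0:
        return 0.0  # no positive matches (assumed convention)
    small = a + min(b, c)  # size of the smaller list
    large = a + max(b, c)  # size of the larger list
    if small == large:
        return a / small  # the averaging interval collapses to a single size
    return a * math.log(large / small) / (large - small)


if __name__ == "__main__":
    x, y = ["a", "b", "c", "d"], ["c", "d", "e"]
    print(jaccard(x, y), dice(x, y), pmi(x, y))
```

Under this reading, two lists of equal size get the same value from PMI and Dice (both reduce to a divided by the list size), and the values diverge as the list sizes become more unequal.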

An index for people who don’t think a single measure for identity (URIs) is enough, say those in the natural sciences?


Saturday, September 18th, 2010

SimMetrics. An extensible Java library of thirty (30) distance or similarity measures.

76 Binary Similarity and Distance Measures

Saturday, September 11th, 2010

A Survey of Binary Similarity and Distance Measures Authors: Seung-Seok Choi, Sung-Hyuk Cha, Charles C. Tappert Keywords: binary similarity measure, binary distance measure, hierarchical clustering, classification, operational taxonomic unit. (Journal of Systemics, Cybernetics and Informatics, Vol. 8, No. 1, pp. 43-48, 2010)