elasticsearch-entity-resolution by Yann Barraud.
From the webpage:
This project is an interactive entity resolution plugin for Elasticsearch based on Duke. Basically, it uses Bayesian probabilities to compute probability. You can pretty much use it as an interactive deduplication engine.
It is usable as is, though cleaners are not yet implemented. To understand basics, go to Duke project documentation.
A list of available comparators is available here.
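To make "uses Bayesian probabilities" a little more concrete, here is a minimal sketch of how per-field match probabilities can be folded into one overall score using the standard naive Bayes combination. This is not the plugin's code: the function names and the sample probabilities are hypothetical, and the assumption that each comparator yields one probability per field is mine, not something the webpage states.

```python
from typing import Iterable


def combine(p1: float, p2: float) -> float:
    """Combine two independent match probabilities with Bayes' rule.

    Undefined (division by zero) when one input is exactly 1.0 and the
    other exactly 0.0, so callers should keep inputs strictly inside (0, 1).
    """
    both_match = p1 * p2
    both_differ = (1.0 - p1) * (1.0 - p2)
    return both_match / (both_match + both_differ)


def match_probability(field_probs: Iterable[float]) -> float:
    """Fold per-field probabilities into a single score.

    Starts from 0.5 (no evidence either way) and updates once per field.
    """
    prob = 0.5
    for p in field_probs:
        prob = combine(prob, p)
    return prob


if __name__ == "__main__":
    # Hypothetical per-field probabilities for two candidate records:
    # names look very similar (0.9), addresses look fairly different (0.3).
    print(match_probability([0.9, 0.3]))
```

A field probability above 0.5 pushes the score up, one below 0.5 pulls it down, and 0.5 leaves it unchanged, which is why agreeing fields reinforce each other quickly.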
Interactive deduplication? Now that sounds very useful for topic map authoring.
Appropriate that I saw this in a Tweet by Duke’s author, Lars Marius Garshol.