From the website:
DataCleaner is an Open Source application for analyzing, profiling, transforming and cleansing data. These activities help you administer and monitor your data quality. High quality data is key to making data useful and applicable to any modern business.
DataCleaner is the free alternative to software for master data management (MDM) methodologies, data warehousing (DW) projects, statistical research, preparation for extract-transform-load (ETL) activities and more.
Err, “…cleansing data.”? Did someone just call topic maps' name? 😉
If eliminating duplicate data is important, then everyone who was using the duplicated data needs the updates made to it and the relationships attached to it. The exception is duplication that was merely the result of poor design or was just wasting drive space.
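To make that point concrete, here is a minimal sketch in Python. It is not DataCleaner's API or any topic map engine; the record layout (`id`, `name`, `links`) and the function `merge_duplicates` are hypothetical, invented for illustration. Duplicates are merged into one record that keeps every relationship, and a redirect table lets anyone holding an old identifier find the merged record.

```python
from collections import defaultdict


def merge_duplicates(records, key):
    """Merge records that share a key (hypothetical record layout).

    Returns (merged, redirects): merged maps a canonical id to the combined
    record, redirects maps every original id to its canonical id.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec)

    merged, redirects = {}, {}
    for group in groups.values():
        canonical_id = group[0]["id"]  # first record seen becomes canonical
        combined = {"id": canonical_id, "name": group[0]["name"], "links": set()}
        for rec in group:
            combined["links"].update(rec["links"])  # keep every relationship
            redirects[rec["id"]] = canonical_id     # old ids stay resolvable
        merged[canonical_id] = combined
    return merged, redirects


records = [
    {"id": 1, "name": "Acme Corp", "links": ["order-7"]},
    {"id": 2, "name": "acme corp ", "links": ["invoice-3"]},
    {"id": 3, "name": "Globex", "links": []},
]
# Assumption: two records are "the same" if their normalized names match.
merged, redirects = merge_duplicates(records, key=lambda r: r["name"].strip().lower())
```

A user who only knew record 2 can follow `redirects[2]` to the merged record and see both the order and the invoice, which is the kind of outcome merging in a topic map is meant to give.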
This looks like an interesting project, and certainly one where topic maps are clearly relevant as one possible output.