Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

June 6, 2014

A Methodology for Empirical Analysis of LOD Datasets

Filed under: Bioinformatics,Biomedical,LOD — Patrick Durusau @ 6:52 pm

A Methodology for Empirical Analysis of LOD Datasets by Vit Novacek.

Abstract:

CoCoE stands for Complexity, Coherence and Entropy, and presents an extensible methodology for empirical analysis of Linked Open Data (i.e., RDF graphs). CoCoE can offer answers to questions like: Is dataset A better than B for knowledge discovery since it is more complex and informative?, Is dataset X better than Y for simple value lookups due to its flatter structure?, etc. In order to address such questions, we introduce a set of well-founded measures based on complementary notions from distributional semantics, network analysis and information theory. These measures are part of a specific implementation of the CoCoE methodology that is available for download. Last but not least, we illustrate CoCoE by its application to selected biomedical RDF datasets. (emphasis in original)

A deeply interesting work on the formal characteristics of LOD datasets, but as we learned in Community detection in networks:…, a relationship between a topology (another formal characteristic) and some hidden fact(s) may or may not exist.

Or to put it another way, formal characteristics are useful for a rough evaluation of data sets, but they cannot replace a grounded actor considering what the data means. That would be you.

I first saw this in a tweet by Marin Dimitrov.
