Communicating and resolving entity references by R.V. Guha.


Statements about entities occur everywhere, from newspapers and web pages to structured databases. Correlating references to entities across systems that use different identifiers or names for them is a widespread problem. In this paper, we show how shared knowledge between systems can be used to solve this problem. We present “reference by description”, a formal model for resolving references. We provide some results on the conditions under which a randomly chosen entity in one system can, with high probability, be mapped to the same entity in a different system.

An eye appointment is going to prevent me from reading this paper closely today.

From a quick scan, do you think Guha is making a distinction between entities and subjects (in the topic map sense)?

What do you make of literals having no identity beyond their encoding? (page 4, #3)

Redundant descriptions? (page 7) Would defining a set of properties that must match qualify as a redundant description? (Or even just additional subject indicators?)
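To make that last question concrete, here is a rough sketch of what I mean by a set of properties that must match. It is my own illustration, not Guha's formal model: the `resolve` function, the property names, and the three records are all hypothetical. A description is a set of property/value pairs, and it resolves only if exactly one entity on the receiving side satisfies every pair.

```python
def resolve(description, entities):
    """Return the ids of all entities whose properties include every pair in description."""
    return [
        entity_id
        for entity_id, properties in entities.items()
        if all(properties.get(key) == value for key, value in description.items())
    ]


# Hypothetical records held by the receiving system (made up for illustration).
entities = {
    "e1": {"name": "John Smith", "city": "Atlanta", "employer": "Acme"},
    "e2": {"name": "John Smith", "city": "Boston", "employer": "Acme"},
    "e3": {"name": "Jane Doe", "city": "Atlanta", "employer": "Initech"},
}

# A name alone matches two records, so the reference is ambiguous.
ambiguous = resolve({"name": "John Smith"}, entities)

# Adding one more property narrows the match to a single record.
unique = resolve({"name": "John Smith", "city": "Atlanta"}, entities)

# A third property changes nothing here; in that sense it is redundant.
redundant = resolve(
    {"name": "John Smith", "city": "Atlanta", "employer": "Acme"}, entities
)

print(ambiguous, unique, redundant)
```

In this toy case the third property adds nothing, but it would still pick out the right record if the receiver lacked or disagreed on one of the other two and the matching rule were relaxed to allow that. Whether that is what the paper means by redundancy is exactly my question.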

Expect to see a lot more comments on this paper.


I first saw this in a tweet by Stefano Bertolo.

