Detecting Semantic Overlap and Discovering Precedents…

Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

July 8, 2013

Detecting Semantic Overlap and Discovering Precedents…

Filed under: Diversity,Searching,Semantic Diversity,Semantic Inconsistency,Semantic Web,Semantics — Patrick Durusau @ 11:33 am

Detecting Semantic Overlap and Discovering Precedents in the Biodiversity Research Literature by Graeme Hirst, Nadia Talenty, and Sara Scharfz.

Abstract:

Scientiﬁc literature on biodiversity is longevous, but even when legacy publications are available online, researchers often fail to search it adequately or effectively for prior publications; consequently, new research may replicate, or fail to adequately take into account, previously published research. The mechanisms of the Semantic Web and methods developed in contemporary research in natural language processing could be used, in the near-term future, as the basis for a precedent-ﬁnding system that would take the text of an author’s early draft (or a submitted manuscript) and ﬁnd potentially related ideas in published work. Methods would include text-similarity metrics that take different terminologies, synonymy, paraphrase, discourse relations, and structure of argumentation into account.

Footnote one (1) of the paper gives an idea of the problem the authors face:

Natural history scientists work in fragmented, highly distributed and parochial communities, each with domain speciﬁc requirements and methodologies [Scoble 2008]. Their output is heterogeneous, high volume and typically of low impact, but with a citation half-life that may run into centuries” (Smith et al. 2009). “The cited half-life of publications in taxonomy is longer than in any other scientiﬁc discipline, and the decay rate is longer than in any scientiﬁc discipline” (Moritz 2005). Unfortunately, we have been unable to identify the study that is the basis for Moritz’s remark.

The paper explores in detail issues that have daunted various search techniques, when the material is available in electronic format at all.

The authors make a general proposal for addressing these issues, with mention of the Semantic Web but omit from their plan:

The other omission is semantic interpretation into a logical form, represented in XML, that draws on ontologies in the style of the original Berners-Lee, Hendler, and Lassila (2001) proposal for the Semantic Web. The problem with logical-form representation is that it implies a degree of precision in meaning that is not appropriate for the kind of matching we are proposing here. This is not to say that logical forms would be useless. On the contrary, they are employed by some approaches to paraphrase and textual entailment (section 4.1 above) and hence might appear in the system if only for that reason; but even so, they would form only one component of a broader and somewhat looser kind of semantic representation.

That’s the problem with the Semantic Web in a nutshell:

The problem with logical-form representation is that it implies a degree of precision in meaning that is not appropriate for the kind of matching we are proposing here.

Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

July 8, 2013

Detecting Semantic Overlap and Discovering Precedents…

No Comments