Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 19, 2010

All Identifiers, All The Time – LOD As An Answer?

Filed under: Linked Data,LOD,RDA,Semantic Web,Subject Identity — Patrick Durusau @ 6:25 am

I am still musing over Thomas Neidhart’s comment:

To understand this identifier you would need implicit knowledge about the structure and nature of every possible identifier system in existence, and then you still do not know who has more information about it.

Aside from questions of universal identifier systems failing without exception in the past, which makes one wonder why this system should succeed, there are other questions.

Such as why would any system need to encounter every possible identifier system in existence?

That is, the LOD effort has set up a strawman (apologies for the sexism) that it then proceeds to blow down.

If a subject has multiple identifiers in a set and my system recognizes only one out of three, what harm has come of the subject having the other two identifiers?

There is no processing overhead since by admission the system does not recognize the other identifiers so it doesn’t process them.

The advantage being that some other system may recognize the subject on the basis of the other identifiers.
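A minimal sketch of the point, assuming identifiers are plain strings and "recognition" is nothing more than set membership. The identifier values and the two systems are invented for illustration:

```python
def recognizes(subject_identifiers, known_identifiers):
    """Return the identifiers of a subject that a given system recognizes."""
    return set(subject_identifiers) & set(known_identifiers)

# One subject carrying three identifiers (all hypothetical).
subject = {
    "http://example.org/id/linked-data",
    "LOD",
    "Linked Data",
}

system_a = {"Linked Data"}            # knows only the category name
system_b = {"LOD", "urn:example:x"}   # knows only the abbreviation

matched_a = recognizes(subject, system_a)
matched_b = recognizes(subject, system_b)
```

Each system finds the subject by the one identifier it knows and never touches the other two.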

This post is a good example of that practice.

I had a category “Linked Data,” but I added a category this morning, “LOD,” just in case people search for it that way.

Why shouldn’t our computers adapt to how we use identifiers (multiple ones for the same subjects) rather than our attempting (and failing) to adapt to universal identifiers to make it easy for our computers?

November 18, 2010

A Direct Mapping of Relational Data to RDF

Filed under: Ambiguity,RDF,Semantic Web,Subject Identity — Patrick Durusau @ 7:15 pm

A Direct Mapping of Relational Data to RDF

A major step towards putting relational data “on the web.”

Identifying what that data means and providing a basis for reconciling it with other data remains to be addressed.
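To make the gap concrete, here is a rough sketch of what a direct mapping does to a single row. The IRI layout (base/table/key=value for the row, base/table#column for each column) is an assumption for illustration and should not be read as the exact scheme in the W3C draft:

```python
def direct_map(base, table, primary_key, row):
    """Turn one relational row into (subject, predicate, object) triples.

    Assumed, illustrative IRI layout: base/table/key=value for the row
    and base/table#column for each column.
    """
    subject = f"{base}{table}/{primary_key}={row[primary_key]}"
    triples = [(subject, "rdf:type", f"{base}{table}")]
    for column, value in row.items():
        if value is None:
            continue  # NULL columns produce no triple
        triples.append((subject, f"{base}{table}#{column}", value))
    return triples

# Hypothetical table and row.
row = {"id": 7, "name": "John Smith", "born": None}
triples = direct_map("http://example.org/db/", "people", "id", row)
```

The output is valid triples, but nothing in them says what "name" means or how it relates to a "full_name" column in someone else's database.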

URIs and Identity

Filed under: Ambiguity,RDF,Semantic Web,Subject Identity,Topic Maps — Patrick Durusau @ 6:55 pm

If I read Halpin and others correctly, URIs identify the subjects they identify, except when they identify some other subject and it isn’t possible to know which of any number of subjects is being identified.

That is what I (and others) take as “ambiguity.”

Some readers have taken my comments on URIs to be critical of RDF, which wasn’t my intent.

What I object to is the sentiment that everyone should use only URIs and then cherry pick any RDF graph that may result for identity purposes.

For example, in a family tree, there may be an entry: John Smith.

For which we can create: http://myfamilytree.smith.com/john_smith

That may resolve to an RDF graph but what properties in that graph identify a particular John Smith?

A “uniform” syntax for that “identifier” isn’t helpful if we all reach various conclusions about what properties in the graph to use for identification.

Or if we have different tests to evaluate the values of those properties.

Even with an RDF graph and rules for which properties to evaluate, we may still have ambiguity.

But rules for evaluation of RDF graphs for identity lessen the ambiguity.

All within the context, format, data model of RDF.

It does detract from URIs as identifiers but URIs as identifiers are no more viable than any single token as an identifier.

Sets of key/value pairs, which are made up of tokens, have the potential to lessen ambiguity, but not banish it.
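A small sketch of why the choice of properties and tests matters, assuming each John Smith is just a dictionary of string properties. The property names, values and the case-insensitive comparison are all invented for illustration:

```python
def same_subject(a, b, identifying_keys, normalize=str.casefold):
    """Treat two property sets as the same subject when every identifying
    key is present in both and the normalized values agree."""
    for key in identifying_keys:
        if key not in a or key not in b:
            return False
        if normalize(a[key]) != normalize(b[key]):
            return False
    return True

# Two hypothetical entries from two family trees.
john_1 = {"name": "John Smith", "born": "1850", "birthplace": "Leeds"}
john_2 = {"name": "john smith", "born": "1850", "birthplace": "York"}

by_name_and_year = same_subject(john_1, john_2, ["name", "born"])
by_all_three = same_subject(john_1, john_2, ["name", "born", "birthplace"])
```

Under the first rule the two entries are the same John Smith; under the second they are not. Same data, same syntax, different identity.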

November 16, 2010

In Defense of Ambiguity

Filed under: OWL,RDF,Semantic Web,Subject Identity — Patrick Durusau @ 5:49 pm

In Defense of Ambiguity by Patrick J. Hayes and Harry Halpin was cited in David Booth’s article so, like any academic, I had to go read the cited paper. 😉

Highly recommended.

The authors conclude:

Regardless of the details, the use of any technology in Web architecture to distinguish between access and reference, including our proposed ex:refersTo and ex:describedBy, does nothing more than allow the author of a URI to explain how they would like the URI to be used. Ultimately, there is nothing that Web architecture can do to prevent a URI from being used to refer to some thing non-accessible. However, at least having a clear and coherent device, such as a few RDF predicates, would allow the distinction to be made so the author could give guidance on what they believe best practice for their URI would be. This would vastly improve the situation from where it is today, where this distinction is impossible. The philosophical case for the distinction between reference and access is clear. The main advantage of Web architecture is that there is now a de facto universal identification scheme for accessing networked resources. With the Semantic Web, we can now extend this scheme to the wide world outside the Web by use of reference. By keeping the distinction between reference and access clear, the lemons of ambiguity can be turned into lemonade. Reference is inherently ambiguous, and ambiguity is not an error of communication, but fundamental to the success of communication both on and off the Web.

Sounds like the distinction between subject locators and identifiers that topic maps made long before this paper was written.

Resource Identity and Semantic Extensions: Making Sense of Ambiguity

Filed under: OWL,RDF,Semantic Web,Subject Identity — Patrick Durusau @ 5:29 pm

Resource Identity and Semantic Extensions: Making Sense of Ambiguity, David Booth’s paper, was cited by Bernard Vatant so I had to go take a look.

Bernard says: “The best analysis of the issue I’ve read so far.” I have to agree.

From the paper’s conclusion:

In general, a URI’s resource identity will necessarily be ambiguous. But this is not the end of the world. Rather, it means that while it may be unambiguous enough for one application, another application may require finer distinctions and thus consider it ambiguous. However, this ambiguity of resource identity can be precisely constrained by the use of URI declarations. Finally, a standard process is proposed for determining a URI’s resource identity.

Ambiguity is part and parcel of any system and the real question is how much can you tolerate?

For some systems that is quite a bit; for others (air traffic control comes to mind), as little as possible.

Other identifiers are ambiguous as well.

Successful integration of data across systems depends on how well we deal with that ambiguity.

November 14, 2010

Linked Data Tutorial

Filed under: Linked Data,Semantic Web,Semantics — Patrick Durusau @ 9:36 am

Linked Data Tutorial: A Practical Introduction by Dr. Michael Hausenblas.

A quick overview of “Linked Data” but not so quick that it avoids pointing out some of its issues.

As slide 6 says of the principles: “Many things (deliberately?) kept blurry”

That blurriness carries over into the implementation and semantics of Linked Data.

Linking everything together in a higgledy-piggledy manner will lead to…, I assume, everything being linked together in a higgledy-piggledy manner.

Once linked together perhaps that will drive refinement of the linking into something useful.

Questions:

  1. List examples of the use of Linked Data in libraries. (3-5 pages, citations/links)
  2. How would you use Linked Data in a library? (3-5 pages, no citations)
  3. What would you change about Linked Data practice or standards? (3-5 pages, citations)
  4. Finding aid on Linked Data for librarians. (3-5 pages, citations)

November 13, 2010

LIMES – LInk discovery framework for MEtric Spaces

Filed under: Linked Data,Semantic Web,Software — Patrick Durusau @ 7:46 am

LIMES – LInk discovery framework for MEtric Spaces

From the website:

LIMES is a link discovery framework for the Web of Data. It implements time-efficient approaches for large-scale link discovery based on the characteristics of metric spaces. It is easily configurable via a web interface. It can also be downloaded as standalone tool for carrying out link discovery locally.

LIMES detects “duplicates” in a single source or between sources by use of string metrics.

The current version of LIMES supports exclusively the string metrics Levenshtein, QGrams, BlockDistance and Euclidian as implemented by the SimMetrics library. Further metrics will be included in following versions.

An interesting approach to use as a topic map authoring aid.
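For a feel of what string-metric link discovery looks like, here is a brute-force sketch using Levenshtein distance. It is not how LIMES works internally (the project description stresses metric-space techniques for time efficiency, which this ignores), and the labels and threshold are invented:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]

def candidate_links(source, target, max_distance):
    """Pair labels from two lists whose edit distance is within the threshold."""
    return [(s, t) for s in source for t in target
            if levenshtein(s.lower(), t.lower()) <= max_distance]

# Hypothetical labels from two data sets.
links = candidate_links(["Asthma", "Diabetes"], ["asthma", "Diabetis"], 1)
```

The pairs it returns are candidates only. Whether two similar strings name the same subject is exactly the judgment a topic map author still has to make.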

Questions:

  1. Using the online LIMES interface, develop and run five (5) link discovery requests. Name and save the result files. Upload them to your class project directory. Be prepared to discuss your requests and results in class.
  2. Sign up to be discussion leader for one of the algorithms supported by LIMES. Prepare a two (2) page summary for the class on your algorithm.
  3. What suggestions would you have for the project on its current UI?
  4. Use LIMES to augment your topic map authoring. Comments? (3-5 pages, no citations)
  5. In an actual run, I got the following as owl:sameAs – http://bio2rdf.org/mesh:D016889 and http://data.linkedct.org/page/condition/4398. Your evaluation? You may follow any links you find to make your evaluation. (2-3 pages, include URLs for other locations that you visit)

November 12, 2010

LOD, Semantic Ambiguity and Topic Maps

Filed under: Authoring Topic Maps,Linked Data,Semantic Web,Topic Maps — Patrick Durusau @ 6:23 pm

The semantic ambiguity of linked data has been a hot topic of discussion of late.

Not only of what linked data links to but of linked data itself!

If you have invested a lot in linked data efforts, don’t panic!

Topic maps, even using XTM/CTM syntaxes, to say nothing of more exotic models, can reduce any semantic ambiguity using occurrences.

If and when it is necessary.

Quite serious, “if and when necessary.”

Err, “if and when necessary” meaning when it is important enough for someone to pay for the disambiguation.

Ambiguity between buyers and sellers of women’s shoes or lingerie probably abounds, but unless someone is willing to pay the freight for disambiguation, it isn’t my concern.

Linked data is exposing the ambiguity of the Semantic Web.

Being unable to solve the semantic ambiguity it exposes, linked data is creating opportunities for topic maps!

Maybe we should send the W3C a fruit basket or something?

November 9, 2010

ONTOLOGIES AND SOCIAL SEMANTIC WEB FOR INTELLIGENT EDUCATIONAL SYSTEMS (SWEL)

Filed under: Conferences,Ontology,Semantic Web — Patrick Durusau @ 8:04 pm

ONTOLOGIES AND SOCIAL SEMANTIC WEB FOR INTELLIGENT EDUCATIONAL SYSTEMS (SWEL)

Paper deadline: 22 November 2010

Announcement:

Ontologies, the Semantic Web, and the Social Semantic Web offer a new perspective on intelligent educational systems by providing intelligent access to and management of Web information and semantically richer modeling of the applications and their users. This allows for supporting more adequate and accurate representations of learners, their learning goals, learning material and contexts of its use, as well as more efficient access and navigation through learning resources. The goal is to advance intelligent educational systems, so as to achieve improved e-learning efficiency, flexibility and adaptation for single users and communities of users (learners, instructors, courseware authors, etc). This special track follows the workshop series “Ontologies and Semantic Web for e-Learning”- SWEL which was conducted successfully from 2002-2009 at different hosting conferences (http://compsci.wssu.edu/iis/swel/).

BTW, I stole this from a post by Darina Dicheva to the topicmapmail list. CFP: SWEL Special Track at FLAIRS-24 – two weeks to the deadline!

Whose Logic Binds A Topic Map?

Filed under: Authoring Topic Maps,Semantic Web,TMDM,TMRM,Topic Maps — Patrick Durusau @ 7:15 am

An exchange with Lars Heuer over what the TMRM should say about “ako” and “isa” (see: A Guide to Publishing Linked Data Without Redirects) brings up an important but often unspoken issue.

The current draft of the Topic Maps Reference Model (TMRM) says that subclass-superclass relationships are reflexive and transitive. Moreover, “isa” relationships are non-reflexive and transitive.

Which is all well and good, assuming that accords with your definition of subclass-superclass and isa. The Topic Maps Data Model (TMDM) on the other hand defines “isa” as non-transitive.

Either one is a legitimate choice and I will cover the resolution of that difference elsewhere.
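The practical difference between the two readings is easy to see in a few lines. This is only a sketch, assuming relationships are plain (narrower, broader) pairs; the example topics are invented:

```python
def transitive_closure(pairs, reflexive=False):
    """Compute the transitive (optionally reflexive) closure of a relation
    given as a set of (narrower, broader) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    if reflexive:
        for a, b in pairs:
            closure.add((a, a))
            closure.add((b, b))
    return closure

# "ako" read as reflexive and transitive.
ako = {("poodle", "dog"), ("dog", "mammal")}
ako_closed = transitive_closure(ako, reflexive=True)

# "isa" under the two readings described above.
isa = {("fido", "dog"), ("dog", "species")}
isa_transitive = transitive_closure(isa)   # adds ("fido", "species")
isa_non_transitive = set(isa)              # nothing is inferred
```

Under the transitive reading Fido is a species; under the non-transitive reading he is not. Which answer is "right" depends on whose logic you signed up for.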

My point here is to ask: “Whose logic binds a topic map?”

My impression is that here and in the Semantic Web, logical frameworks are being created, into which users are supposed to fit their data.

As a user I would take serious exception to fitting my data into someone else’s world view (read logic).

That is the real question, isn’t it?

Whether IT/SW dictates to users the logic that will bind their data or if users get to define their own “logics?”

Given the popularity of tagging and folksonomies, user “logics” look like the better bet.

November 8, 2010

BibBase and Beyond

Filed under: BibTeX,OWL,RDF,Semantic Web — Patrick Durusau @ 8:38 am

BibBase is an effort to store BibTeX information as RDF triples. For the data, see: BibBase data.

As of 8 November 2010, there are 6178 publications.

Interesting I suppose but the real question is how to enable researchers using BibTeX to disambiguate their terminology as part of their BibTeX entry?

Has to be as easy as BibTeX and consistent with usage patterns in the communities that use it. If you hope for adoption.

Not hard to imagine a helper application that runs through a set of BibTeX entries and suggests 1998 ACM Computing Classification System or 2010 Mathematics Subject Classification entries. Entries which the author could accept or reject.
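A toy version of such a helper, to show how little is needed for a first pass. The trigger words in the keyword table are invented for this sketch, the regular expressions handle only simple entries without nested braces, and a real tool would need a proper BibTeX parser and a far better matcher:

```python
import re

# Hypothetical keyword table: the categories are from the 1998 ACM CCS,
# the trigger words are made up for illustration.
KEYWORDS = {
    "retrieval": "H.3.3 Information Search and Retrieval",
    "search": "H.3.3 Information Search and Retrieval",
    "ontology": "I.2.4 Knowledge Representation Formalisms and Methods",
}

# Matches "@type{key, ... \n}" for simple, un-nested entries only.
ENTRY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
# Matches "title = {...}" but not "booktitle = {...}".
TITLE = re.compile(r"\btitle\s*=\s*\{(.+?)\}", re.IGNORECASE)

def suggest(bibtex_text):
    """Map each citation key to classification entries suggested by its title."""
    suggestions = {}
    for key, body in ENTRY.findall(bibtex_text):
        match = TITLE.search(body)
        if not match:
            continue
        words = set(re.findall(r"[a-z]+", match.group(1).lower()))
        hits = sorted({KEYWORDS[w] for w in words if w in KEYWORDS})
        if hits:
            suggestions[key] = hits
    return suggestions

SAMPLE = """@article{smith2010,
  title = {Ontology Search on the Web},
  year = {2010}
}
"""
suggestions = suggest(SAMPLE)
```

The author would then accept or reject each suggestion, which keeps the human in charge of what the entry is about.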

Not the fine grained, concept by concept (read subject by subject) analysis of a document that I would like to see, but it’s a start.

ISWC 2010 Data and Demos

Filed under: Linked Data,RDF,Semantic Web,SPARQL — Patrick Durusau @ 6:27 am

ISWC 2010 Data and Demos.

Data and demos from the International Semantic Web Conference 2010. Includes links to prior data sets and browsers that work with the data sets.

Data sets are always important, as is being able to gauge the current state of semantic software.

Ambiguity and Linked Data URIs

Filed under: Ambiguity,Linked Data,Marketing,RDF,Semantic Web,Topic Maps — Patrick Durusau @ 6:14 am

I like the proposal by Ian Davis to avoid the 303 cloud while trying to fix the mistake of confusing identifiers with addresses in an address space.

Linked data URIs are already known to be subject to the same issues of ambiguity as any other naming convention.

All naming conventions are subject to ambiguity and “expanded” naming conventions, such as a list of properties in a topic map, may make the ambiguity a bit more manageable.

That depends on a presumption that if more information is added and a user advised of it, the risk of ambiguity will be reduced.

But the user needs to be able to use the additional information. What if the additional information is to distinguish two concepts in calculus and the reader is innocent of even basic algebra?

That is to say, ambiguity can be overcome only in particular contexts.

But overcoming ambiguity in a particular context may be enough. Such as:

  • Interchange between intelligence agencies
  • Interchange between audited entities and their auditors (GAO, SEC, Federal Reserve (or their foreign equivalents))
  • Interchange between manufacturers and distributors

None of those are the golden age of seamless knowledge sharing and universal democratization of decision making or even scheduling tennis matches sort of applications.

They are applications that can reduce incremental costs, improve overall efficiency and perhaps contribute to achievement of organizational goals.

Perhaps that is enough.

A Guide to Publishing Linked Data Without Redirects – Post

Filed under: Linked Data,RDF,Semantic Web — Patrick Durusau @ 5:34 am

A Guide to Publishing Linked Data Without Redirects is a proposal by Ian Davis to avoid the 303 while distinguishing between “things” and their descriptions.

A step in the right direction.

November 4, 2010

Is 303 Really Necessary? – Blog Post

Filed under: Linked Data,RDF,Semantic Web,Uncategorized — Patrick Durusau @ 9:46 am

Is 303 Really Necessary?.

Ian Davis details at length why 303’s are unnecessary and offers an interesting alternative.

Read the comments as well.

November 3, 2010

The Semantic Web Garden of Eden

Filed under: Marketing,RDF,Semantic Web,Topic Maps — Patrick Durusau @ 6:48 pm

The Garden of Eden:

[2:19] And out of the ground the LORD God formed every beast of the field, and every fowl of the air; and brought them unto Adam to see what he would call them: and whatsoever Adam called every living creature, that was the name thereof….[1]

As the number of Adams and Eves multiplied, so did the names of things.

Multiple names for the same things, different things with the same names.

Ambiguity had entered the world.

The Semantic Web Garden of Eden sought to banish ambiguity:

…by an RDF statement having…URIrefs are used to identify not only the subject of the original statement, but also the predicate and object, instead of using the words “creator” and “John Smith” [2]

As the number of URIs multiplied, so did the URIs of things.

Multiple URIs for the same things, different things with the same URIs.

Ambiguity remains in the world.

******
[1] Genesis 2:19
[2] RDF Primer, 2.2 RDF Model, http://www.w3.org/TR/rdf-primer/

Weaknesses In Linked Data

Filed under: Linked Data,RDF,Semantic Web — Patrick Durusau @ 6:47 pm

A Partnership between Structured Data and Ontotext to address weaknesses in linked data framed it this way:

Volumes of linked data on the Web are growing. This growth is exposing three key weaknesses:

  1. inadequate semantics for how to link disparate information together that recognizes inherently different contexts and viewpoints and (often) approximate mappings
  2. misapplication of many linking predicates, such as owl:sameAs, and
  3. a lack of coherent reference concepts by which to aggregate and organize this linkable content.

The amount of linked data is trivial compared to the total volume of digital data.

Makes me wonder about the “only the web will scale” argument.

Questions:

  1. How do these three “key weaknesses” compare to current barriers to semantic integration? (3-5 pages, no citations)
  2. “inadequate semantics?” What’s wrong with the semantics we have now? Or is the point that formal semantics are inadequate? (discussion)
  3. “coherent reference concepts?” How would you recognize one if you saw it? (3-5 pages, no citations)

November 1, 2010

Rule Markup Initiative

Filed under: RDF,RuleML,Semantic Web — Patrick Durusau @ 4:48 pm

Rule Markup Initiative

From the website:

The RuleML Initiative is an international non-profit organization covering all aspects of Web rules and their interoperation, with a Structure and Technical Groups that center on RuleML specification, tool, and application development. Around RuleML, an open network of individuals and groups from both industry and academia has emerged, having a shared interest in modern rule topics, including the interoperation of Semantic Web rules. The RuleML Initiative has been collaborating with OASIS on Legal XML, Policy RuleML, and related efforts since 2004. The Initiative has further been interacting with the developers of ISO Common Logic (CL), which became an International Standard, First edition, in October 2007. RuleML is also a member of OMG, contributing to its Semantics of Business Vocabulary and Business Rules (SBVR), which went into Version 1.0 in January 2008, and to its Production Rule Representation (PRR), which went into Version 1.0 in December 2009. Moreover, participants of the RuleML Initiative have supported the development of the W3C Rule Interchange Format (RIF), which attained Recommendation status in June 2010. The annual RuleML Symposium has taken the lead in bringing together delegates from industry and academia who share this interest focus in Web rules.

Questions:

  1. Does the use of ISO Common Logic ensure interoperability? Why/Why not? (discussion)
  2. How would you define interoperability? (3-5 pages, no citations)
  3. Can rules ensure your definition of interoperability? (discussion)
  4. Are rules subject to the same semantic drift as data? Why/Why not? (3-5 pages, no citations)

October 31, 2010

R2RML: RDB to RDF Mapping Language

Filed under: RDF,Semantic Web — Patrick Durusau @ 8:13 pm

R2RML: RDB to RDF Mapping Language

Abstract:

This document describes R2RML, a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author’s choice. R2RML mappings are themselves RDF graphs and written down in Turtle syntax. R2RML enables different types of mapping implementations: processors could, for example, offer a virtual SPARQL endpoint over the mapped relational data, or generate RDF dumps, or offer a Linked Data interface.

First draft from the RDB2RDF working group.

Questions:

  1. Select a table from two (or three) databases in a common area with different schemas.
  2. Convert the tables using the latest version of this proposal to RDF datasets.
  3. On what basis would you integrate the resulting RDF datasets into a single RDF dataset?

October 30, 2010

Sense and Reference on the Web

Filed under: Semantic Web,Semantics,Subject Identity — Patrick Durusau @ 10:01 am

Sense and Reference on the Web is Harry Halpin’s thesis seeking to answer the question: “What does a Uniform Resource Identifier (URI) mean?”

Abstract:

This thesis builds a foundation for the philosophy of the Web by examining the crucial question: What does a Uniform Resource Identifier (URI) mean? Does it have a sense, and can it refer to things? A philosophical and historical introduction to the Web explains the primary purpose of the Web as a universal information space for naming and accessing information via URIs. A terminology, based on distinctions in philosophy, is employed to define precisely what is meant by information, language, representation, and reference. These terms are then employed to create a foundational ontology and principles of Web architecture. From this perspective, the Semantic Web is then viewed as the application of the principles of Web architecture to knowledge representation. However, the classical philosophical problems of sense and reference that have been the source of debate within the philosophy of language return. Three main positions are inspected: the logicist position, as exemplified by the descriptivist theory of reference and the first-generation Semantic Web, the direct reference position, as exemplified by Putnam and Kripke’s causal theory of reference and the second-generation Linked Data initiative, and a Wittgensteinian position that views the Semantic Web as yet another public language. After identifying the public language position as the most promising, a solution of using people’s everyday use of search engines as relevance feedback is proposed as a Wittgensteinian way to determine sense of URIs. This solution is then evaluated on a sample of the Semantic Web discovered by via using queries from a hypertext search engine query log. The results are evaluated and the technique of using relevance feedback from hypertext Web searches to determine relevant Semantic Web URIs in response to user queries is shown to considerably improve baseline performance. 
Future work for the Web that follows from our argument and experiments is detailed, and outlines of a future philosophy of the Web laid out.

Questions:

  1. Choose a non-Web reference system.
  2. What is the nature of those references? (3-5 pages, with citations)
  3. Compare those references to URIs.
  4. How are those references and URIs the same/different? (3-5 pages, with citations)
  5. Evaluate Halpin’s use of Wittgenstein. (5-10 pages, with citations)

October 29, 2010

Semantic Web Summit East – November 16-17, 2010 Boston

Filed under: Conferences,Semantic Web,Semantics — Patrick Durusau @ 4:19 am

Semantic Web Summit East – November 16-17, 2010.

The range of “semantic” for this conference is broader than “Semantic Web.” Check the presentations to see what I mean.

Useful for the business case about semantics, contacts and semantic success stories.

BTW, June 5-9 is the Semantic Web Summit West, San Francisco.

October 28, 2010

LDSpider

Filed under: Linked Data,Search Engines,Searching,Semantic Web — Patrick Durusau @ 5:11 am

LDSpider.

From the website:

The LDSpider project aims to build a web crawling framework for the linked data web. Requirements and challenges for crawling the linked data web are different from regular web crawling, thus this projects offer a web crawler adapted to traverse and harvest sources and instances from the linked data web. We offer a single jar which can be easily integrated into own applications.

Features:

  • Content Handlers for different formats
  • Different crawling strategies
  • Crawling scope
  • Output formats

Content handlers, crawling strategies, crawling scope, output formats, all standard crawling features. Adapted to linked data formats but those formats should be accessible to any crawler.

A welcome addition since we are all going to encounter linked data but I am missing what is different?

If you see it, please post a comment.

Questions:

  1. What semantic requirements should a web crawler have?
  2. How does this web crawler compare to your requirements?
  3. What one capacity would you add to this crawler?
  4. What other web crawlers should be used for comparison?

October 22, 2010

Linking Enterprise Data

Filed under: Knowledge Management,Linked Data,Semantic Web — Patrick Durusau @ 5:53 am

Linking Enterprise Data, ed. by David Wood. The full text is available in HTML.

Table of Contents:

  • Part I Why Link Enterprise Data?
    • Semantic Web and the Linked Data Enterprise, Dean Allemang
    • The Role of Community-Driven Data Curation for Enterprises, Edward Curry, Andre Freitas, and Sean O’Riain
  • Part II Approval and Support of Linked Data Projects
    • Preparing for a Linked Data Enterprise, Bernadette Hyland
    • Selling and Building Linked Data: Drive Value and Gain Momentum, Kristen Harris
  • Part III Techniques for Linking Enterprise Data
    • Enhancing Enterprise 2.0 Ecosystems Using Semantic Web and Linked Data Technologies: The SemSLATES Approach, Alexandre Passant, Philippe Laublet, John G. Breslin and Stefan Decker
    • Linking XBRL Financial Data, Roberto García and Rosa Gil
    • Scalable Reasoning Techniques for Semantic Enterprise Data, Reza B’Far
    • Reliable and Persistent Identification of Linked Data Elements, David Wood

Comments to follow.

October 20, 2010

8th Extended Semantic Web Conference: May 29 – June 2 2011 Heraklion, Greece

Filed under: Conferences,Ontology,OWL,Semantic Web,Semantics,SPARQL — Patrick Durusau @ 3:15 am

8th Extended Semantic Web Conference: May 29 – June 2 2011 Heraklion, Greece

Important Dates

See ESWC 2010 for range of content.

October 17, 2010

IEEE Computer Society Technical Committee on Semantic Computing (TCSEM)

The IEEE Computer Society Technical Committee on Semantic Computing (TCSEM)

addresses the derivation and matching of the semantics of computational content to that of naturally expressed user intentions in order to retrieve, manage, manipulate or even create content, where “content” may be anything including video, audio, text, software, hardware, network, process, etc.

Being organized by Phillip C-Y Sheu (UC Irvine), psheu@uci.edu, Phone: +1 949 824 2660. Volunteers are needed for both organizational and technical committees.

This is a good way to meet people, make a positive contribution, and have a lot of fun.

October 11, 2010

Semantic Drift: An RDF Answer (sort-of)

Filed under: RDF,Semantic Web,Subject Identity — Patrick Durusau @ 7:27 am

As promised last week, there are RDF researchers working on issues related to semantic drift.

An interesting approach can be found in: Entity Reference Resolution via Spreading Activation on RDF-Graphs. Author(s): Joachim Kleb, Andreas Abecker

Abstract:

The use of natural language identifiers as reference for ontology elements—in addition to the URIs required by the Semantic Web standards—is of utmost importance because of their predominance in the human everyday life, i.e.speech or print media. Depending on the context, different names can be chosen for one and the same element, and the same element can be referenced by different names. Here homonymy and synonymy are the main cause of ambiguity in perceiving which concrete unique ontology element ought to be referenced by a specific natural language identifier describing an entity. We propose a novel method to resolve entity references under the aspect of ambiguity which explores only formal background knowledge represented in RDF graph structures. The key idea of our domain independent approach is to build an entity network with the most likely referenced ontology elements by constructing steiner graphs based on spreading activation. In addition to exploiting complex graph structures, we devise a new ranking technique that characterises the likelihood of entities in this network, i.e. interpretation contexts. Experiments in a highly polysemic domain show the ability of the algorithm to retrieve the correct ontology elements in almost all cases.

It is the situating of a concept in a context (not assignment of a URI) that enables the correct result in a polysemic domain.

This doesn’t directly model semantic drift but does represent anchoring a term in a particular context.
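To illustrate anchoring a term in a context, here is a bare-bones spreading activation sketch. It is not the authors' algorithm (it builds no Steiner graphs and uses none of their ranking), and the graph, decay factor and candidate names are invented:

```python
from collections import defaultdict

def spread_activation(edges, seeds, decay=0.5, steps=2):
    """Propagate activation from seed nodes across an undirected graph.

    Each step, every node in the frontier passes decay * its activation,
    split evenly among its neighbours; activation accumulates per node.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    activation = defaultdict(float)
    frontier = {seed: 1.0 for seed in seeds}
    for node, value in frontier.items():
        activation[node] += value
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, value in frontier.items():
            if not neighbours[node]:
                continue
            share = decay * value / len(neighbours[node])
            for other in neighbours[node]:
                next_frontier[other] += share
        for node, value in next_frontier.items():
            activation[node] += value
        frontier = next_frontier
    return dict(activation)

# Hypothetical graph: the name "Paris" has two candidate referents.
edges = [
    ("Paris_France", "France"),
    ("Paris_France", "Eiffel_Tower"),
    ("Paris_Texas", "Texas"),
    ("Texas", "USA"),
]
candidates = ["Paris_France", "Paris_Texas"]

# The surrounding text mentions France and the Eiffel Tower.
scores = spread_activation(edges, ["France", "Eiffel_Tower"])
best = max(candidates, key=lambda c: scores.get(c, 0.0))
```

The context picks the referent, not the token. Change the seeds to Texas and USA and the same name resolves the other way.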

The questions that divide semantic technologies are:

  • Who throws the anchor?
  • Who governs the anchors?
  • Can there be more than one anchor?
  • What about “my” anchor?
  • …and others

More on those anon.

October 8, 2010

Semantic Drift and Linked Data/Semantic Web

Filed under: Linked Data,OWL,Semantic Web,Subject Identity — Patrick Durusau @ 10:28 am

Overloading OWL sameAs starts with:

Description: General Issue: owl:sameAs is being used in the linked data community in a way that is inconsistent with its semantics.

Read the document but in summary: People use OWL sameAs to mean different things.

I don’t see how their usage can be “inconsistent with its semantics.”

Words don’t possess self-executing semantics that bind us. Rather the other way round I think.

If OWL sameAs had some “original” semantic, it changed by the process of semantic drift.

Semantic drift is where the semantics of a token changes over time or across communities due to its use by people.

URIs or tokens may be “stable,” but the evidence is that the semantics of URIs or tokens are not.

The question is how to manage changing, emerging, drifting semantics? (Not a question answered by a static semantic model of URI based identity.)
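One reason the drift matters in practice: a reasoner that takes owl:sameAs at its formal word treats every link as symmetric and transitive. A rough sketch of the consequence, using union-find over invented URIs:

```python
def same_as_clusters(pairs):
    """Group URIs into equivalence classes, treating each sameAs link as
    symmetric and transitive (union-find with path compression)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())

# Hypothetical links: the second one was meant as "is a page about".
links = [
    ("ex:Paris", "ex2:Paris_the_city"),
    ("ex2:Paris_the_city", "ex3:Paris_page"),
    ("ex:Berlin", "ex2:Berlin"),
]
clusters = same_as_clusters(links)
```

One loosely meant link and the city and a web page about the city end up in the same equivalence class. The token stayed stable; what people meant by it did not.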

PS: RDF researchers have recognized semantic drift and have proposed solutions for addressing it. More on that anon.

Questions:

  1. Select a classification more than 30 years old and randomly select one book for each 5 year period for the last 30 years. What (if any) semantic drift do you see in the use of this classification?
  2. Exchange your list with a classmate. Do you agree/disagree with their evaluation? Why?
  3. Repeat the exercise in #1 and #2 but use a classification where you can find books between 30 and 60 years ago. Select one book per 5 year period.

Library Linked Data: Call for Use Cases

Filed under: Linked Data,Semantic Web — Patrick Durusau @ 6:11 am

Library Linked Data: Call for Use Cases

Just a quick reminder that the call for use cases from the W3C Library Linked Data Incubator Group ends on 15 October 2010.

The mailing list archives may be of interest: public-lld@w3.org.

I’m not a fan of “Linked Data” but it will be encountered by topic map authors and so we need to follow its development.

October 6, 2010

The RelFinder user interface: interactive exploration of relationships between objects of interest

Filed under: Associations,Interface Research/Design,RDF,Semantic Web,Software — Patrick Durusau @ 7:00 am

The RelFinder user interface: interactive exploration of relationships between objects of interest Authors: Steffen Lohmann, Philipp Heim, Timo Stegemann, Jürgen Ziegler Keywords: dbpedia, decision support, graph visualization, linked data, relationship discovery, relationship web, semantic user interfaces, semantic web, sparql, visual exploration

Abstract:

Being aware of the relationships that exist between objects of interest is crucial in many situations. The RelFinder user interface helps to get an overview: Even large amounts of relationships can be visualized, filtered, and analyzed by the user. Common concepts of knowledge representation are exploited in order to support interactive exploration both on the level of global filters and single relationships. The RelFinder is easy-to-use and works on every RDF knowledge base that provides standardized SPARQL access

Software: RelFinder

RelFinder presents a way to leverage data already in RDF for the creation of associations in topic maps.

Or to explore data already available in RDF.

Exploration of relationships is important for “data” but even more important for the syntaxes that contain data.

Such as equivalence between subjects represented by syntax tokens.

October 5, 2010

Grist For Topic Map Mills: German National Library – Authority Files

Filed under: Dataset,Linked Data,RDF,Semantic Web — Patrick Durusau @ 6:09 am

German National Library – Authority Files (linked data)

A post from Lars Svensson announced the release of authority files from the German National Library:

The German National Library (DNB) has published the German library authority files as linked data. The dataset consists of 1.8 Mill differentiated persons from the PND (Personennamendatei, Name authority file), 187.000 subject headings from the SWD (Schlagwortnormdatei, Subject headings authority file), 1.3 Mill corporate bodies from the GKD (Gemeinsame Körperschaftsdatei, Corporate Body Authority file), and 51,000 classes from the German translation of the Dewey Decimal Classification (DDC).

Library students should take particular note of the subject heading and Dewey Decimal Classification materials.

For topic mappers, another set of identifiers that can be mapped between the data sets shown by the data cloud as well as those that don’t use URIs as identifiers (the vast majority of data).

This will also be of interest to the linked data community.
