Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 30, 2011

Semantic Web Dog Food (There’s a fly in my bowl.)

Filed under: Conferences,OWL,RDF,RDFa,Semantic Web — Patrick Durusau @ 6:59 pm

Semantic Web Dog Food

From the website:

Welcome to the Semantic Web Conference Corpus – a.k.a. the Semantic Web Dog Food Corpus! Here you can browse and search information on papers that were presented, people who attended, and other things that have to do with the main conferences and workshops in the area of Semantic Web research.

We currently have information about

  • 2133 papers,
  • 5020 people and
  • 1273 organisations at
  • 20 conferences and
  • 132 workshops,

and a total of 126886 unique triples in our database!

The numbers looked low to me until I read in the FAQ:

This is not just a site for ISWC [International Semantic Web Conference] and ESWC [European Semantic Web Conference] though. We hope that, in time, other metadata sets relating to Semantic Web activity will be hosted here — additional bibliographic data, test sets, community ontologies and so on.

This illustrates a persistent problem of the Semantic Web. This site has one way to encode the semantics of these papers, people, conferences and workshops. Other sources of semantic data on these papers, people, conferences and workshops may well use other ways to encode those semantics. And every group has what it feels are compelling reasons for following its own choices and not the choices of others, assuming it is even aware of the choices of others. (Discovery is another problem, but I won’t talk about that now.)

The previous semantic diversity of natural language is now represented by a semantic diversity of ontologies and URIs. Now our computers can more rapidly and reliably detect that we are using different vocabularies. The SW seems like a lot of work for such a result. Particularly since we continue to use diverse vocabularies and more diverse vocabularies continue to arise.

The SW solution, using OWL Full:

5.2.1 owl:sameAs

The built-in OWL property owl:sameAs links an individual to an individual. Such an owl:sameAs statement indicates that two URI references actually refer to the same thing: the individuals have the same “identity”.

For individuals such as “people” this notion is relatively easy to understand. For example, we could state that the following two URI references actually refer to the same person:

<rdf:Description rdf:about="#William_Jefferson_Clinton">
<owl:sameAs rdf:resource="#BillClinton"/>
</rdf:Description>

The owl:sameAs statements are often used in defining mappings between ontologies. It is unrealistic to assume everybody will use the same name to refer to individuals. That would require some grand design, which is contrary to the spirit of the web.

In OWL Full, where a class can be treated as instances of (meta)classes, we can use the owl:sameAs construct to define class equality, thus indicating that two concepts have the same intensional meaning. An example:

<owl:Class rdf:ID="FootballTeam">
<owl:sameAs rdf:resource="http://sports.org/US#SoccerTeam"/>
</owl:Class>

One could imagine this axiom to be part of a European sports ontology. The two classes are treated here as individuals, in this case as instances of the class owl:Class. This allows us to state that the class FootballTeam in some European sports ontology denotes the same concept as the class SoccerTeam in some American sports ontology. Note the difference with the statement:

<footballTeam owl:equivalentClass us:soccerTeam />

which states that the two classes have the same class extension, but are not (necessarily) the same concepts.
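To make the quoted distinction concrete, here is a minimal Python sketch under stated assumptions: a class is modeled as nothing more than a URI plus a set of instances, and the `eu:`/`us:` prefixes and team names are made up for illustration. It is a toy, not a description of how an OWL reasoner works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntClass:
    """Toy stand-in for an OWL class: a URI and its set of instances."""
    uri: str
    extension: frozenset


def equivalent_class(a: OntClass, b: OntClass) -> bool:
    """owl:equivalentClass reading: the two classes have the same instances."""
    return a.extension == b.extension


def same_as(a: OntClass, b: OntClass, declared: set) -> bool:
    """owl:sameAs reading: identity has to be asserted by someone;
    it cannot be computed from the extensions alone."""
    return a.uri == b.uri or frozenset((a.uri, b.uri)) in declared


# Hypothetical data: two classes that happen to share every instance.
football = OntClass("eu:FootballTeam", frozenset({"Ajax", "Arsenal"}))
soccer = OntClass("us:SoccerTeam", frozenset({"Ajax", "Arsenal"}))
declared = set()  # no owl:sameAs assertions yet

print(equivalent_class(football, soccer))
print(same_as(football, soccer, declared))
```

In this sketch `same_as` can only look up an assertion. Nothing in the data records why the assertion was made.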

Anyone see a problem? Other than requiring the use of OWL Full?

The absence of any basis for “…denotes the same concept as….?” I can’t safely reuse this axiom because I don’t know on what basis its author made such a claim. The URIs may provide further information that satisfies me the axiom is correct, but that still leaves me in the dark as to why the author of the axiom thought it to be correct. Overly precise for football/soccer ontologies, you say? But what of drug interaction ontologies? Or ontologies that govern highly sensitive intelligence data?
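As a rough illustration of what an explicit basis could look like, the Python sketch below attaches the compared properties to a mapping so a later user can audit it. Everything here is an assumption for illustration: the `Mapping` type, the property names, and the values are hypothetical and are not part of OWL or of any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class Mapping:
    """A hypothetical 'same concept' claim that carries its own justification."""
    left: str
    right: str
    author: str
    # property name -> (value seen on the left, value seen on the right)
    basis: dict = field(default_factory=dict)

    def is_auditable(self) -> bool:
        """A mapping with no recorded basis can only be taken on trust."""
        return bool(self.basis)

    def disagreements(self) -> list:
        """Properties the author recorded where the two sides differ."""
        return sorted(
            prop
            for prop, (left_val, right_val) in self.basis.items()
            if left_val != right_val
        )


# The bare owl:sameAs case: a claim and nothing else.
opaque = Mapping("eu:FootballTeam", "us:SoccerTeam", author="unknown")

# The same claim with made-up properties recorded as its basis.
documented = Mapping(
    "eu:FootballTeam",
    "us:SoccerTeam",
    author="hypothetical-editor",
    basis={
        "players_per_side": (11, 11),
        "name_used_by_fans": ("football", "soccer"),
    },
)

print(opaque.is_auditable(), documented.is_auditable())
print(documented.disagreements())
```

A reader of `documented` can at least see which properties the mapping author compared and where they differed. A reader of `opaque` has only the claim.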

So we repeat semantic diversity, create maps to overcome the repeated semantic diversity, and the maps we create have no explicit basis for the mappings they represent. Tell me again why this was a good idea?

4 Comments

  1. You don’t need to support OWL Full to use owl:sameAs. Even OWL Lite supports owl:sameAs, and many tools which don’t support any other OWL features at all at least support owl:sameAs.

    It’s actually a quite nice dataset. I’ve used it to test Duke and come up with a pile of owl:sameAs-statements between people.

    Comment by larsga@garshol.priv.no — May 31, 2011 @ 2:44 am

  2. Oh, I didn’t mean to imply the dataset wasn’t useful. I was explaining why it wasn’t larger.

    On OWL, I was using OWL Reference, http://www.w3.org/TR/owl-ref/, which I quoted above.

    Ah, reading more closely, that is class equality. Is owl:sameAs otherwise used for individuals?

    The bare statement of equality doesn’t provide any basis for a subsequent user to evaluate the use of owl:sameAs. I don’t have the reference handy but remember reading a W3C discussion about the varying uses of owl:sameAs. That seems problematic to me. You?

    Comment by Patrick Durusau — May 31, 2011 @ 9:26 am

  3. > Ah, reading more closely, that is class equality. Is owl:sameAs otherwise used
    > for individuals?

    Correct. OWL is in general extremely strict about crossing the class/individual boundary, basically because this restriction is necessary to stay within Description Logic.

    > The bare statement of equality doesn’t provide any basis for a subsequent
    > user to evaluate the use of owl:sameAs. I don’t have the reference handy but
    > remember reading a W3C discussion about the varying uses of owl:sameAs.
    > That seems problematic to me. You?

    Quite frankly, in an open and uncontrolled environment I think the use of all properties will be found to vary in problematic ways. In the corporate environment where I work now I find they’re used that way even within a single system (well, columns in this case, but same principle).

    So I don’t think this is necessarily a problem with owl:sameAs in itself.

    In the worst case, you can always ignore sameAs statements from sources you don’t trust.

    Comment by larsga@garshol.priv.no — May 31, 2011 @ 1:04 pm
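The suggestion in comment 3, ignoring sameAs statements from sources you don't trust, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: statements are plain 4-tuples of (subject, predicate, object, source), and the source labels and the `#WJC` and `#SomeoneElse` identifiers are hypothetical. Surviving statements are merged into identity groups with a small union-find.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"


def trusted_same_as(quads, trusted):
    """Keep only owl:sameAs statements whose source is in the trusted set."""
    return [(s, o) for (s, p, o, source) in quads if p == SAME_AS and source in trusted]


def merge_identities(pairs):
    """Group identifiers into equivalence classes with a small union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical statements; only the first pair appears in the post above.
quads = [
    ("#William_Jefferson_Clinton", SAME_AS, "#BillClinton", "source:A"),
    ("#BillClinton", SAME_AS, "#WJC", "source:B"),
    ("#BillClinton", SAME_AS, "#SomeoneElse", "source:untrusted"),
]
trusted = {"source:A", "source:B"}

print(merge_identities(trusted_same_as(quads, trusted)))
```

The filter decides whose statements to accept. It says nothing about why an accepted statement was made.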

  4. But the worst case may be your own.

    Recall that semantic mapping in the GTE telecommunications case, without the database creators, would have taken twelve person-years. (See Auditable Reconciliation.)

    That was an initial mapping of database elements to each other.

    How is evaluation of an opaque mapping any different? If the mapping creators are still around, we can ask them; otherwise, do we spend the twelve person-years to evaluate the mapping?

    Granted, no set of properties is perfect, but we know that opaque mappings are automatically beyond our power to evaluate on the face of the mapping.

    Good point about excluding sameAs statements from untrusted sources, but how do I evaluate the sameAs statements from sources I do trust?

    Comment by Patrick Durusau — June 1, 2011 @ 6:26 am
