Semantic Web Dog Food
From the website:
Welcome to the Semantic Web Conference Corpus – a.k.a. the Semantic Web Dog Food Corpus! Here you can browse and search information on papers that were presented, people who attended, and other things that have to do with the main conferences and workshops in the area of Semantic Web research.
We currently have information about
- 2133 papers,
- 5020 people and
- 1273 organisations at
- 20 conferences and
- 132 workshops,
and a total of 126886 unique triples in our database!
The numbers looked low to me until I read in the FAQ:
This is not just a site for ISWC [International Semantic Web Conference] and ESWC [European Semantic Web Conference] though. We hope that, in time, other metadata sets relating to Semantic Web activity will be hosted here — additional bibliographic data, test sets, community ontologies and so on.
This illustrates a persistent problem of the Semantic Web. This site has one way to encode the semantics of these papers, people, conferences and workshops. Other sources of semantic data on these papers, people, conferences and workshops may well use other ways to encode those semantics. And every group has what it feels are compelling reasons for following its choices and not the choices of others. Assuming they are even aware of the choices of others. (Discovery being another problem but I won’t talk about that now.)
The previous semantic diversity of natural language is now represented by a semantic diversity of ontologies and URIs. Now our computers can more rapidly and reliably detect that we are using different vocabularies. The SW seems like a lot of work for such a result. Particularly since we continue to use diverse vocabularies and more diverse vocabularies continue to arise.
The SW solution, using OWL Full:
5.2.1 owl:sameAs
The built-in OWL property owl:sameAs links an individual to an individual. Such an owl:sameAs statement indicates that two URI references actually refer to the same thing: the individuals have the same “identity”.
For individuals such as “people” this notion is relatively easy to understand. For example, we could state that the following two URI references actually refer to the same person:
<rdf:Description rdf:about="#William_Jefferson_Clinton">
<owl:sameAs rdf:resource="#BillClinton"/>
</rdf:Description>
The owl:sameAs statements are often used in defining mappings between ontologies. It is unrealistic to assume everybody will use the same name to refer to individuals. That would require some grand design, which is contrary to the spirit of the web.
In OWL Full, where a class can be treated as instances of (meta)classes, we can use the owl:sameAs construct to define class equality, thus indicating that two concepts have the same intensional meaning. An example:
<owl:Class rdf:ID="FootballTeam">
<owl:sameAs rdf:resource="http://sports.org/US#SoccerTeam"/>
</owl:Class>
One could imagine this axiom to be part of a European sports ontology. The two classes are treated here as individuals, in this case as instances of the class owl:Class. This allows us to state that the class FootballTeam in some European sports ontology denotes the same concept as the class SoccerTeam in some American sports ontology. Note the difference with the statement:
<footballTeam owl:equivalentClass us:soccerTeam />
which states that the two classes have the same class extension, but are not (necessarily) the same concepts.
Anyone see a problem? Other than requiring the use of OWL Full?
The absence of any basis for “…denotes the same concept as….?” I can’t safely reuse this axiom because I don’t know on what basis its author made such a claim. The URIs may provide further information that may satisfy me the axiom is correct but that still leaves me in the dark as to why the author of the axiom thought it to be correct. Overly precise for football/soccer ontologies you say but what of drug interaction ontologies? Or ontologies that govern highly sensitive intelligence data?
So we repeat semantic diversity, create maps to overcome the repeated semantic diversity and the maps we create have no explicit basis for the mappings they represent. Tell me again why this was a good idea?