Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

October 3, 2011

Automated extraction of domain-specific clinical ontologies – Weds Oct. 5th

Filed under: Bioinformatics,Biomedical,Ontology,SNOMED — Patrick Durusau @ 7:09 pm

Automated extraction of domain-specific clinical ontologies by Chimezie Ogbuji from Case Western Reserve University School of Medicine. 10 AM PT Weds Oct. 5, 2011.

Full NCBO Webinar schedule: http://www.bioontology.org/webinar-series

ABSTRACT:

A significant set of challenges in the use of large, source ontologies in the medical domain include: automated translation, customization of source ontologies, and performance issues associated with the use of logical reasoning systems to interpret the meaning of a domain captured in a formal knowledge representation.

SNOMED-CT and FMA are two reference ontologies that cover much of the domain of clinical medicine and motivate a better means for the re-use of such ontologies. In this presentation, the author will present a set of automated methods (and tools) for segmenting, merging, and surveying modules extracted from these ontologies for a specific domain.

I’m interested generally, but in the merging aspects in particular, for obvious reasons. Another reason to be interested is some research I encountered recently on “outliers” in reasoning systems. Apparently there is a class of reasoning systems that simply “fall over” if they encounter a concept they recognize (or “think” they do), only to find it has some property (what makes it an “outlier”) that they don’t recognize. Seems rather fragile to me, but I haven’t finished running it to ground. I am curious how these methods and tools handle the “outlier” issue.

SPEAKER BIO:

Chimezie is a senior research associate in the Clinical Investigations Department of the Case Western Reserve University School of Medicine where he is responsible for managing, developing, and implementing Clinical and Translational Science Collaborative (CTSC) projects as well as clinical, biomedical, and administrative informatics projects for the Case Comprehensive Cancer Center.

His research interests are in applied ontology, knowledge representation, content repository infrastructure, and medical informatics. He has a BS in computer engineering from the University of Illinois and is a part-time PhD student in the Case Western School of Engineering. He most recently appeared as a guest editor in IEEE Internet Computing’s special issue on Personal Health Records in the August 2011 edition.

DETAILS:

——————————————————-
To join the online meeting (Now from mobile devices!)
——————————————————-
1. Go to https://stanford.webex.com/stanford/j.php?ED=107799137&UID=0&PW=NNjE3OWYzODk3&RT=MiM0
2. If requested, enter your name and email address.
3. If a password is required, enter the meeting password: ncbo
4. Click “Join”.

——————————————————-
To join the audio conference only
——————————————————-
To receive a call back, provide your phone number when you join the meeting, or call the number below and enter the access code.
Call-in toll number (US/Canada): 1-650-429-3300
Global call-in numbers: https://stanford.webex.com/stanford/globalcallin.php?serviceType=MC&ED=107799137&tollFree=0

Access code: 926 719 478

October 2, 2011

DMO (Data Mining Ontology) Foundry

Filed under: Data Mining,Ontology — Patrick Durusau @ 6:37 pm

Email from Agnieszka Lawrynowicz advises:

We are happy to announce the opening of the DMO (Data Mining Ontology) Foundry (http://www.dmo-foundry.org/), an initiative designed to promote the development of ontologies representing the data mining domain. The DMO Foundry will gather the most significant ontologies concerning data mining and the different algorithms and resources that have been developed to support the knowledge discovery process.

Each ontology in the DMO Foundry is freely available for browsing and open discussion, as well as collaborative development, by data mining specialists all over the world. We cordially welcome all interested researchers and practitioners to join the initiative. To find out how you can participate in ontology development, click on the “How to join” tab at the top of the DMO-Foundry page.

To access and navigate an ontology, and contribute to it, click on the “Ontologies” tab, then on your selected ontology and its OWL Browser tool. As you browse, you can click on the “Comment” button to share your insights, criticisms, and suggestions on the concept or relation you are currently exploring. For more general comments, go to the “Forum” tab and post a message to initiate a discussion thread. Please note that until the end of March 2012, this site is being road-tested on the Data Mining OPtimization (DMOP) Ontology developed in the EU FP7 ICT project e-LICO (2009-2012). We are in contact with authors of other DM ontologies, but if you are developing a relevant ontology that you think we are not aware of, please set up a post in the Forum. You are also invited to contact us by writing to info@dmo-foundry.org.

Sad to say, they have omitted topic maps from their ontology. I am writing up a post for the authors. At a minimum, it should cover the terms with PSIs at http://psi.topicmaps.org. Others?

This sounds like a link I need to forward to the astronomy folks I mentioned in > 100 New KDD Models/Methods Appear Every Month. Could at least use the class listing as a starter set for mining journal literature.

September 30, 2011

Advice regarding future directions for Protégé

Filed under: Ontology,OWL,Protégé — Patrick Durusau @ 7:03 pm

Advice regarding future directions for Protégé

Mark Musen, Principal Investigator, The Protégé Project, posted the following request to the protege-users mailing list:

I am writing to seek your advice regarding future directions for the Protégé Project. As you know, all the work that we perform on the Protégé suite of tools is supported by external funding, nearly all from federal research grants. We currently are seeking additional grant support to migrate some of the features that are available in Protégé Version 3 to Protégé Version 4. We believe that this migration is important, as only Protégé 4 supports the full OWL 2 standard, and we appreciate that many members of our user community are asking to use certain capabilities currently unique to Protégé 3 with OWL 2 ontologies in Protégé 4.

To help the Protégé team in setting priorities, and to help us make the case to our potential funders that enhancement of Protégé 4 is warranted, we’d be grateful if you could please fill out the brief survey at the following URL:

http://www.surveymonkey.com/s/ProtegeDirections

It will not take more than a few minutes for you to give us feedback that will be influential in setting our future goals. If we can document strong community support for implementing certain Protégé 3 features in Protégé 4, then we will be in a much stronger position to make the case to our funders to initiate the required work.

The entire Protégé team is looking forward to your opinions. Please be sure to forward this message to colleagues who use Protégé who may not subscribe to these mailing lists so that we can obtain as much feedback as possible.

Many thanks for your help and support.

Please participate in this survey (there are only 7 questions, one of which is optional) and ask others to participate as well.

September 23, 2011

Models, Relativity & Reality

Filed under: Ontology — Patrick Durusau @ 6:30 pm

Particles Appear to Travel Faster Than Light: OPERA Experiment Reports Anomaly in Flight Time of Neutrinos from Science Daily.

From the post:

Scientists with the OPERA experiment, which observes a neutrino beam from CERN 730 km away at Italy’s INFN Gran Sasso Laboratory, are presenting surprising new results (in a seminar at CERN on Sept. 23, 2011) that appear to show neutrinos traveling faster than light.

The OPERA result is based on the observation of over 15000 neutrino events measured at Gran Sasso, and appears to indicate that the neutrinos travel at a velocity 20 parts per million above the speed of light, nature’s cosmic speed limit. Given the potential far-reaching consequences of such a result, independent measurements are needed before the effect can either be refuted or firmly established. This is why the OPERA collaboration has decided to open the result to broader scrutiny. The collaboration’s result is available on the preprint server arXiv (http://arxiv.org/list/hep-ex/new).
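For a rough sense of scale, the two figures quoted above (a 730 km baseline and a 20 parts per million excess) can be turned into a time difference. The following is only a back-of-the-envelope sketch: it assumes a straight-line 730 km path, and the function and variable names are mine, not anything from the OPERA analysis.

```python
C = 299_792_458.0  # speed of light in m/s (exact by definition)


def early_arrival_ns(baseline_m, excess):
    """Nanoseconds by which a particle moving at c * (1 + excess) beats light."""
    light_time = baseline_m / C
    particle_time = baseline_m / (C * (1.0 + excess))
    return (light_time - particle_time) * 1e9


baseline_m = 730e3  # distance quoted in the post, taken as a straight line (assumption)
excess = 20e-6      # "20 parts per million"

print(f"light travel time: {baseline_m / C * 1e3:.3f} ms")
print(f"early arrival:     {early_arrival_ns(baseline_m, excess):.1f} ns")
```

Under those assumptions the neutrinos would arrive a few tens of nanoseconds early on a flight of roughly 2.4 milliseconds, which gives some idea of how precise the timing and distance measurements have to be.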

It will take weeks, months or perhaps years to confirm or refute these findings. And that lies firmly in the province of high-energy physics. So why mention it here?

Whatever the outcome, I take this as a reminder that we create models of reality. Relativity, both special and general, are such models. Useful models, but then so was Newtonian physics (which remains useful, by the way).

Our ontologies, data structures, identification systems, etc., are all models. The only thing that separates them, one from another, is whether they are useful for some specified purpose or not.

Foundations for Ontology

Filed under: Ontology — Patrick Durusau @ 6:10 pm

Foundations for Ontology by John Sowa.

A highly amusing combination of recent slides by John Sowa on issues surrounding the use and construction of ontologies.

I particularly enjoyed slide 4, Prospects for a Universal Ontology: Attempts to create a universal classification of concepts, which lists just the highlights of attempts at universal classification systems.

Lots of references to other Sowa presentations/publications. And others.

Which slide do you like best?

FYI, my test for any ontology is its usefulness for some specified purpose. That allows us to clear away all the factionalism over notation, foundations, and most theoretical issues. The remaining questions are: What do you want to do? How does this ontology help you do it? What will it cost? Those are empirical questions that don’t require a review of Western Civilization to answer.

September 13, 2011

3rd Canadian Semantic Web Symposium

Filed under: Biomedical,Concept Detection,Ontology,Semantic Web — Patrick Durusau @ 7:17 pm

CSWS2011: The 3rd Canadian Semantic Web Symposium
Proceedings of the 3rd Canadian Semantic Web Symposium
Vancouver, British Columbia, Canada, August 5, 2011

An interesting set of papers! I suppose I can be forgiven for looking at the text mining (Hassanpour & Das) and heterogeneous information systems (Khan, Doucette, and Cohen) papers first. 😉 More comments to follow on those.

What are your favorite papers in this batch and why?

The whole proceedings can also be downloaded as a single PDF file.

Edited by:

Christopher J. O. Baker *
Helen Chen **
Ebrahim Bagheri ***
Weichang Du ****

* University of New Brunswick, Saint John, NB, Canada, Department of Computer Science & Applied Statistics
** University of Waterloo, Waterloo, ON, Canada, School of Public Health and Health Systems
*** Athabasca University, School of Computing and Information Systems
**** University of New Brunswick, NB, Canada, Faculty of Computer Science

Table of Contents

Full Paper

  1. The Social Semantic Subweb of Virtual Patient Support Groups
    Harold Boley, Omair Shafiq, Derek Smith, Taylor Osmun
  2. Leveraging SADI Semantic Web Services to Exploit Fish Ecotoxicology Data
    Matthew M. Hindle, Alexandre Riazanov, Edward S. Goudreau, Christopher J. Martyniuk, Christopher J. O. Baker

Short Paper

  3. Towards Evaluating the Impact of Semantic Support for Curating the Fungus Scientific Literature
    Marie-Jean Meurs, Caitlin Murphy, Nona Naderi, Ingo Morgenstern, Carolina Cantu, Shary Semarjit, Greg Butler, Justin Powlowski, Adrian Tsang, René Witte
  4. Ontology based Text Mining of Concept Definitions in Biomedical Literature
    Saeed Hassanpour, Amar K. Das
  5. Social and Semantic Computing in Support of Citizen Science
    Joel Sachs, Tim Finin
  6. Unresolved Issues in Ontology Learning
    Amal Zouaq, Dragan Gaševic, Marek Hatala

Poster

  7. Towards Integration of Semantically Enabled Service Families in the Cloud
    Marko Boškovic, Ebrahim Bagheri, Georg Grossmann, Dragan Gaševic, Markus Stumptner
  8. SADI for GMOD: Semantic Web Services for Model Organism Databases
    Ben Vandervalk, Michel Dumontier, E Luke McCarthy, Mark D Wilkinson
  9. An Ontological Approach for Querying Distributed Heterogeneous Information Systems
    Atif Khan, John A. Doucette, Robin Cohen

Please see the CSWS2011 website for further details.

September 12, 2011

QUDT – Quantities, Units, Dimensions and Data Types in OWL and XML

Filed under: Data Types,Dimensions,Ontology,OWL,Quantities,Units — Patrick Durusau @ 8:29 pm

QUDT – Quantities, Units, Dimensions and Data Types in OWL and XML

From background:

The QUDT Ontologies, and derived XML Vocabularies, are being developed by TopQuadrant and NASA. Originally, they were developed for the NASA Exploration Initiatives Ontology Models (NExIOM) project, a Constellation Program initiative at the AMES Research Center (ARC). The goals of the QUDT ontology are twofold:

  • to provide a unified model of measurable quantities, units for measuring different kinds of quantities, the numerical values of quantities in different units of measure and the data structures and data types used to store and manipulate these objects in software;
  • to populate the model with the instance data (quantities, units, quantity values, etc.) required to meet the life-cycle needs of the Constellation Program engineering community.

If you are looking for measurements, this would be one place to start.
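To make the first goal a little more concrete, here is a minimal sketch of what "quantities, units, and numerical values in different units" can look like as data structures. It is not the QUDT schema: the class names, the `to_base` multiplier, and the three sample units are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str
    quantity_kind: str  # the kind of quantity measured, e.g. "Length"
    to_base: float      # multiplier converting a value in this unit to the base unit


@dataclass(frozen=True)
class QuantityValue:
    value: float
    unit: Unit

    def convert(self, target: Unit) -> "QuantityValue":
        """Re-express this value in another unit of the same quantity kind."""
        if target.quantity_kind != self.unit.quantity_kind:
            raise ValueError(
                f"cannot convert {self.unit.quantity_kind} to {target.quantity_kind}"
            )
        return QuantityValue(self.value * self.unit.to_base / target.to_base, target)


# Hypothetical instance data, standing in for the populated model.
METRE = Unit("metre", "Length", 1.0)
KILOMETRE = Unit("kilometre", "Length", 1000.0)
SECOND = Unit("second", "Time", 1.0)

distance = QuantityValue(730.0, KILOMETRE)
print(distance.convert(METRE).value)
```

Keeping the quantity kind on the unit is what lets the sketch refuse a conversion from kilometres to seconds, which is the sort of check a shared model of units makes possible.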

September 11, 2011

New Challenges in Distributed Information Filtering and Retrieval

New Challenges in Distributed Information Filtering and Retrieval

Proceedings of the 5th International Workshop on New Challenges in Distributed Information Filtering and Retrieval
Palermo, Italy, September 17, 2011.

Edited by:

Cristian Lai – CRS4, Loc. Piscina Manna, Building 1 – 09010 Pula (CA), Italy

Giovanni Semeraro – Dept. of Computer Science, University of Bari, Aldo Moro, Via E. Orabona, 4, 70125 Bari, Italy

Eloisa Vargiu – Dept. of Electrical and Electronic Engineering, University of Cagliari, Piazza d’Armi, 09123 Cagliari, Italy

Table of Contents:

  1. Experimenting Text Summarization on Multimodal Aggregation
    Giuliano Armano, Alessandro Giuliani, Alberto Messina, Maurizio Montagnuolo, Eloisa Vargiu
  2. From Tags to Emotions: Ontology-driven Sentimental Analysis in the Social Semantic Web
    Matteo Baldoni, Cristina Baroglio, Viviana Patti, Paolo Rena
  3. A Multi-Agent Decision Support System for Dynamic Supply Chain Organization
    Luca Greco, Liliana Lo Presti, Agnese Augello, Giuseppe Lo Re, Marco La Cascia, Salvatore Gaglio
  4. A Formalism for Temporal Annotation and Reasoning of Complex Events in Natural Language
    Francesco Mele, Antonio Sorgente
  5. Interaction Mining: the new Frontier of Call Center Analytics
    Vincenzo Pallotta, Rodolfo Delmonte, Lammert Vrieling, David Walker
  6. Context-Aware Recommender Systems: A Comparison Of Three Approaches
    Umberto Panniello, Michele Gorgoglione
  7. A Multi-Agent System for Information Semantic Sharing
    Agostino Poggi, Michele Tomaiuolo
  8. Temporal characterization of the requests to Wikipedia
    Antonio J. Reinoso, Jesus M. Gonzalez-Barahona, Rocio Muñoz-Mansilla, Israel Herraiz
  9. From Logical Forms to SPARQL Query with GETARUN
    Rocco Tripodi, Rodolfo Delmonte
  10. ImageHunter: a Novel Tool for Relevance Feedback in Content Based Image Retrieval
    Roberto Tronci, Gabriele Murgia, Maurizio Pili, Luca Piras, Giorgio Giacinto

September 8, 2011

Press.net News Ontologies & rNews

Filed under: Linked Data,LOD,Ontology — Patrick Durusau @ 5:58 pm

Press.net News Ontologies

From the webpage:

The news ontology is comprised of several ontologies, which describe assets (text, images, video) and the events and entities (people, places, organisations, abstract concepts etc.) that appear in news content. The asset model is the representation of news content as digital assets created by a news provider (e.g. text, images, video and data such as csv files). The domain model is the representation of the ‘real world’ which is the subject of news. There are simple entities, which we have labelled with the amorphous term of ‘stuff‘ and complex entities. Currently, the only complex entity the ontology is concerned with is events. The term stuff has been used to include abstract and intangible concepts (e.g. Feminism, Love, Hate Crime etc.) as well as tangible things (e.g. Lord Ashdown, Fiat Punto, Queens Park Rangers).

Assets (news content) are about things in the world (the domain model). The connection between assets and the entities that appear in them is made using tags. Assets are further holistically categorised using classification schemes (e.g. IPTC Media Topic Codes, Schema.org Vocabulary or Press Association Categorisations).
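The split between the asset model and the domain model, joined by tags, can be sketched in a few lines. This is only an illustration of the description above, not the Press.net ontology itself: the class names, fields, URIs, and the sample event are assumptions of mine, while the entity labels come from the quoted text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Stuff:
    """A simple entity in the domain model, tangible or abstract."""
    label: str


@dataclass(frozen=True)
class Event:
    """The one complex entity the description mentions."""
    label: str
    involves: tuple = ()  # the Stuff taking part in the event


@dataclass
class Asset:
    """A piece of news content in the asset model."""
    uri: str
    media_type: str                                  # "text", "image", "video", ...
    tags: list = field(default_factory=list)         # entities the asset is about
    categories: list = field(default_factory=list)   # classification scheme codes


def assets_about(assets, entity):
    """Return the assets tagged with the given entity."""
    return [asset for asset in assets if entity in asset.tags]


qpr = Stuff("Queens Park Rangers")
feminism = Stuff("Feminism")
match = Event("Hypothetical league match", involves=(qpr,))

assets = [
    Asset("http://example.org/asset/1", "text", tags=[qpr, match]),
    Asset("http://example.org/asset/2", "image", tags=[feminism]),
]
print([asset.uri for asset in assets_about(assets, qpr)])
```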

No sooner had I seen that on the LOD list than Stéphane Corlosquet pointed out rNews, another ontology for news.

From the rNews webpage:

rNews is a proposed standard for using RDFa to annotate news-specific metadata in HTML documents. The rNews proposal has been developed by the IPTC, a consortium of the world’s major news agencies, news publishers and news industry vendors. rNews is currently in draft form and the IPTC welcomes feedback on how to improve the standard in the rNews Forum.

I am sure there are others.

Although I rather like “stuff” as an alternative to SUMO’s “thing.” Or was that Cyc?

The point being that mapping strategies, when the expense can be justified, are the “answer” to the semantic diversity and richness of human discourse.

Bioportal 3.2

Filed under: Bioinformatics,Biomedical,Ontology — Patrick Durusau @ 5:50 pm

Bioportal 3.2

From the announcement:

The National Center for Biomedical Ontology is pleased to announce the release of BioPortal 3.2.

New features include updates to the Web interface and Web services:

Added Ontology Recommender feature, http://bioportal.bioontology.org/recommender
Added support for access control for viewing ontologies
Added link to subscribe to BioPortal Notes emails
Synchronized “Jump To” feature with ontology parsing and display
Added documentation on Ontology Groups
Annotator Web service – disabled use of “longest only” parameter when also selecting “ontologies to expand” parameter
Removed the metric “Number of classes without an author”
Handling of obsolete terms, part 1 – term name is grayed out and element is returned in Web service response for obsolete terms from OBO and RRF ontologies. This feature will be extended to cover OWL ontologies in a subsequent release.

Bug Fix

Fixed calculation of “Classes with no definition” metric
Added re-direct from old BioPortal URL format to new URL format to provide working links from archived search results

Firefox Extension for NCBO API Key:

To make it easier to test Web service calls from your browser, we have released the NCBO API Key Firefox Extension. This extension will automatically add your API Key to NCBO REST URLs any time you visit them in Firefox. The extension is available at Mozilla’s Add-On site. To use the extension, follow the installation directions, restart Firefox, and add your API Key into the “Options” dialog menu on the Add-Ons management screen. After that, the extension will automatically append your stored API Key any time you visit http://rest.bioontology.org.
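For anyone scripting calls rather than using a browser, the same idea (append the stored key to any URL on the REST host) is easy to sketch. This is not the extension's code: the query parameter name `apikey`, the example path, and the key value are assumptions for illustration, so check the NCBO documentation for the actual parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

REST_HOST = "rest.bioontology.org"  # host named in the announcement


def with_api_key(url, api_key, param="apikey"):
    """Return url with the API key added, but only for the NCBO REST host."""
    parts = urlsplit(url)
    if parts.hostname != REST_HOST:
        return url  # leave every other site untouched
    # Drop any existing key so the stored one is not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, api_key))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical path and key, for illustration only.
print(with_api_key("http://rest.bioontology.org/some/path?x=1", "MY-KEY"))
```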

Upcoming software license change:

The next release of NCBO software will be under the two-clause BSD license rather than under the currently used three-clause BSD license. This change should not affect anyone’s use of NCBO software and this change is to a less restrictive license. More information about these licenses is available at the site: http://www.opensource.org/licenses. Please contact support at bioontology.org with any questions concerning this change.

Even if you aren’t active in the bioontology area, you need to spend some time with this site.

September 4, 2011

Semantic Integration in the IFF

Filed under: Category Theory,Ontology — Patrick Durusau @ 7:20 pm

Semantic Integration in the IFF by Robert E. Kent

Abstract:

The IEEE P1600.1 Standard Upper Ontology (SUO) project aims to specify an upper ontology that will provide a structure and a set of general concepts upon which domain ontologies could be constructed. The Information Flow Framework (IFF), which is being developed under the auspices of the SUO Working Group, represents the structural aspect of the SUO. The IFF is based on category theory. Semantic integration of object-level ontologies in the IFF is represented with its fusion construction. The IFF maintains ontologies using powerful composition primitives, which includes the fusion construction.

Comments: Presented at the Semantic Integration Workshop of the 2nd International Semantic Web Conference (ISWC2003), Sanibel Island, Florida, October 20, 2003.

IFF = Information Flow Framework. From Barwise, Jon and Jerry Seligman. Information Flow: The Logic of Distributed Systems. Cambridge Tracts in Theoretical Computer Science 44. Cambridge University Press. 1997.

Historical document at this point but interesting nonetheless. Describes a category theory view of semantic integration.

September 3, 2011

Schema VOAG

Filed under: Attribution,Governance,Ontology — Patrick Durusau @ 6:46 pm

Schema VOAG

From the website:

VOAG stands for “Vocabulary Of Attribution and Governance”. The ontology is intended to specify licensing, attribution, provenance and governance of an ontology. VOAG captures many common license types and their restrictions. Where a license requires attribution, VOAG provides resources that allow the attribution to be made. Provenance is defined in terms of source and pedigree. A minimal model of governance is provided based on how issues, releases and changes are managed. VOAG does not import, but makes use of some concepts from VOID (http://vocab.deri.ie/void), notably void:Dataset.

July 6, 2011

SERIMI

Filed under: Ontology,RDF — Patrick Durusau @ 2:13 pm

SERIMI (version 0.9), a tool for automatic RDF data interlinking

From the announcement:

SERIMI matches instances between a source and a target dataset, without prior knowledge of the data, domain or schema of these datasets. Experiments conducted with benchmark collections demonstrate that our approach considerably outperforms published state-of-the-art automatic approaches for solving the interlinking problem in the Linked Data Cloud. An updated reference alignment between Dailymed[1] and TCM[2] that can be used as a golden set is also available for download.

[1] http://www4.wiwiss.fu-berlin.de/dailymed/
[2] http://code.google.com/p/junsbriefcase/wiki/TGDdataset

For the details, see: SERIMI-TECH-REPORT-v2.pdf.

Just skimmed the paper before posting. Deeply interesting work based on Tversky’s contrast model. “Tversky, A. (1977). Features of similarity. Psychological Review 84 (4), 327–352.” As of today, Tversky’s work has been cited 1598 times so it will take a while to look through the subsequent work.
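For readers who have not met it, Tversky's contrast model scores the similarity of two objects from their feature sets: S(A, B) = θ·f(A ∩ B) − α·f(A − B) − β·f(B − A), with shared features counting for and distinctive features counting against. The sketch below is a plain reading of that formula, not SERIMI's implementation: it uses set size as the salience function f, and the weights and feature sets are assumptions for illustration.

```python
def tversky_contrast(a, b, theta=1.0, alpha=0.5, beta=0.5, salience=len):
    """Tversky (1977) contrast model over two feature sets.

    theta weights the shared features; alpha and beta weight the features
    found only in a and only in b. salience maps a feature set to a number.
    """
    a, b = set(a), set(b)
    return (theta * salience(a & b)
            - alpha * salience(a - b)
            - beta * salience(b - a))


# Hypothetical feature sets for two records describing the same drug.
source = {"aspirin", "tablet", "oral"}
target = {"aspirin", "oral", "capsule", "bayer"}

print(tversky_contrast(source, target))                       # symmetric weights
print(tversky_contrast(source, target, alpha=1.0, beta=0.0))  # penalize only source extras
print(tversky_contrast(target, source, alpha=1.0, beta=0.0))  # same weights, roles swapped
```

With unequal α and β the score depends on which object is treated as the subject of the comparison, which is the asymmetry the contrast model is known for.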

June 13, 2011

Why Schema.org Will Win

Filed under: Ontology,OWL,RDF,Schema,Semantic Web — Patrick Durusau @ 7:04 pm

It isn’t hard to see why schema.org is going to win out over “other” semantic web efforts.

The first paragraph at the schema.org website says why:

This site provides a collection of schemas, i.e., html tags, that webmasters can use to markup their pages in ways recognized by major search providers. Search engines including Bing, Google and Yahoo! rely on this markup to improve the display of search results, making it easier for people to find the right web pages.

  • Easy: Uses HTML tags
  • Immediate Utility: Recognized by Bing, Google and Yahoo!
  • Immediate Payoff: People can find the right web pages (your web pages)

Ironic that when HTML came on the scene, any number of hypertext engines offered more complex and useful approaches to hypertext.

But the advantages of HTML were:

  • Easy: Used simple tags
  • Immediate Utility: Useful to the author
  • Immediate Payoff: Joins hypertext network for others to find (your web pages)

I think the third advantage in each case is the crucial one. We are vain enough that making our information more findable is a real incentive, if there is a reasonable expectation of it being found. Today or tomorrow. Not ten years from now.

June 9, 2011

Ontologies As Low-Lying Subjects

Filed under: Mapping,Ontology — Patrick Durusau @ 6:35 pm

While writing up a call for papers on “integration” of ontologies, it occurred to me that ontologies are really low-lying subjects for topic maps.

Any text corpus or database is going to require extraction of its content as a first step.

Your second step is going to be processing that extracted content to identify subjects.

Your third step is going to be creating topics and associations between topics, along with the properties of topics.

Your fourth step, depending on the purpose of your topic map, will be to create pointers back into the content for users (occurrences).

And finally, your fifth step is going to be fashioning the interface your users will use for the topic map.

Compare those steps to topic mapping ontologies:

Your first step isn’t extraction of the data because while ontologies may exist in some format, they are free standing sets of subjects.

Your second step won’t be to identify subjects because the ontology already has subjects identified. (Yes, there are other subjects you could identify but this is the low-lying fruit version).

You avoid the third step because subjects in an ontology already have properties and relationships to other subjects.

You don’t need pointers because the entire ontology fits into your topic map, so no fourth step.

You have a familiar interface, the ontology itself, which leaves you with no fifth step.

Well, that’s a slight exaggeration. 😉

You do need the third step where subjects in the ontology get properties in addition to the ones they have in their respective ontologies. Those added properties enable the same subjects in different ontologies to merge. If their respective properties are also subjects in the ontology, that is they can have properties, you should be able to merge those properties as well.
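That remaining third step can be sketched as follows: give subjects from different ontologies an added identifying property, then merge any topics that share one. This is a simplified illustration, not the merging rules of the topic maps standards: the `Topic` class, the identifier strings, and the sample properties are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    identifiers: set                                # added identifying properties
    names: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)  # property name -> set of values


def merge_topics(topics):
    """Merge topics that share at least one identifier, pooling their properties."""
    merged = []
    for topic in topics:
        current = Topic(set(topic.identifiers), set(topic.names),
                        {k: set(v) for k, v in topic.properties.items()})
        untouched = []
        for other in merged:
            if other.identifiers & current.identifiers:
                # Same subject: absorb everything the other topic says about it.
                current.identifiers |= other.identifiers
                current.names |= other.names
                for key, values in other.properties.items():
                    current.properties.setdefault(key, set()).update(values)
            else:
                untouched.append(other)
        merged = untouched + [current]
    return merged


# Two ontologies describe the heart; the example.org identifier is the added property.
fma_heart = Topic({"fma:heart", "http://example.org/subject/heart"},
                  {"Heart"}, {"part_of": {"Cardiovascular system"}})
snomed_heart = Topic({"snomed:heart-structure", "http://example.org/subject/heart"},
                     {"Heart structure"}, {"is_a": {"Organ"}})
snomed_lung = Topic({"snomed:lung-structure"}, {"Lung structure"})

result = merge_topics([fma_heart, snomed_heart, snomed_lung])
print(len(result))  # 2: one merged heart topic, one lung topic
```

The merged heart topic keeps the names and properties from both sources, so nothing either ontology said is lost. Whether that mapping is the right one is the usefulness question raised in the next paragraph.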

I realize that the originators of some ontologies may disagree with the mappings but that really isn’t the appropriate question. The question is whether users find the mapping useful for some particular purpose. I am not sure what other test one would have.

June 6, 2011

OBML 2011 – 3. Workshop of Ontologies in Biomedicine and Life Sciences

Filed under: Biomedical,Conferences,Ontology — Patrick Durusau @ 2:00 pm

OBML 2011 – 3. Workshop of Ontologies in Biomedicine and Life Sciences

Important Dates

Submission of papers: June 30, 2011
Notification of review results: August 10, 2011
Deadline for revised versions: September 9, 2011
Workshop: October 6-7, 2011

Goals of the OBML

The series “Ontologies in Biomedicine and Life Sciences” (OBML workshop) was initiated by the workgroup for OBML of the German Society for Computer Science in 2009. The OBML aims to bring together scientists who are working in this area to exchange ideas and discuss new results, to start collaborations and to initiate new projects. The OBML workshop is held once annually and deals with all fundamental aspects of biomedical ontologies as well as additional “hot” topics.

Submissions are requested especially for the following topics:

  • Ontologies and terminologies in biology, medicine, and clinical research;
  • Ontologies for knowledge representation, methods of reasoning, integration and interoperability of ontologies;
  • Methods and tools for the construction and management of ontologies; and 
  • Applications of the Semantic Web in biomedicine and the life sciences.

The focus of the OBML-2011 is Phenotype ontologies in medicine and biomedical research.

“Integration” and “interoperability,” it sounds like they are singing the topic map song! 😉

May 31, 2011

Biomedical Annotation…Webinar
1 June 2011 – 10 AM PT (17:00 GMT)

Filed under: Bioinformatics,Biomedical,Ontology — Patrick Durusau @ 6:42 pm

Biomedical Annotation by Humans and computers in a Keyword-driven world

From the website:

Abstract:

As part of our project with the NCBO we have been curating expression experiments housed in NCBI’s GEO data base and annotating a variety of rat-related records using the NCBO Annotator and more recently, mining data from the NCBO Resource Index. The annotation pipelines and curation tools that we have built have demonstrated some strengths and shortfalls of automated ontology annotation. Similarly our manual curation of these records highlights areas where human involvement could be improved to better address the fact that we are living in the Google era where findability is King.

Speaker Bio:

Simon Twigger currently splits his time between being an Assistant Professor in the Human and Molecular Genetics Center at the Medical College of Wisconsin in Milwaukee and exploring the iPhone and iPad as mobile platforms for education and interaction. At MCW he has been an investigator on the Rat Genome Database project for the past 10 years, he worked with the Gene Ontology project and has been active in the BioCuration community as co-organizer of the past three International BioCuration meetings. He is the former Director of Bioinformatics for the MCW Proteomics Center and was previously the Biomedical Informatics Key Function Director for the MCW Clinical & Translational Science Institute. He is a Semantic web enthusiast and is eagerly awaiting the rapture of Web 3.0 when all the data will be taken up into the Linked Data cloud and its true potential realized.

Annotation, useful annotation anyway, is based on recognition of the subject of annotation. Should prove to be an interesting presentation.


Notes from the webinar:

(My personal notes while viewing the webinar in real time. The webinar controls in all cases of conflict. Posted to interest others in viewing the stored version of the webinar.)

  • Rat Genome Database: http://rgd.mcw.edu
  • interesting questions that researchers ask
  • Where to find answers, PubMed 20 million+ citations, almost 1 per minute
  • search is the critical thing – in all interfaces
  • “Being able to find information is of great importance to researchers.”
  • NCBO Annotator www.bioontology.org/wiki/index.php/Annotator_Web_service
  • records annotated – curated the raw annotations – manual effort needed to track it down
  • rat strain synonyms has issues
  • work flow description
  • mouse gut maps to course (ex. of mapping issue)
  • Linking annotations to data
  • RatMine faceted-search + lucene text indexing, interesting widgets
  • Driving “Biological” Problem Part 2 – 55.6% of researchers rarely use archival databases, 56.0% rarely use published literature
  • 3rd International biocurator meeting Amos Bairoch – “trying to second guess what the authors really did and found.”
  • post-publication effort to make content be found. different from academic model where publication simply floats along.
  • illustration of where the annotation path fails and the consequences of that failure.
  • very cool visualization of how annotations can be visualized and the value thereof
  • put in keywords and don’t care about it being found (paper)
  • NCBO Resource Index could be a “semantic warehouse” of connections
  • websites: gminer.mcw.edu, github.com/mcwbbc/, bioportal.bioontology.org, simont -at- mcw.edu @simon_t

May 23, 2011

ISO initiative OntoIOp (Ontology interoperability)

Filed under: Interoperability,Ontology — Patrick Durusau @ 7:46 pm

ISO initiative OntoIOp (Ontology interoperability)

Prof. Dr. Till Mossakowski posted the following note to the ontolog-forum today:

Dear all,

we are currently involved in a new ISO standardisation initiative concerned with ontology interoperability.

This initiative is somehow orthogonal and complementary to Common Logic, because the topic is interoperability. This means interoperability both among ontologies (i.e. concerning matching, alignment, and suitable means to write these down) as well as among ontology languages (e.g. OWL, UML, Common Logic, or F-logic, and translations among these). The idea is to have all these languages as part of a meta-standard, such that ontology designers can bring in their ontologies verbatim as they are, and yet relate them to other ontologies (e.g. check that an OWL version of some ontology is entailed by its first-order formulation).

The first official meeting for this is already mid next month in Seoul, and we now quickly have to move forward getting some countries into the boat. It will be essential to have experts from all relevant communities involved in this effort.

If you are interested in this initiative, the rough draft [1] for the standard and a related paper [2] will give you some more info. Please have a look and let me know what you think. We also look for people who want to officially take part in the development of the standard, either actively or just by voting on behalf of your national standardisation body.

All the best,
Till

[1] http://www.dfki.de/sks/till/papers/OntoIOp.pdf
[2] http://www.dfki.de/sks/till/papers/ontotrans.pdf

I haven’t had time to review the documents but given the time frame wanted to bring this to your attention sooner rather than later.

When you have reviewed the documents, comments welcome.

May 1, 2011

Ontology: A Practical Guide

Filed under: Ontology — Patrick Durusau @ 5:24 pm

Ontology: A Practical Guide by Adam Pease.

From the announcement:

This new book reports on a decade of work developing SUMO and its associated tools, models and domain ontologies. Written for a wide audience, it should be accessible to anyone with a general computer science background. It includes introductions to topics such as formal theorem proving and the properties of different formal knowledge representation languages.

The book is suitable as a self-study guide for the professional, student or researcher. It also includes a number of exercises with selected answers, making it appropriate as a textbook for a senior year or graduate level course in AI knowledge representation.

Adam Pease is the Technical Editor for the Suggested Upper Merged Ontology (SUMO) project.

I am waiting for my copy to arrive! More comments to follow.

March 27, 2011

Ontology Driven Implementation of Semantic Services for the Enterprise Environment (ODISSEE) Workshop

Filed under: Conferences,Ontology — Patrick Durusau @ 3:13 pm

Ontology Driven Implementation of Semantic Services for the Enterprise Environment (ODISSEE) Workshop

April 12-13, 2011 · 8:30 a.m. – 4:30 p.m.

From the website:

Alion Science and Technology and the National Center for Ontological Research (NCOR, University at Buffalo) will host a two-day “Ontology Driven Implementation of Semantic Services for the Enterprise Environment (ODISSEE)” Workshop. ODISSEE aims to foster awareness of and collaboration between disparate information-sharing efforts across the US Government. The workshop will feature individual presentations on information-sharing development, as well as panel sessions on ontology and data vocabulary. This workshop supports the Joint Planning and Development Office (JPDO) information sharing initiatives. Information sharing is at the heart of the transformation from the current state of the National Airspace System (NAS) to NextGen capabilities in 2025 in areas such as unmanned aircraft systems, integrated surveillance and weather.

WORKSHOP OBJECTIVES:

  • Identify and catalogue the various semantic technology efforts across the Federal government.
  • Identify, evaluate, and catalogue standard information-exchange models, such as Universal Core (UCore) and National Information Exchange Model (NIEM) and semantic models of common domains, including time, geography, and events.
  • Explore the use of ontologies to enable information exchanges within a service-oriented architecture (SOA), improve discoverability of services, and align disparate data standards and message models.
  • Coordinate ontology development across diverse Communities of Interest (COIs) to ensure extensibility, interoperability, and reusability.

You guessed from the title this was a government based workshop. Yes? 😉

Looks like a good opportunity to at least meet some of the players in this activity space.

Topic maps certainly qualify as an information-exchange model so that could be one starting point for conversation.

Others?

March 4, 2011

Castles Made of Sand or Blowing in the Wind?

Filed under: Ontology — Patrick Durusau @ 3:30 pm

Products Types Ontology

I haven’t covered the 300,000 product descriptions offered by this site because I could not choose a blog title between “castles made of sand” and “blowing in the wind.”

From the webpage:

Your idea sucks: What you call an ontology is no ontology, because it lacks an axiomatic theory.

First, this is not question but a statement. Second, yes, you are absolutely right: Besides the rdfs:subClassOf axiom, we don’t have any formal semantics for each class. Third: Your ontology lacks

  • social grounding (ours: constant challenging by millions of reviews and revisions),
  • ….

The line: social grounding (ours: constant challenging by millions of reviews and revisions) captures the problem doesn’t it?

Your classes are constantly changing and so I won’t know if your class tomorrow means the same thing when I used it today. (Hence, the “castles made of sand” line as a possible header.)

Yes?

But, social grounding is at work on both ends, that is my use of an identification has a social grounding.

So we have an uncertain/changing meaning to your classes, being applied to an equally uncertain/changing meaning in my application of your class. (Hence, the “blowing in the wind” line as a possible header.)

There is other information about each of the “300,000” (is that a possible movie title?) classes, but we don’t know what information has to match to identify a particular class. Or to tell others why we used one class and not another.

Appreciate the social grounding but identifiers without more leave sand moving under our feet and don’t enable us to make meaningful statements about our choices of identifiers.

*****

PS: Show of hands your preference for “Castles Made of Sand” or “Blowing in the Wind” as a title.

PPS: Best of luck with the axiomatic critics. Axioms are all they have, how does that go, “…tis an ill-favored thing, Sir, but mine own”? Something like that.

February 18, 2011

10th International Semantic Web Conference (ISWC 2011) – Call for Papers

Filed under: Conferences,Ontology,Semantic Web — Patrick Durusau @ 5:15 am

10th International Semantic Web Conference (ISWC 2011) – Call for Papers

The 10th International Semantic Web Conference (ISWC 2011) will be in Bonn, Germany, October 23-27, 2011.

From the call:

Key Topics

  • Management of Semantic Web Data
  • Natural Language Processing
  • Ontologies and Semantics
  • Semantic Web Engineering
  • Social Semantic Web
  • User Interfaces to the Semantic Web
  • Applications of the Semantic Web

Tracks and Due Dates:

Research Papers
http://iswc2011.semanticweb.org/calls/research-papers/

Semantic Web In Use
http://iswc2011.semanticweb.org/calls/semantic-web-in-use/

Posters and Demos
http://iswc2011.semanticweb.org/calls/posters-and-demos/

Doctoral Consortium
http://iswc2011.semanticweb.org/calls/doctoral-consortium/

Tutorials http://iswc2011.semanticweb.org/calls/tutorials/

Workshops http://iswc2011.semanticweb.org/calls/workshops/

Semantic Web Challenge http://iswc2011.semanticweb.org/calls/semantic-web-challenge/

Linked Data-a-thon
http://iswc2011.semanticweb.org/calls/linked-data-a-thon/

DataLift

Filed under: Dataset,Linked Data,Ontology,RDF — Patrick Durusau @ 5:12 am

DataLift

The DataLift project will no doubt produce some useful tools and output but reading its self-description:

The project will provide tools allowing to facilitate each step of the publication process:

  1. selecting ontologies for publishing data
  2. converting data to the appropriate format (RDF using the selected ontology)
  3. publishing the linked data
  4. interlinking data with other data sources

I am struck by how futile the effort sounds in the face of petabytes of data flow, changing semantics of that data and changing semantics of other data, with which it might be interlinked.

The nearest imagery I can come up with is trying to direct the flow of a tsunami with a roll of paper towels.

It is certainly brave (I forgo usage of the other term) to try but ultimately isn’t very productive.

First, any scheme that starts with conversion to a particular format is an automatic loser.

The source format is itself composed of subjects that are discarded by the conversion process.

Moreover, what if we disagree about the conversion?

Remember all the semantic diversity that gave rise to this problem? Where did it get off to?

Second, the interlinking step introduces brittleness into the process.

Both in terms of the ontology that any particular data must follow and in terms of the resolution of any linkage.

Other data sources can only be linked in if they use the correct ontology and format. And that assumes they are reachable.

I hope the project does well, but at best it will result in another semantic flavor to be integrated using topic maps.

*****
PS: The use of data heaven betrays the religious nature of the Linked Data movement. I don’t object to Linked Data. What I object to is the missionary conversion aspects of Linked Data.

February 15, 2011

UMBEL – Reference Concept Ontology and Vocabulary 1.0

Filed under: Ontology,UMBEL,Vocabularies — Patrick Durusau @ 1:41 pm

UMBEL – Reference Concept Ontology and Vocabulary 1.0 has been released!

From the website:

This is the official Web site for the UMBEL Vocabulary and Reference Concept Ontology (namespace: umbel). UMBEL is the Upper Mapping and Binding Exchange Layer, designed to help content interoperate on the Web.

UMBEL provides two valuable functions:

  • First, it is a vocabulary for the construction of concept-based domain ontologies, designed to act as references for the linking and mapping of external content, and
  • Second, it is its own broad, general reference structure of 28,000 concepts, which provides a scaffolding to orient other datasets and domain vocabularies.

The mappings in Annex F: Mapping with UMBEL are with owl:sameAs and umbel:isLike.

I would prefer more specific reasons for mapping, particularly given the varying use of owl:sameAs. It could mean just about anything.

Still, this is a valuable data set, although I would use it for mappings with more specific reasoning disclosed as part of the mapping.
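To make "more specific reasoning disclosed as part of the mapping" concrete, here is a minimal sketch of what such a mapping record might look like. Everything in it is an assumption for illustration: the Mapping class, the matched_on field, and the example.org identifiers are hypothetical and are not part of UMBEL or OWL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """A mapping between two identifiers, with the grounds for it (hypothetical structure)."""
    source: str
    target: str
    predicate: str          # e.g. "owl:sameAs" or "umbel:isLike"
    matched_on: tuple = ()  # the properties someone compared before asserting the mapping


def is_auditable(mapping):
    """A mapping can be reviewed by others only if it says what was compared."""
    return len(mapping.matched_on) > 0


# Hypothetical identifiers; neither is taken from the UMBEL data set.
bare = Mapping("http://example.org/a/Dog", "http://example.org/b/Dog", "owl:sameAs")
explained = Mapping(
    "http://example.org/a/Dog",
    "http://example.org/b/Dog",
    "owl:sameAs",
    matched_on=("preferred label", "parent concept"),
)

print(is_auditable(bare))       # False: we only know that someone asserted sameness
print(is_auditable(explained))  # True: we know which properties were compared
```

The point of the sketch is only that the second record lets a later reader agree or disagree with the mapping, while the first leaves them guessing.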

PS: It has a really cool logo!

February 11, 2011

Sowa on Watson

Filed under: Cyc,Ontology,Semantic Web,Subject Identifiers,Subject Identity — Patrick Durusau @ 6:43 am

John Sowa’s posting on Watson merits reproduction in its entirety (light editing to format it for easy reading):

Peter,

Thanks for the reminder:

Dave Ferrucci gave a talk on UIMA (the Unstructured Information Management Architecture) back in May-2006, entitled: “Putting the Semantics in the Semantic Web: An overview of UIMA and its role in Accelerating the Semantic Revolution”

I recommend that readers compare Ferrucci’s talk about UIMA in 2006 with his talk about the Watson system and Jeopardy in 2011. In less than 5 years, they built Watson on the UIMA foundation, which contained a reasonable amount of NLP tools, a modest ontology, and some useful tools for knowledge acquisition. During that time, they added quite a bit of machine learning, reasoning, statistics, and heuristics. But most of all, they added terabytes of documents.

For the record, following are Ferrucci’s slides from 2006:

http://ontolog.cim3.net/file/resource/presentation/DavidFerrucci_20060511/UIMA-SemanticWeb–DavidFerrucci_20060511.pdf

Following is the talk that explains the slides:

http://ontolog.cim3.net/file/resource/presentation/DavidFerrucci_20060511/UIMA-SemanticWeb–DavidFerrucci_20060511_Recording-2914992-460237.mp3

And following is his recent talk about the DeepQA project for building and extending that foundation for Jeopardy:

http://www-943.ibm.com/innovation/us/watson/watson-for-a-smarter-planet/building-a-jeopardy-champion/how-watson-works.html

Compared to Ferrucci’s talks, the PBS Nova program was a disappointment. It didn’t get into any technical detail, but it did have a few cameo appearances from AI researchers. Terry Winograd and Pat Winston, for example, said that the problem of language understanding is hard.

But I thought that Marvin Minsky and Doug Lenat said more with their tone of voice than with their words. My interpretation (which could, of course, be wrong) is that both of them were seething with jealousy that IBM built a system that was competing with Jeopardy champions on national TV — and without their help.

In any case, the Watson project shows that terabytes of documents are far more important for commonsense reasoning than the millions of formal axioms in Cyc. That does not mean that the Cyc ontology is useless, but it undermines the original assumptions for the Cyc project: commonsense reasoning requires a huge knowledge base of hand-coded axioms together with a powerful inference engine.

An important observation by Ferrucci: The URIs of the Semantic Web are *not* useful for processing natural languages — not for ordinary documents, not for scientific documents, and especially not for Jeopardy questions:

1. For scientific documents, words like ‘H2O’ are excellent URIs. Adding an http address in front of them is pointless.

2. A word like ‘water’, which is sometimes a synonym for ‘H2O’, has an open-ended number of senses and microsenses.

3. Even if every microsense could be precisely defined and cataloged on the WWW, that wouldn’t help determine which one is appropriate for any particular context.

4. Any attempt to force human being(s) to specify or select a precise sense cannot succeed unless *every* human understands and consistently selects the correct sense at *every* possible occasion.

5. Given that point #4 is impossible to enforce and dangerous to assume, any software that uses URIs will have to verify that the selected sense is appropriate to the context.

6. Therefore, URIs found “in the wild” on the WWW can never be assumed to be correct unless they have been guaranteed to be correct by a trusted source.

These points taken together imply that annotations on documents can’t be trusted unless (a) they have been generated by your own system or (b) they were generated by a system which is at least as trustworthy as your own and which has been verified to be 100% compatible with yours.

In summary, the underlying assumptions for both Cyc and the Semantic Web need to be reconsidered.

You can see the post at: http://ontolog.cim3.net/forum/ontolog-forum/2011-02/msg00114.html

I don’t always agree with Sowa but he has written extensively on conceptual graphs, knowledge representation and ontological matters. See http://www.jfsowa.com/

I missed the local showing but found the video at: Smartest Machine on Earth.

You will find a link to an interview with Minsky at that same location.

I don’t know that I would describe Minsky as “…seething with jealousy….”

While I enjoy Jeopardy and it is certainly more cerebral than say American Idol, I think Minsky is right in seeing the Watson effort as something other than artificial intelligence.

Q: In 2011, who was the only non-sentient contestant on the TV show Jeopardy?

A: What is IBM’s Watson?

January 11, 2011

1st International Workshop on Semantic Publication (SePublica 2011)

Filed under: Conferences,Ontology,OWL,RDF,Semantic Web,SPARQL — Patrick Durusau @ 7:24 pm

1st International Workshop on Semantic Publication (SePublica 2011) in connection with 8th Extended Semantic Web Conference (ESWC 2011), May 29th or 30th, Hersonissos, Crete, Greece.

From the Call for Papers:

The CHALLENGE of the Semantic Web is to allow the Web to move from a dissemination platform to an interactive platform for networked information. The Semantic Web promises to “fundamentally change our experience of the Web”.

In spite of improvements in the distribution, accessibility and retrieval of information, little has changed in the publishing industry so far. The Web has succeeded as a dissemination platform for scientific and non-scientific papers, news, and communication in general; however, most of that information remains locked up in discrete documents, which are poorly interconnected to one another and to the Web.

The connectivity tissues provided by RDF technology and the Social Web have barely made an impact on scientific communication nor on ebook publishing, neither on the format of publications, nor on repositories and digital libraries. The worst problem is in accessing and reusing the computable data which the literature represents and describes.

No, I am not going to say that topic maps are the magic bullet that will solve all those issues or the ones listed in their Questions and Topics of Interest.

What I do think topic maps bring to the table is an awareness that semantic interoperability isn’t primarily a format or computational problem.

Every new (and impliedly universal) format or model simply compounds the semantic interoperability problem.

By creating yet more formats and/or models between which semantic interoperability has to be designed.

Starting with the question of what subjects need to be identified and how they are identified now could lead to a viable, local semantic interoperability solution.

What more could a client want?

Local semantic interoperability solutions can form the basis for spreading semantic interoperability, one solution at a time.

*****
PS: Forgot the important dates:

Paper/Demo Submission Deadline: February 28, 23:59 Hawaii Time

Acceptance Notification: April 1

Camera Ready Version: April 15

SePublica Workshop: May 29 or May 30 (to be announced)

January 7, 2011

User Performance Using An Ontology-Driven Information Retrieval (ONTOIR) System

Filed under: Ontology,Topic Maps — Patrick Durusau @ 3:32 pm

User Performance Using An Ontology-Driven Information Retrieval (ONTOIR) System Authors: Myongho Yi

Also published by VDM Verlag User Performance Using An Ontology-Driven Information Retrieval (ONTOIR) System

Abstract:

Enhancing the representation and relationship of information through ontology is a promising alternative approach for knowledge organization. This improved knowledge organization is vital for collocation of information and effective and efficient searching. This study concerned the testing of user performance when searching an ontology-driven information retrieval (ONTOIR) system that shows explicit relationships among resources. The study explores the possibilities of improving user performance in searching for information. The goal was to examine whether or not ontology enhances user performance in terms of recall and search time. The experiment was conducted with 40 participants to evaluate and compare the differences in user performance (recall and search time) between an ontology-driven information retrieval system and a traditional, thesaurus-driven information retrieval system.

Better recall and shorter search time were found when conducting relationship-based queries in an ontology-driven information retrieval system as compared to a thesaurus-based system. Further studies comparing user performance with a cluster-based search engine and an ontology-driven information retrieval system are needed.

FYI, the first link is $89.00 less than the second one.

A topic map was used to deliver the ontology-driven search and navigation interface.
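For readers who have not met the recall measure used in the abstract, it is the fraction of the relevant items that a search actually returned. A small sketch follows; the document identifiers and result lists are invented for illustration and are not data from the study.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant items that appear among the retrieved items."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when nothing is relevant")
    return len(set(retrieved) & relevant) / len(relevant)


# Invented example: four documents are relevant to the query.
relevant_docs = {"d1", "d2", "d3", "d4"}

# Hypothetical result lists from two systems for the same query.
thesaurus_results = ["d1", "d7"]
ontology_results = ["d1", "d2", "d3", "d9"]

print(recall(thesaurus_results, relevant_docs))  # 0.25
print(recall(ontology_results, relevant_docs))   # 0.75
```

Search time, the study's other measure, is simply the elapsed time per task and needs no formula.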

December 15, 2010

Graph Databases – Intro Slide Deck – Ontologies – Connectedness

Filed under: Graphs,Ontology — Patrick Durusau @ 8:27 am

Graph (Theory and Databases) is a nice overview of graph theory and databases by Pere Urbón-Bayes. I saw it first at: Alex Popescu’s site.

I do have a quibble about slide 14 with the usual graph showing progress towards Everything connected.

To your lower left are ontologies, RDF, Linked Data, Tagging, moving I suppose from less to more connected.

There is only one problem: Everything is already connected.

It doesn’t need electronic or other information systems for connections.

What is at issue is the representation of connections in electronic information systems.

The reason I emphasize that point is that all representations in electronic information systems are partial representations of some connections.

And as far as that goes, all representations do better with some aspects of connections than others.

For example, I don’t think that ontologies are further up the connection line than folksonomies.

Which one is more connected depends on your particular requirements.

December 14, 2010

NKE: Navigational Knowledge Engineering

Filed under: Authoring Topic Maps,Ontology,Subject Identity,Topic Maps — Patrick Durusau @ 5:36 pm

NKE: Navigational Knowledge Engineering

From the website:

Although structured data is becoming widely available, no other methodology – to the best of our knowledge – is currently able to scale up and provide light-weight knowledge engineering for a massive user base. Using NKE, data providers can publish flat data on the Web without extensively engineering structure upfront, but rather observe how structure is created on the fly by interested users, who navigate the knowledge base and at the same time also benefit from using it. The vision of NKE is to produce ontologies as a result of users navigating through a system. This way, NKE reduces the costs for creating expressive knowledge by disguising it as navigation. (emphasis in original)

This methodology may or may not succeed but it demonstrates a great deal of imagination.

Now imagine a similar concept but built around subject identity.

Where known ambiguities offer a user a choice of subjects to identify.

Or where there are different ways to identify a subject. The harder case.
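A rough sketch of both cases, under stated assumptions: a subject is represented as a set of identifiers plus a set of names, two records are taken to describe the same subject when they share at least one identifier, and a name that matches more than one subject is handed back as a list of choices. The function names and the example.org identifiers are hypothetical; this is not the NKE method, nor a full topic map merging implementation.

```python
def merge_subjects(records):
    """Merge (identifiers, names) records that share at least one identifier."""
    merged = []  # invariant: identifier sets in this list are pairwise disjoint
    for identifiers, names in records:
        ids, nms = set(identifiers), set(names)
        remaining = []
        for m_ids, m_names in merged:
            if m_ids & ids:
                # Same subject identified a different way: absorb it.
                ids |= m_ids
                nms |= m_names
            else:
                remaining.append((m_ids, m_names))
        remaining.append((ids, nms))
        merged = remaining
    return merged


def candidates(name, subjects):
    """Return every subject that carries the given name, for the user to choose among."""
    return [subject for subject in subjects if name in subject[1]]


# Hypothetical records: the first two share an identifier, the third does not.
records = [
    ({"http://example.org/id/mercury-planet"}, {"Mercury"}),
    ({"http://example.org/id/mercury-planet", "http://example.org/astro/1"}, {"Planet Mercury"}),
    ({"http://example.org/id/mercury-element"}, {"Mercury", "Hg"}),
]

subjects = merge_subjects(records)
print(len(subjects))                         # 2
print(len(candidates("Mercury", subjects)))  # 2: an ambiguity to put in front of the user
```

The merge handles the harder case (different ways to identify one subject), while the candidate list is where a navigation interface could ask the user which subject was meant.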

Questions:

  1. Read the paper/run the demo. Comments, suggestions? (3-5 pages, no citations)
  2. How would you adapt this approach to the identification of subjects? (3-5 pages, no citations)
  3. What data set would you suggest for a test case using the technique you describe in #2? Why is that data set a good test? (3-5 pages, pointers to the data set)

CFP: 10th International Workshop on Web Semantics (WebS 2011)

Filed under: Conferences,Ontology — Patrick Durusau @ 8:32 am

CFP: 10th International Workshop on Web Semantics (WebS 2011)

The 10th International Workshop on Web Semantics (WebS 2011) will be held in conjunction with the 22nd International Workshop on Database and Expert Systems Applications (DEXA), to be held on 29 August – 02 September 2011 in Toulouse, France.

From the email post:

The special topic “Reliability of ontologies” aims on detecting reusable ontologies and measuring the reliability of possible reusable ontology candidates. How can we measure the reliability and the usability of ontologies? Which adaptations of state-of-the-art ontology engineering methodologies are necessary to support modeling reusable ontologies? What measurements for defining and comparing ontologies can be used and how could ontology repositories use them? These are some of the open research questions to be addressed by papers dedicated to this year’s special topic.

Important dates:

  • Paper submission: March 04, 2011
  • Notification of acceptance: May 16, 2011
  • WebS 2011 Workshop: 29 August – 02 September, 2011

Questions:

  1. What role could topic maps play in answering/exploring the questions for this workshop? (3-5 pages, citations)
  2. (if after the workshop) How would a topic map solution differ from the solution offered by the paper you have chosen from those presented? (3-5 pages, citations)
  3. (after the workshop, extra credit) Create a topic map of the program committee, the presenters, the affiliations of the presenters, with a visual display of the same.*

*I don’t know what you will find, if anything. It is something I have always been curious about but obviously not curious enough to do the analysis.
