Measuring the Evolution of Ontology Complexity:…

Measuring the Evolution of Ontology Complexity: The Gene Ontology Case Study by Olivier Dameron, Charles Bettembourg, Nolwenn Le Meur. (Dameron O, Bettembourg C, Le Meur N (2013) Measuring the Evolution of Ontology Complexity: The Gene Ontology Case Study. PLoS ONE 8(10): e75993. doi:10.1371/journal.pone.0075993)

Abstract:

Ontologies support automatic sharing, combination and analysis of life sciences data. They undergo regular curation and enrichment. We studied the impact of an ontology evolution on its structural complexity. As a case study we used the sixty monthly releases between January 2008 and December 2012 of the Gene Ontology and its three independent branches, i.e. biological processes (BP), cellular components (CC) and molecular functions (MF). For each case, we measured complexity by computing metrics related to the size, the nodes connectivity and the hierarchical structure.

The number of classes and relations increased monotonously for each branch, with different growth rates. BP and CC had similar connectivity, superior to that of MF. Connectivity increased monotonously for BP, decreased for CC and remained stable for MF, with a marked increase for the three branches in November and December 2012. Hierarchy-related measures showed that CC and MF had similar proportions of leaves, average depths and average heights. BP had a lower proportion of leaves, and a higher average depth and average height. For BP and MF, the late 2012 increase of connectivity resulted in an increase of the average depth and average height and a decrease of the proportion of leaves, indicating that a major enrichment effort of the intermediate-level hierarchy occurred.

The variation of the number of classes and relations in an ontology does not provide enough information about the evolution of its complexity. However, connectivity and hierarchy-related metrics revealed different patterns of values as well as of evolution for the three branches of the Gene Ontology. CC was similar to BP in terms of connectivity, and similar to MF in terms of hierarchy. Overall, BP complexity increased, CC was refined with the addition of leaves providing a finer level of annotations but decreasing slightly its complexity, and MF complexity remained stable.
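To make the abstract's three families of metrics concrete, here is a minimal sketch on a toy is-a hierarchy. The term names are made up (they are not Gene Ontology identifiers), and the definitions are my assumptions rather than the paper's exact ones: connectivity is taken as relations per class, depth as the longest path up to a root, and height as the longest path down to a leaf.

```python
from functools import lru_cache

# Toy is-a hierarchy: child -> list of parents (hypothetical terms, not real GO IDs).
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["a", "b"],
    "d": ["c"],
}

def complexity_metrics(parents):
    """Compute size, connectivity and hierarchy metrics for a DAG of classes."""
    # Invert the child -> parents map to get parent -> children.
    children = {term: [] for term in parents}
    for child, ps in parents.items():
        for p in ps:
            children[p].append(child)

    @lru_cache(maxsize=None)
    def depth(term):
        # Assumption: depth is the longest path from the term up to a root.
        return 0 if not parents[term] else 1 + max(depth(p) for p in parents[term])

    @lru_cache(maxsize=None)
    def height(term):
        # Assumption: height is the longest path from the term down to a leaf.
        return 0 if not children[term] else 1 + max(height(c) for c in children[term])

    n_classes = len(parents)
    n_relations = sum(len(ps) for ps in parents.values())
    n_leaves = sum(1 for term in parents if not children[term])
    return {
        "classes": n_classes,
        "relations": n_relations,
        "relations_per_class": n_relations / n_classes,
        "leaf_proportion": n_leaves / n_classes,
        "avg_depth": sum(depth(t) for t in parents) / n_classes,
        "avg_height": sum(height(t) for t in parents) / n_classes,
    }

metrics = complexity_metrics(PARENTS)
```

Running the same function over successive releases and comparing the results is the kind of tracking the paper describes; adding a leaf and adding an intermediate class change these numbers in different directions even though both add one class.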

Prospective and current ontology authors need to read this paper carefully.

Over a period of only five years (sixty monthly releases), the ontologies studied in this paper evolved.

That is a good thing, because the understandings that underpinned the original ontologies changed over those five years.

The lesson here is that, for all its apparent fixity, a useful ontology is no more fixed than the authors who create and maintain it and the users who use it.

At any point in time an ontology may be “fixed” for some purpose or in some view, but that is a snapshot in time, not an eternal view.

As ontologies evolve, so must the mappings that bind them with and to other ontologies.

A blind mapping, a simple juxtaposition of terms from two ontologies, is one form of mapping. It is a form that makes maintenance a difficult and chancy affair.

If, on the other hand, each term had properties that supported the recorded mapping, any maintainer could follow stated rules for maintaining that mapping.

Blind mapping: Pay the cost of mapping every time the mapped ontologies drift far enough out of sync to pinch (or lead to disaster).

Sustainable mapping: Pay the full cost of mapping once and then maintain the mapping.
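As a rough illustration of the difference, here is a minimal sketch of a mapping that records the properties it was based on, so that a maintainer can recheck them after each release. Everything in it is hypothetical: the `Mapping` class, the `unsupported_properties` helper, the term identifiers and the `parent_label` property are invented for this example and do not come from any particular ontology or tool.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mapping:
    source: str   # term identifier in the first ontology (hypothetical)
    target: str   # term identifier in the second ontology (hypothetical)
    basis: dict   # property name -> value both terms had when the mapping was recorded

def unsupported_properties(mapping, source_props, target_props):
    """Return the basis properties that no longer hold on both terms."""
    return [
        name
        for name, value in mapping.basis.items()
        if source_props.get(name) != value or target_props.get(name) != value
    ]

# The mapping was recorded because both terms sat under the same parent label.
mapping = Mapping(
    source="ontA:kinase",
    target="ontB:kinase_activity",
    basis={"parent_label": "catalytic activity"},
)

# State of the two terms at recording time: the basis still holds.
unchanged = unsupported_properties(
    mapping,
    {"parent_label": "catalytic activity"},
    {"parent_label": "catalytic activity"},
)

# A later release moves the target term under a different parent.
broken = unsupported_properties(
    mapping,
    {"parent_label": "catalytic activity"},
    {"parent_label": "transferase activity"},
)
```

A blind mapping would store only `source` and `target`, leaving nothing to check. With the basis recorded, `broken` names exactly which assumption a release invalidated, and the maintainer can decide whether the mapping still stands.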

What’s your comfort level with risk?

  • Discovery of a “smoking gun” memo on tests of consumer products.
  • Inappropriate access to spending or financial records.
  • Preservation of inappropriate emails.
  • etc.

What are you not able to find with an unmaintained ontology?
