Standards and Infrastructure for Innovation Data Exchange by Laurel L. Haak, David Baker, Donna K. Ginther, Gregg J. Gordon, Matthew A. Probus, Nirmala Kannankutty and Bruce A. Weinberg. (Science 12 October 2012: Vol. 338 no. 6104 pp. 196-197 DOI: 10.1126/science.1221840)
Appropriate that post number six thousand (6000) should report an article on data exchange standards.
But the article seems to be at war with itself.
Consider:
There is no single database solution. Data sets are too large, confidentiality issues will limit access, and parties with proprietary components are unlikely to participate in a single-provider solution. Security and licensing require flexible access. Users must be able to attach and integrate new information.
Unified standards for exchanging data could enable a Web-based distributed network, combining local and cloud storage and providing public-access data and tools, private workspace “sandboxes,” and versions of data to support parallel analysis. This infrastructure will likely concentrate existing resources, attract new ones, and maximize benefits from coordination and interoperability while minimizing resource drain and top-down control.
As quickly as the authors say “[t]here is no single database solution.”, they take a deep breath and outline the case for a uniform data sharing structure.
If there is no “single database solution,” it stands to reason there is no single infrastructure for sharing data. The same diversity that blocks the single database, impedes the single exchange infrastructure.
We need standards, but rather than unending quests for enlightened permanence, we should focus on temporary standards, to be replaced by other temporary standards, when circumstances or needs change.
A narrow range required to demonstrate benefits from temporary standards is a plus as well. A standard enabling data integration between departments at a hospital, one department at a time, will show benefits (if there are any to be had), far sooner than a standard that requires universal adoption prior to any benefits appearing.
The Topic Maps Data Model (TMDM) is an example of a narrow range standard.
While the TMDM can be extended, in its original form, subjects are reliably identified using IRI’s (along with data about those subjects). All that is required is that one or more parties use IRIs as identifiers, and not even the same IRIs.
The TMDM framework enables one or more parties to use their own IRIs and data practices, without prior agreement, and still have reliable merging of their data.
I think it is the without prior agreement part that distinguishes the Topic Maps Data Model from other data interchange standards.
We can skip all the tiresome discussion about who has the better name/terminology/taxonomy/ontology for subject X and get down to data interchange.
Data interchange is interesting, but what we find following data interchange is even more so.
More on that to follow, sooner rather than later, in the next six thousand posts.
(See the Donations link. Your encouragement would be greatly appreciated.)