Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 24, 2013

From data to analysis:… [Data Integration For a Purpose]

From data to analysis: linking NWChem and Avogadro with the syntax and semantics of Chemical Markup Language by Wibe A de Jong, Andrew M Walker and Marcus D Hanwell. (Journal of Cheminformatics 2013, 5:25 doi:10.1186/1758-2946-5-25)

Abstract:

Background

Multidisciplinary integrated research requires the ability to couple the diverse sets of data obtained from a range of complex experiments and computer simulations. Integrating data requires semantically rich information. In this paper an end-to-end use of semantically rich data in computational chemistry is demonstrated utilizing the Chemical Markup Language (CML) framework. Semantically rich data is generated by the NWChem computational chemistry software with the FoX library and utilized by the Avogadro molecular editor for analysis and visualization.

Results

The NWChem computational chemistry software has been modified and coupled to the FoX library to write CML compliant XML data files. The FoX library was expanded to represent the lexical input files and molecular orbitals used by the computational chemistry software. Draft dictionary entries and a format for molecular orbitals within CML CompChem were developed. The Avogadro application was extended to read in CML data, and display molecular geometry and electronic structure in the GUI allowing for an end-to-end solution where Avogadro can create input structures, generate input files, NWChem can run the calculation and Avogadro can then read in and analyse the CML output produced. The developments outlined in this paper will be made available in future releases of NWChem, FoX, and Avogadro.

Conclusions

The production of CML compliant XML files for computational chemistry software such as NWChem can be accomplished relatively easily using the FoX library. The CML data can be read in by a newly developed reader in Avogadro and analysed or visualized in various ways. A community-based effort is needed to further develop the CML CompChem convention and dictionary. This will enable the long-term goal of allowing a researcher to run simple “Google-style” searches of chemistry and physics and have the results of computational calculations returned in a comprehensible form alongside articles from the published literature.
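To make the reading half of that workflow concrete, here is a minimal sketch of pulling atom positions out of a CML fragment with Python's standard library. It is not the Avogadro reader and not actual NWChem output: the sample molecule, its coordinates, and the function name read_atoms are my own assumptions for illustration. Only the CML schema namespace and the atom attributes (elementType, x3, y3, z3) come from CML itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the CML schema, in ElementTree's "{uri}tag" notation.
CML = "{http://www.xml-cml.org/schema}"

# Hand-written sample, NOT real NWChem output; coordinates are illustrative.
SAMPLE = """<molecule xmlns="http://www.xml-cml.org/schema" id="water">
  <atomArray>
    <atom id="a1" elementType="O" x3="0.000" y3="0.000" z3="0.117"/>
    <atom id="a2" elementType="H" x3="0.000" y3="0.757" z3="-0.469"/>
    <atom id="a3" elementType="H" x3="0.000" y3="-0.757" z3="-0.469"/>
  </atomArray>
</molecule>"""


def read_atoms(cml_text):
    """Return a list of (element, x, y, z) tuples for every CML atom found."""
    root = ET.fromstring(cml_text)
    atoms = []
    for atom in root.iter(CML + "atom"):
        atoms.append(
            (
                atom.get("elementType"),
                float(atom.get("x3")),
                float(atom.get("y3")),
                float(atom.get("z3")),
            )
        )
    return atoms


if __name__ == "__main__":
    for element, x, y, z in read_atoms(SAMPLE):
        print(element, x, y, z)
```

The point of the sketch is only that the element names and attributes carry their meaning with them, so a consumer does not need to know which program wrote the file.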

Aside from its obvious importance for cheminformatics, I think there is another lesson in this article.

Integration of data required “…semantically rich information…,” but just as importantly, integration was not a goal in and of itself.

Integration was only part of a workflow that had other goals.

No doubt some topic maps are useful as end products of integrated data, but what of cases where integration is part of a workflow?

Think of the non-reusable data integration mappings that are offered by many enterprise integration packages.

August 12, 2012

The Semantics of Chemical Markup Language (CML) for Computational Chemistry : CompChem

Filed under: Chemical Markup Language (CML),Cheminformatics,CompChem — Patrick Durusau @ 12:52 pm

The Semantics of Chemical Markup Language (CML) for Computational Chemistry : CompChem by Weerapong Phadungsukanan, Markus Kraft, Joe A Townsend and Peter Murray-Rust (Journal of Cheminformatics 2012, 4:15 doi:10.1186/1758-2946-4-15)

Abstract (provisional):

This paper introduces a subdomain chemistry format for storing computational chemistry data called CompChem. It has been developed based on the design, concepts and methodologies of Chemical Markup Language (CML) by adding computational chemistry semantics on top of the CML Schema. The format allows a wide range of ab initio quantum chemistry calculations of individual molecules to be stored. These calculations include, for example, single point energy calculation, molecular geometry optimization, and vibrational frequency analysis. The paper also describes the supporting infrastructure, such as processing software, dictionaries, validation tools and database repository. In addition, some of the challenges and difficulties in developing common computational chemistry dictionaries are being discussed. The uses of CompChem are illustrated on two practical applications.

Important contribution if you are working with computational chemistry semantics.

It is also important for demonstrating the value of dictionaries and of not trying to be all-inclusive.

Integrate the data you have at hand and make allowance for the yet to be known.
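A rough sketch of what that allowance might look like in practice: map the dictionary terms you recognize and keep the rest, untouched, for a later pass. The dictRef attribute is the CML mechanism for pointing at a dictionary entry, but everything else here is an assumption of mine. The term names, the x: prefix, the sample values, and the KNOWN_TERMS table are hypothetical and have not been checked against the CompChem dictionary.

```python
import xml.etree.ElementTree as ET

CML = "{http://www.xml-cml.org/schema}"

# Hypothetical local table of terms we already know how to interpret.
KNOWN_TERMS = {"compchem:totalEnergy": "total energy"}

# Hand-written sample; term names and values are illustrative only.
SAMPLE = """<module xmlns="http://www.xml-cml.org/schema"
        xmlns:compchem="http://www.xml-cml.org/dictionary/compchem/"
        xmlns:x="http://example.org/dict/">
  <property dictRef="compchem:totalEnergy"><scalar>-76.02</scalar></property>
  <property dictRef="x:somethingNew"><scalar>1.5</scalar></property>
</module>"""


def sort_properties(cml_text, known_terms):
    """Split scalar properties into recognized and not-yet-known terms."""
    recognized, unknown = {}, {}
    root = ET.fromstring(cml_text)
    for prop in root.iter(CML + "property"):
        term = prop.get("dictRef")
        scalar = prop.find(CML + "scalar")
        if term is None or scalar is None:
            continue  # nothing to interpret on this property
        value = float(scalar.text)
        if term in known_terms:
            recognized[known_terms[term]] = value
        else:
            # Keep the unknown term as-is rather than discarding it.
            unknown[term] = value
    return recognized, unknown


if __name__ == "__main__":
    print(sort_properties(SAMPLE, KNOWN_TERMS))
```

Nothing is lost by not recognizing a term today; the unknown entries are still there when a later dictionary, or a later topic map, knows what to do with them.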

Besides, there is always the next topic map that may consume the first with new merging rules.

CompChem Convention http://www.xml-cml.org/convention/compchem

CompChem dictionary http://www.xml-cml.org/dictionary/compchem/

CompChem validation stylesheet https://bitbucket.org/wwmm/cml-specs

CMLValidator http://bitbucket.org/cml/cmllite-validator-code

Chemical Markup Language (CML) http://www.xml-cml.org
