MoleculaRnetworks: An integrated graph theoretic and data mining tool to explore solvent organization in molecular simulation by Barbara Logan Mooney, L. René Corrales and Aurora E. Clark.
Abstract:
This work discusses scripts for processing molecular simulations data written using the software package R: A Language and Environment for Statistical Computing. These scripts, named moleculaRnetworks, are intended for the geometric and solvent network analysis of aqueous solutes and can be extended to other H-bonded solvents. New algorithms, several of which are based on graph theory, that interrogate the solvent environment about a solute are presented and described. This includes a novel method for identifying the geometric shape adopted by the solvent in the immediate vicinity of the solute and an exploratory approach for describing H-bonding, both based on the PageRank algorithm of Google search fame. The moleculaRnetworks codes include a preprocessor, which distills simulation trajectories into physicochemical data arrays, and an interactive analysis script that enables statistical, trend, and correlation analysis, and other data mining. The goal of these scripts is to increase access to the wealth of structural and dynamical information that can be obtained from molecular simulations. © 2012 Wiley Periodicals, Inc.
Data mining, graph theory, PageRank: something for everyone in this article!
Not to mention an innovative use of PageRank with non-WWW data.
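To give a feel for what PageRank on non-WWW data can look like, here is a minimal sketch in Python. It is not the moleculaRnetworks code (which is written in R) and makes no claim to reproduce the authors' algorithms. The toy graph, the node names (`ion`, `w1` to `w5`), the `pagerank` helper, and the damping value of 0.85 are all assumptions made for illustration. Nodes stand for a solute and nearby water molecules, and edges stand for contacts such as H-bonds.

```python
def pagerank(adjacency, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank on a graph given as {node: [neighbors]}."""
    nodes = sorted(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}  # start from a uniform distribution
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not adjacency[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Each node passes its damped rank in equal shares to its neighbors.
        for v in nodes:
            for w in adjacency[v]:
                new_rank[w] += damping * rank[v] / len(adjacency[v])
        converged = sum(abs(new_rank[v] - rank[v]) for v in nodes) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Hypothetical toy network (not data from the paper): a solute "ion" with four
# first-shell waters, one contact between w1 and w2, and a second-shell water
# w5 attached to w4. Each undirected edge is listed in both directions.
hbond_graph = {
    "ion": ["w1", "w2", "w3", "w4"],
    "w1": ["ion", "w2"],
    "w2": ["ion", "w1"],
    "w3": ["ion"],
    "w4": ["ion", "w5"],
    "w5": ["w4"],
}

ranks = pagerank(hbond_graph)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{node}: {ranks[node]:.3f}")
```

In this made-up network the solute ends up with the highest score because it has the most contacts, and the scores of the surrounding waters differ according to how they are connected. That is the general idea of using PageRank as a fingerprint of network structure rather than as a ranking of web pages.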