Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

April 1, 2014

Molpher:…

Filed under: Cheminformatics,Modeling,Science — Patrick Durusau @ 6:39 pm

Molpher: a software framework for systematic chemical space exploration by David Hoksza, Petr Škoda, Milan Voršilák and Daniel Svozil.

Abstract:

Background

Chemical space is virtual space occupied by all chemically meaningful organic compounds. It is an important concept in contemporary chemoinformatics research, and its systematic exploration is vital to the discovery of either novel drugs or new tools for chemical biology.

Results

In this paper, we describe Molpher, an open-source framework for the systematic exploration of chemical space. Through a process we term ‘molecular morphing’, Molpher produces a path of structurally-related compounds. This path is generated by the iterative application of so-called ‘morphing operators’ that represent simple structural changes, such as the addition or removal of an atom or a bond. Molpher incorporates an optimized parallel exploration algorithm, compound logging and a two-dimensional visualization of the exploration process. Its feature set can be easily extended by implementing additional morphing operators, chemical fingerprints, similarity measures and visualization methods. Molpher not only offers an intuitive graphical user interface, but also can be run in batch mode. This enables users to easily incorporate molecular morphing into their existing drug discovery pipelines.

Conclusions

Molpher is an open-source software framework for the design of virtual chemical libraries focused on a particular mechanistic class of compounds. These libraries, represented by a morphing path and its surroundings, provide valuable starting data for future in silico and in vitro experiments. Molpher is highly extensible and can be easily incorporated into any existing computational drug design pipeline.
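To make the idea of a morphing path concrete, here is a toy sketch. It is not Molpher's algorithm: the abstract only says that a path is built by iteratively applying simple operators such as adding or removing an atom. The assumptions are mine: "molecules" are plain strings of atom symbols, the only operators are insert-an-atom and delete-an-atom, and the search is an ordinary breadth-first search rather than Molpher's optimized parallel exploration. The names `ATOMS`, `morph_neighbors` and `morphing_path` are hypothetical.

```python
from collections import deque

ATOMS = "CNO"  # toy alphabet of atom symbols (an assumption, not from the paper)


def morph_neighbors(chain):
    """Return every chain reachable by one toy morphing operator."""
    out = set()
    # Operator 1: add an atom at any position.
    for i in range(len(chain) + 1):
        for atom in ATOMS:
            out.add(chain[:i] + atom + chain[i:])
    # Operator 2: remove an atom (never produce an empty chain).
    if len(chain) > 1:
        for i in range(len(chain)):
            out.add(chain[:i] + chain[i + 1:])
    return out


def morphing_path(start, target, max_len=8):
    """Breadth-first search for a shortest operator path from start to target.

    max_len bounds the chain length so the search space stays finite.
    Returns the list of intermediate chains, or None if no path is found.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for nxt in morph_neighbors(current):
            if len(nxt) <= max_len and nxt not in parent:
                parent[nxt] = current
                queue.append(nxt)
    return None
```

For example, `morphing_path("CCO", "CNCO")` returns `['CCO', 'CNCO']`, a single atom insertion. Every intermediate chain on a longer path is a candidate "compound" in the sense the abstract describes, which is why the path and its surroundings can serve as a focused virtual library.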

Beyond its obvious importance for cheminformatics, this paper offers another example of “semantic impedance”:

While virtual chemical space is very large, only a small fraction of it has been reported in actual chemical databases so far. For example, PubChem contains data for 49.1 million chemical compounds [17] and Chemical Abstracts consists of over 84.3 million organic and inorganic substances [18] (numbers as of 12. 3. 2014). Thus, the navigation of chemical space is a very important area of chemoinformatics research [19,20]. Because chemical space is usually defined using various sets of descriptors [21], a major problem is the lack of invariance of chemical space [22,23]. Depending on the descriptors and distance measures used [24], different chemical spaces show different compound distributions. Unfortunately, no generally applicable representation of invariant chemical space has yet been reported [25].
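The lack of invariance is easy to see with a small sketch. The three compounds and all of their descriptor values below are made up for illustration, as are the names `compounds`, `tanimoto_distance` and `nearest`. The only point is that the same compound can have a different nearest neighbor depending on which descriptor set and distance measure you choose.

```python
import math

# Hypothetical compounds with invented descriptor values.
# "fingerprint" is a set of substructure feature ids;
# "props" is a pair of numeric properties (say, weight and logP).
compounds = {
    "A": {"fingerprint": {1, 2, 3, 4}, "props": (180.0, 1.2)},
    "B": {"fingerprint": {1, 2, 3, 5}, "props": (320.0, 4.0)},
    "C": {"fingerprint": {7, 8, 9}, "props": (182.0, 1.3)},
}


def tanimoto_distance(a, b):
    """1 minus the Tanimoto (Jaccard) similarity of two feature sets."""
    return 1.0 - len(a & b) / len(a | b)


def nearest(query, key, dist):
    """Name of the compound closest to `query` under one descriptor and distance."""
    others = [name for name in compounds if name != query]
    return min(others, key=lambda n: dist(compounds[query][key], compounds[n][key]))


by_fingerprint = nearest("A", "fingerprint", tanimoto_distance)
by_properties = nearest("A", "props", math.dist)
```

With these invented values, A is closest to B by fingerprint (they share three of five features) but closest to C by numeric properties. Two "chemical spaces" built from the same three compounds disagree about who the neighbors are.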

OK, so how much further is there to go with these various descriptors?

The article describes estimates of the size of chemical space this way:

Chemical space is populated by all chemically meaningful and stable organic compounds [1-3]. It is an important concept in contemporary chemoinformatics research [4,5], and its exploration leads to the discovery of either novel drugs [2] or new tools for chemical biology [6,7]. It is agreed that chemical space is huge, but no accurate approximation of its size exists. Even if only drug-like molecules are taken into account, size estimates vary [8] between 10²³[9] and 10¹⁰⁰[10] compounds. However, smaller numbers have also been reported. For example, based on the growth of a number of organic compounds in chemical databases, Drew et al.[11] deduced the size of chemical space to be 3.4 × 10⁹. By assigning all possible combinations of atomic species to the same three-dimensional geometry, Ogata et al. [12] estimated the size of chemical space to be between 10⁸ and 10¹⁹. Also, by analyzing known organic substituents, the size of accessible chemical space was assessed as between 10²⁰ and 10²⁴[9].

Such estimates have been put into context by Reymond et al., who produced all molecules that can exist up to a certain number of heavy atoms in their Chemical Universe Databases: GDB-11 [13,14] (2.64 × 10⁷ molecules with up to 11 heavy atoms); GDB-13 [15] (9.7 × 10⁸ molecules with up to 13 heavy atoms); and GDB-17 [16] (1.7 × 10¹¹ compounds with up to 17 heavy atoms). The GDB-17 database was then used to approximate the number of possible drug-like molecules as 10³³[8].

To give you an easy basis for comparison: possible drug-like molecules at 10³³, versus number of stars in galaxies in the observable universe at 10²⁴.

That’s an impressive number of possible drug-like molecules: 10⁹ times more than the estimated number of stars in the observable universe.

I can’t imagine that having diverse descriptors is assisting in the search to complete the chemical space. And from the description, it doesn’t sound like semantic convergence is on the horizon.

Mapping between the existing systems would be a major undertaking, but the longer exploration goes on without such a mapping, the worse the problem is going to get.

Functional programming books overview

Filed under: Books,Functional Programming — Patrick Durusau @ 4:20 pm

Functional programming books overview by Alex Ott.

From the webpage:

The first variant of this article was published in the first issue of Russian magazine "Practice of functional programming", but I decided to continue to maintain it, as more books were released (Russian version of this article also includes description of books published in Russian). You can leave comments and suggestions in the comment widget on this page, or send them to me via e-mail (Updates to this page usually happening not so often — every 2-3 months).

Descriptions for the books are relatively short — just to give an overview of the book’s topics, otherwise this article will become too big. For some of books there are more detailed reviews published in my blog. You can also follow my reviews on Goodreads.

If you will order some of these books, please (if possible), use links from this page — this allows me to buy new books and add them to review.

If your bookshelf isn’t already bulging with functional programming books or if you want to test its limits, this is a very good site to visit and to recommend others visit.

The listing here is more than enough for holidays and birthdays into the foreseeable future.

…Browser History as a Favicon Tapestry (NSFW?)

Filed under: Graphics,Visualization — Patrick Durusau @ 2:47 pm

Browser Plugin Maps Your Browser History as a Favicon Tapestry by Andrew Vande Moere.

From the post:

Iconic History [shan-huang.com] by Carnegie Mellon University interaction design student Shan Huang is as simple as it is beautifully revealing.
…

See Andrew’s post for more details.

Depending on the websites you visit and their favicons, this may or may not be safe for work.

As a visualization it makes me curious: what if you could track the “focus” of applications in use, so that a similar display could be generated for the apps you use in a day, a week, etc.?

Would bring new meaning to the question: What have you been working on today? 😉
