Integrating the Biological Universe

Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 4, 2013

Filed under: Bioinformatics,Integration — Patrick Durusau @ 7:37 pm

Integrating the Biological Universe by Yasset Perez-Riverol & Roberto Vera.

From the post:

Integrating biological data is perhaps one of the most daunting tasks any bioinformatician has to face. From a cursory look, it is easy to see two major obstacles standing in the way: (i) the sheer amount of existing data, and (ii) the staggering variety of resources and data types used by the different groups working in the field (reviewed at [1]). In fact, the topic of data integration has a long-standing history in computational biology and bioinformatics. A comprehensive picture of this problem can be found in recent papers [2], but this short comment will serve to illustrate some of the hurdles of data integration and as a not-so-shameless plug for our contribution towards a solution.

“Reflecting the data-driven nature of modern biology, databases have grown considerably both in size and number during the last decade. The exact number of databases is difficult to ascertain. While not exhaustive, the 2011 Nucleic Acids Research (NAR) online database collection lists 1330 published biodatabases (1), and estimates derived from the ELIXIR database provider survey suggest an approximate annual growth rate of ∼12% (2). Globally, the numbers are likely to be significantly higher than those mentioned in the online collection, not least because many are unpublished, or not published in the NAR database issue.” [1]

Which led me to:

JBioWH: an open-source Java framework for bioinformatics data integration:

Abstract:

The Java BioWareHouse (JBioWH) project is an open-source platform-independent programming framework that allows a user to build his/her own integrated database from the most popular data sources. JBioWH can be used for intensive querying of multiple data sources and the creation of streamlined task-specific data sets on local PCs. JBioWH is based on a MySQL relational database scheme and includes JAVA API parser functions for retrieving data from 20 public databases (e.g. NCBI, KEGG, etc.). It also includes a client desktop application for (non-programmer) users to query data. In addition, JBioWH can be tailored for use in specific circumstances, including the handling of massive queries for high-throughput analyses or CPU intensive calculations. The framework is provided with complete documentation and application examples and it can be downloaded from the Project Web site at http://code.google.com/p/jbiowh. A MySQL server is available for demonstration purposes at hydrax.icgeb.trieste.it:3307.

Database URL: http://code.google.com/p/jbiowh
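
To make the "build your own integrated database" idea concrete, here is a minimal sketch of the core step, assuming two sources that share an identifier. It is not JBioWH code and does not use the JBioWH API: the class `IntegrationSketch`, the `MergedRecord` type, the `integrate` method and all of the sample data are hypothetical, and plain in-memory maps stand in for the MySQL tables that the framework actually populates.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: none of these names come from JBioWH.
class IntegrationSketch {

    // One merged row per shared identifier; a null field means that
    // source had no entry for the identifier.
    static final class MergedRecord {
        final String id;
        final String symbol;   // from hypothetical source A
        final String pathway;  // from hypothetical source B

        MergedRecord(String id, String symbol, String pathway) {
            this.id = id;
            this.symbol = symbol;
            this.pathway = pathway;
        }

        @Override
        public String toString() {
            return id + "\t" + symbol + "\t" + pathway;
        }
    }

    // Full outer join of the two sources on their shared identifier.
    static Map<String, MergedRecord> integrate(Map<String, String> symbolsById,
                                               Map<String, String> pathwaysById) {
        Map<String, MergedRecord> merged = new TreeMap<>();
        for (Map.Entry<String, String> e : symbolsById.entrySet()) {
            String id = e.getKey();
            merged.put(id, new MergedRecord(id, e.getValue(), pathwaysById.get(id)));
        }
        for (Map.Entry<String, String> e : pathwaysById.entrySet()) {
            // Keep identifiers that only the second source knows about.
            merged.putIfAbsent(e.getKey(),
                    new MergedRecord(e.getKey(), null, e.getValue()));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Made-up placeholder data, not real database content.
        Map<String, String> sourceA = Map.of("G1", "symbolA", "G2", "symbolB");
        Map<String, String> sourceB = Map.of("G2", "pathwayX", "G3", "pathwayY");
        for (MergedRecord r : integrate(sourceA, sourceB).values()) {
            System.out.println(r);
        }
    }
}
```

The outer join keeps identifiers that appear in only one source, which is usually what you want when the sources disagree about coverage. The hard part in practice, deciding that two identifiers from different resources name the same subject, is exactly what this sketch assumes away.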
