Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

September 10, 2012

Mapping solution to heterogeneous data sources

Filed under: Bioinformatics,Biomedical,Genome,Heterogeneous Data,Mapping — Patrick Durusau @ 2:21 pm

dbSNO: a database of cysteine S-nitrosylation by Tzong-Yi Lee, Yi-Ju Chen, Cheng-Tsung Lu, Wei-Chieh Ching, Yu-Chuan Teng, Hsien-Da Huang and Yu-Ju Chen. (Bioinformatics (2012) 28 (17): 2293-2295. doi: 10.1093/bioinformatics/bts436)

OK, the title doesn’t jump out and say “mapping solution here!” 😉

Reading a bit further, you discover that text mining is used to locate sequences and that data is then mapped to “UniProtKB protein entries.”

The data set provides access to:

  • UniProt ID
  • Organism
  • Position
  • PubMed Id
  • Sequence

My concern is what happens when X is mapped to a UniProtKB protein entry to:

  • The prior identifier for X (in the article or source), and
  • The mapping from X to the UniProtKB protein entry?

If both of those are captured, then prior literature can be annotated upon rendering to point to later aggregation of information on a subject.

If the prior identifier, place of usage, the mapping, etc., are not captured, then prior literature, when we encounter it, remains frozen in time.

Mapping solutions work, but repay the effort several times over if the prior identifier and its mapping to the “new” identifier are captured as part of the process.

Abstract

Summary: S-nitrosylation (SNO), a selective and reversible protein post-translational modification that involves the covalent attachment of nitric oxide (NO) to the sulfur atom of cysteine, critically regulates protein activity, localization and stability. Due to its importance in regulating protein functions and cell signaling, a mass spectrometry-based proteomics method rapidly evolved to increase the dataset of experimentally determined SNO sites. However, there is currently no database dedicated to the integration of all experimentally verified S-nitrosylation sites with their structural or functional information. Thus, the dbSNO database is created to integrate all available datasets and to provide their structural analysis. Up to April 15, 2012, the dbSNO has manually accumulated >3000 experimentally verified S-nitrosylated peptides from 219 research articles using a text mining approach. To solve the heterogeneity among the data collected from different sources, the sequence identity of these reported S-nitrosylated peptides are mapped to the UniProtKB protein entries. To delineate the structural correlation and consensus motif of these SNO sites, the dbSNO database also provides structural and functional analyses, including the motifs of substrate sites, solvent accessibility, protein secondary and tertiary structures, protein domains and gene ontology.

Availability: The dbSNO is now freely accessible via http://dbSNO.mbc.nctu.edu.tw. The database content is regularly updated upon collecting new data obtained from continuously surveying research articles.

Contacts: francis@saturn.yu.edu.tw or yujuchen@gate.sinica.edu.tw.

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress