Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 22, 2012

Developing a biocuration workflow for AgBase… [Authoring Interfaces]

Filed under: Bioinformatics,Biomedical,Curation,Genomics,Text Mining — Patrick Durusau @ 9:50 am

Developing a biocuration workflow for AgBase, a non-model organism database by Lakshmi Pillai, Philippe Chouvarine, Catalina O. Tudor, Carl J. Schmidt, K. Vijay-Shanker and Fiona M. McCarthy.

Abstract:

AgBase provides annotation for agricultural gene products using the Gene Ontology (GO) and Plant Ontology, as appropriate. Unlike model organism species, agricultural species have a body of literature that does not just focus on gene function; to improve efficiency, we use text mining to identify literature for curation. The first component of our annotation interface is the gene prioritization interface that ranks gene products for annotation. Biocurators select the top-ranked gene and mark annotation for these genes as ‘in progress’ or ‘completed’; links enable biocurators to move directly to our biocuration interface (BI). Our BI includes all current GO annotation for gene products and is the main interface to add/modify AgBase curation data. The BI also displays Extracting Genic Information from Text (eGIFT) results for each gene product. eGIFT is a web-based, text-mining tool that associates ranked, informative terms (iTerms) and the articles and sentences containing them, with genes. Moreover, iTerms are linked to GO terms, where they match either a GO term name or a synonym. This enables AgBase biocurators to rapidly identify literature for further curation based on possible GO terms. Because most agricultural species do not have standardized literature, eGIFT searches all gene names and synonyms to associate articles with genes. As many of the gene names can be ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to this gene, and filtering is applied to remove abstracts that mention a gene in passing. The BI is linked to our Journal Database (JDB) where corresponding journal citations are stored. Just as importantly, biocurators also add to the JDB citations that have no GO annotation. The AgBase BI also supports bulk annotation upload to facilitate our Inferred from electronic annotation of agricultural gene products. All annotations must pass standard GO Consortium quality checking before release in AgBase.

Database URL: http://www.agbase.msstate.edu/

Another approach to biocuration. I will be posting on eGIFT separately, but do note that this is a domain-specific tool.
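
The abstract says iTerms are linked to GO terms "where they match either a GO term name or a synonym." A minimal sketch of that kind of matching, assuming a simple case-insensitive exact match, might look like the following. This is not eGIFT's actual implementation; `GOTerm`, `build_index` and `link_iterms` are hypothetical names, and the sample term is typed in by hand for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GOTerm:
    """Hypothetical, minimal stand-in for a GO term record."""
    go_id: str
    name: str
    synonyms: tuple = ()


def normalize(text):
    """Lower-case and collapse whitespace so labels compare reliably."""
    return " ".join(text.lower().split())


def build_index(terms):
    """Map every normalized name and synonym to the GO ids that carry it."""
    index = {}
    for term in terms:
        for label in (term.name, *term.synonyms):
            ids = index.setdefault(normalize(label), [])
            if term.go_id not in ids:
                ids.append(term.go_id)
    return index


def link_iterms(iterms, index):
    """Return, for each iTerm, the GO ids whose name or synonym it matches."""
    return {iterm: index.get(normalize(iterm), []) for iterm in iterms}


# Illustrative data only; check ids and synonyms against the ontology itself.
sample_terms = [GOTerm("GO:0006915", "apoptotic process", ("apoptosis",))]
sample_index = build_index(sample_terms)
links = link_iterms(["Apoptosis", "feather"], sample_index)
```

An iTerm with no match simply maps to an empty list, which leaves the decision about further curation with the biocurator.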

The authors did not set out to create a universal curation tool, but one suited to their specific data and requirements.

I think there is an important lesson here for semantic authoring interfaces. Word processors offer very generic interfaces but consequently little in the way of structure. Authoring annotated information requires more structure and that requires domain specifics.

Now there is an idea: create topic map authoring interfaces on top of a common skeleton, instead of hard-coding interfaces around how users “should” use the tool.
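
To make the "common skeleton" idea a little more concrete, here is a rough sketch in which the skeleton is generic and the domain specifics are supplied as data. Every name in it (`FieldSpec`, `AuthoringSkeleton`, `looks_like_go_id`, the field names) is an assumption made up for illustration; it does not describe AgBase's interface or any existing topic map tool.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class FieldSpec:
    """One domain-specific field: its name, whether it is required,
    and an optional check on its value."""
    name: str
    required: bool = True
    validator: Optional[Callable[[str], bool]] = None


class AuthoringSkeleton:
    """Generic part: knows nothing about any domain, only how to
    check an entry against whatever fields it was given."""

    def __init__(self, domain, fields):
        self.domain = domain
        self.fields = list(fields)

    def validate(self, entry):
        """Return a list of problems; an empty list means the entry is acceptable."""
        problems = []
        for spec in self.fields:
            value = entry.get(spec.name, "").strip()
            if not value:
                if spec.required:
                    problems.append(f"{spec.name}: missing")
                continue
            if spec.validator and not spec.validator(value):
                problems.append(f"{spec.name}: invalid value {value!r}")
        return problems


def looks_like_go_id(value):
    """Shape check only: 'GO:' followed by seven digits."""
    return len(value) == 10 and value.startswith("GO:") and value[3:].isdigit()


# Domain specifics live here, as data, not in the skeleton.
biocuration = AuthoringSkeleton("biocuration", [
    FieldSpec("gene_product"),
    FieldSpec("go_id", validator=looks_like_go_id),
    FieldSpec("citation", required=False),
])

ok = biocuration.validate({"gene_product": "example gene", "go_id": "GO:0006915"})
bad = biocuration.validate({"go_id": "apoptosis"})
```

A different domain would reuse `AuthoringSkeleton` unchanged and swap in its own list of `FieldSpec` entries, which is the point: the structure comes from the domain, not from the tool author's view of how it should be used.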
