Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

October 19, 2011

Knime4Bio:…Next Generation Sequencing data with KNIME

Filed under: Bioinformatics,Biomedical,Data Mining — Patrick Durusau @ 3:15 pm

Knime4Bio:…Next Generation Sequencing data with KNIME by Pierre Lindenbaum, Solena Le Scouarnec, Vincent Portero and Richard Redon.

Abstract:

Analysing large amounts of data generated by next-generation sequencing (NGS) technologies is difficult for researchers or clinicians without computational skills. They are often compelled to delegate this task to computer biologists working with command line utilities. The availability of easy-to-use tools will become essential with the generalisation of NGS in research and diagnosis. It will enable investigators to handle much more of the analysis. Here, we describe Knime4Bio, a set of custom nodes for the KNIME (The Konstanz Information Miner) interactive graphical workbench, for the interpretation of large biological datasets. We demonstrate that this tool can be utilised to quickly retrieve previously published scientific findings.

Code: http://code.google.com/p/knime4bio/

While I applaud the trend towards “easy-to-use” software, I do worry about results that are returned by automated analysis, which of course “must be true.”

I am mindful of the four-year-old whose name was on a terrorist watch list and so delayed the departure of a plane. The ground personnel lacked the moral courage or judgement to act on what was clearly a case of mistaken identity.

As “big data” grows ever larger, I wonder whether “easy” interfaces will really be facile interfaces, ones that we lack the courage (skill?) to question.

2 Comments

  1. The problem is the same if the clinician lets a bioinformatician manage his data.

    I’ve done “stuff” with your data: http://biocomicals.blogspot.com/2011/05/thats-what-bioinformaticians-do.html

    Why should he believe I’ve processed the data correctly?

    There is no magic here: we are just combining lists, grouping by columns, etc… Here we cannot just process the file as if we were specialists in a disease. Believe me, the clinicians want to get their hands dirty. They KNOW the candidate genes, they KNOW their disease, but they just don’t know how to process this amount of data.

    Errors? Bugs? Yes, there might be bugs. But even the major software packages (BLAST, SAMTOOLS, GATK) have bugs that will be or have been corrected.

    Regards,

    Comment by yokofakun — October 21, 2011 @ 9:24 am

  2. Not just a question of bugs. Sure, clinicians know their names for genes, but as the experience with HUGO has shown, there will be several names before the canonical one, and the non-canonical names will persist in the earlier publications. (How could they not?)

    True, later mappings, perhaps domain-specific mappings, can overcome those issues, but only if we acknowledge them to exist and to be worthy of repair. Yes?

    Comment by Patrick Durusau — October 24, 2011 @ 6:38 pm
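
The exchange above turns on two things: "grouping by columns" and gene names that have several aliases before a canonical one. A minimal sketch of how the two interact follows. The alias table, the record layout, and the function names are all assumptions made for illustration; they are not taken from Knime4Bio, and the table is a tiny hand-written sample rather than an authoritative nomenclature extract. The point it tries to show is that names the mapping cannot resolve are reported separately instead of being silently grouped or dropped.

```python
from collections import defaultdict

# Hypothetical, hand-written alias table (illustration only, not a real
# nomenclature download). Keys are upper-cased aliases.
ALIASES = {
    "P53": "TP53",
    "LFS1": "TP53",
    "HER2": "ERBB2",
    "NEU": "ERBB2",
}
CANONICAL = set(ALIASES.values())


def resolve(name):
    """Return the canonical symbol for a gene name, or None if unknown."""
    key = name.strip().upper()
    if key in CANONICAL:
        return key
    return ALIASES.get(key)


def group_by_gene(records):
    """Group (gene_name, value) pairs by canonical symbol.

    Returns (grouped, unresolved): names with no known mapping are kept
    in `unresolved` so that someone can review them.
    """
    grouped = defaultdict(list)
    unresolved = []
    for name, value in records:
        symbol = resolve(name)
        if symbol is None:
            unresolved.append((name, value))
        else:
            grouped[symbol].append(value)
    return dict(grouped), unresolved


if __name__ == "__main__":
    # Made-up records: (gene name as written by the source, variant label).
    records = [
        ("p53", "variant_A"),
        ("TP53", "variant_B"),
        ("HER2", "variant_C"),
        ("XYZ1", "variant_D"),
    ]
    grouped, unresolved = group_by_gene(records)
    print(grouped)
    print(unresolved)
```

Grouping on the raw name column would have split "p53" and "TP53" into two groups; whether the unresolved list is ever looked at is the human question the post raises.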
