This is going to require free registration at Genomeweb but I think it will be worth it. (Genomeweb also offers $premium content but I haven’t tried any of it, yet.)
Nice overview of Hadoop in genome research.
Annoying in that it lists the following projects, sans hyperlinks. I have supplied the project listing with hyperlinks, just in case you are interested in Hadoop and genome research.
Crossbow: Whole genome resequencing analysis; SNP genotyping from short reads
Contrail: De novo assembly from short sequencing reads
Myrna: Ultrafast short read alignment and differential gene expression from large RNA-seq datasets
PeakRanger: Cloud-enabled peak caller for ChIP-seq data
Quake: Quality-aware detection and sequencing error correction tool
BlastReduce: High-performance short read mapping (superseded by CloudBurst)
CloudBLAST*: Hadoop implementation of NCBI’s Blast
MrsRF: Algorithm for analyzing large evolutionary trees
*CloudBLAST was the only project without a webpage or similar source of information. The link is to a paper, perhaps the original paper on the technique. Searching for any of these techniques reveals a wealth of material on using Hadoop in bioinformatics.
Topic maps can capture your path through data (think of bread crumbs or string). So when today you think, “I should have gone left, rather than right”, you can retrace your steps and take another path. Try that with a Google search. If you are lucky, you may get the same ads. 😉
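To make the bread crumb idea concrete, here is a minimal sketch in Python. It is not a topic map and uses no topic map library; the `Trail` class and its `go` and `retrace` methods are names I made up for illustration. All it shows is a recorded path that lets you back up one step and go the other way, while keeping a note of the branch you gave up on.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Trail:
    """Hypothetical bread crumb trail: the choices made so far, in order."""
    steps: List[str] = field(default_factory=list)
    abandoned: List[List[str]] = field(default_factory=list)

    def go(self, choice: str) -> None:
        """Drop a crumb for the choice just made."""
        self.steps.append(choice)

    def retrace(self, count: int = 1) -> List[str]:
        """Back up `count` steps, keeping the abandoned path on record."""
        if not 0 < count <= len(self.steps):
            raise ValueError("cannot retrace more steps than were taken")
        dropped = self.steps[-count:]
        # Remember the full path we are giving up on before trimming it.
        self.abandoned.append(list(self.steps))
        del self.steps[-count:]
        return dropped


trail = Trail()
trail.go("Hadoop in genome research")
trail.go("right: CloudBurst")
trail.retrace()              # "I should have gone left, rather than right"
trail.go("left: Crossbow")
print(trail.steps)
print(trail.abandoned)
```

The abandoned paths are kept rather than thrown away, since knowing where you have already been is half the point of the crumbs.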
You can also share your bread crumbs or string with others, but that is a story for another time.