Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 1, 2011

Parallel approaches in next-generation sequencing analysis pipelines

Filed under: Bioinformatics,Parallel Programming,Parallelism — Patrick Durusau @ 3:34 pm


From the post:

My last post described a distributed exome analysis pipeline implemented on the CloudBioLinux and CloudMan frameworks. This was a practical introduction to running the pipeline on Amazon resources. Here I’ll describe how the pipeline runs in parallel, specifically diagramming the workflow to identify points of parallelization during lane and sample processing.

Incredible innovation in throughput makes parallel processing critical for next-generation sequencing analysis. When a single Hi-Seq run can produce 192 samples (2 flowcells x 8 lanes per flowcell x 12 barcodes per lane), the analysis steps quickly become limited by the number of processing cores available.

The heterogeneity of architectures utilized by researchers is a major challenge in building re-usable systems. A pipeline needs to support powerful multi-core servers, clusters and virtual cloud-based machines. The approach we took is to scale at the level of individual samples, lanes and pipelines, exploiting the embarrassingly parallel nature of the computation. An AMQP messaging queue allows for communication between processes, independent of the system architecture. This flexible approach allows the pipeline to serve as a general framework that can be easily adjusted or expanded to incorporate new algorithms and analysis methods.
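To make the queue-based idea concrete, here is a minimal sketch of my own (not code from the quoted post or its pipeline). It enumerates the 192 samples of the run described above and hands each one to a pool of workers as a message on a queue. Everything in it is an assumption for illustration: Python's in-process `queue.Queue` and threads stand in for a real AMQP broker and separate processes, and `process_sample`, `worker` and `run_pipeline` are hypothetical names.

```python
import queue
import threading
from itertools import product

# Run layout quoted above: 2 flowcells x 8 lanes x 12 barcodes per lane.
FLOWCELLS, LANES, BARCODES = 2, 8, 12


def process_sample(sample):
    """Hypothetical placeholder for the real per-sample analysis."""
    flowcell, lane, barcode = sample
    return f"fc{flowcell}-lane{lane}-bc{barcode}"


def worker(tasks, results):
    """Consume sample messages until a None sentinel arrives.

    Samples share no state, so workers never coordinate with each other.
    """
    while True:
        sample = tasks.get()
        if sample is None:
            break
        results.put(process_sample(sample))


def run_pipeline(num_workers=4):
    # In-process queues stand in for an AMQP broker (assumption).
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()

    # One message per (flowcell, lane, barcode) combination.
    samples = list(product(range(1, FLOWCELLS + 1),
                           range(1, LANES + 1),
                           range(1, BARCODES + 1)))
    for sample in samples:
        tasks.put(sample)
    for _ in threads:
        tasks.put(None)  # one shutdown sentinel per worker
    for t in threads:
        t.join()

    # All workers have finished, so every result is already queued.
    return sorted(results.get() for _ in range(len(samples)))


processed = run_pipeline(num_workers=4)
print(len(processed))  # 192
```

The producer only knows about the queue, not about how many workers exist or where they run. That is why swapping the in-process queue for a networked broker would, in principle, let the same structure span a multi-core server, a cluster or cloud machines.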

The message-passing parallelism sounds a lot like Storm, doesn’t it? Will message passing be what frees us from the constraints of architecture? I wonder what sort of performance “hit” we will take when not working really close to the metal. But then, the “metal” may become the basis for such message-passing systems. Not quite yet, but perhaps not so far away either.
