STAR: ultrafast universal RNA-seq aligner
by Stephen Turner.
From the post:
There’s a new kid on the block for RNA-seq alignment.
Dobin, Alexander, et al. “STAR: ultrafast universal RNA-seq aligner.” Bioinformatics (2012).
Aligning RNA-seq data is challenging because reads can overlap splice junctions. Many other RNA-seq alignment algorithms (e.g. Tophat) are built on top of DNA sequence aligners. STAR (Spliced Transcripts Alignment to a Reference) is a standalone RNA-seq alignment algorithm that uses uncompressed suffix arrays and a mapping algorithm similar to those used in large-scale genome alignment tools to align RNA-seq reads to a genomic reference. STAR is over 50 times faster than any other previously published RNA-seq aligner, and outperforms other aligners in both sensitivity and specificity using both simulated and real (replicated) RNA-seq data.
I had a brief exchange of comments with Lars Marius Garshol on string matching recently. Another example of a string processing approach you may adapt to different circumstances.