Archive for the ‘Sequence Detection’ Category

…a phylogeny-aware graph algorithm

Monday, June 25th, 2012

Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm by Loytynoja, A., Vilella, A. J., Goldman, N.

From the post:

Motivation: Accurate alignment of large numbers of sequences is demanding and the computational burden is further increased by downstream analyses depending on these alignments. With the abundance of sequence data, an integrative approach of adding new sequences to existing alignments without their full re-computation and maintaining the relative matching of existing sequences is an attractive option. Another current challenge is the extension of reference alignments with fragmented sequences, as those coming from next-generation metagenomics, that contain relatively little information. Widely used methods for alignment extension are based on profile representation of reference sequences. These do not incorporate and use phylogenetic information and are affected by the composition of the reference alignment and the phylogenetic positions of query sequences.

Results: We have developed a method for phylogeny-aware alignment of partial-order sequence graphs and apply it here to the extension of alignments with new data. Our new method, called PAGAN, infers ancestral sequences for the reference alignment and adds new sequences in their phylogenetic context, either to predefined positions or by finding the best placement for sequences of unknown origin. Unlike profile-based alternatives, PAGAN considers the phylogenetic relatedness of the sequences and is not affected by inclusion of more diverged sequences in the reference set. Our analyses show that PAGAN outperforms alternative methods for alignment extension and provides superior accuracy for both DNA and protein data, the improvement being especially large for fragmented sequences. Moreover, PAGAN-generated alignments of noisy next-generation sequencing (NGS) sequences are accurate enough for the use of RNA-seq data in evolutionary analyses.

Availability: PAGAN is written in C++, licensed under the GPL and its source code is available at http://code.google.com/p/pagan-msa.

Contact: ari.loytynoja@helsinki.fi

Does your graph software support “…phylogeny-aware alignment of partial-order sequence graphs…?”

New Insights from Text Analytics

Monday, November 28th, 2011

New Insights from Text Analytics by Themos Kalafatis.

From the post:

“I have been trying repeatedly to solve my billing problem through customer care. I first talked with someone called Mrs Jane Doe. She said she should transfer my call to another representative from the sales department. Yet another rep from the sales department informed me that i should be talking with the Billing department instead. Unfortunately my bad experience of being transferred through various representatives was not over because the Billing department informed me that i should speak to the……”

Currently Text Analytics software will identify key elements of the above text but a very important piece of information goes unnoticed. It is the sequence of events which takes place :

(Jane Doe => Sales Dept =>Billing Dept =>…)

Is your software capturing sequences?

If not, how would you go about doing it?

And once captured, how do you represent it in a topic map?

PS: I would have isolated more segments in the sequence. How about you?