An empirical study of the effect of sequence alignment on phylogenetic analysis

If you have a question about this talk, please contact Mustapha Amrani.

Phylogenetics

Phylogenetic analyses start with a multiple sequence alignment, which is often accepted as known despite wide recognition that alignment errors may affect downstream phylogenetic analysis. Many phylogenetic methods involve testing which of a range of competing hypotheses best describes the evolution of a set of sequences. These tests may be justified statistically when using the correct alignment, but errors in the alignment lead to non-homologous characters being placed together, which in turn may systematically bias the test. We investigate empirically the impact of different alignment methods on phylogenetic analyses and assess the relative impact of the approximations used by each alignment method.

We examine the effect of alignment on two phylogenetic analyses that are commonly used in computational biology: the inference of a maximum-likelihood tree using RAxML, and a test for positive selection by comparing the M7 and M8 models in PAML. We test 200 sets of sequences from the Adaptive Evolution Database using the popular aligners ClustalW, Muscle, MAFFT, ProbCons, and the phylogenetic aligner Prank. We also sample from the posterior distribution of the statistical aligner BAli-Phy, which enables us to compare the impact of aligner choice with the uncertainty arising from a single aligner.
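As an illustration of the positive-selection test mentioned above, the M7 versus M8 comparison is usually carried out as a likelihood-ratio test on the two maximised log-likelihoods. The sketch below is not taken from the talk: the function name and the example log-likelihood values are hypothetical, and it assumes the conventional choice of a chi-square reference distribution with 2 degrees of freedom (M8 adds two parameters to M7). For 2 degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no external library is needed.

```python
import math

def lrt_m7_vs_m8(lnl_m7, lnl_m8, alpha=0.05):
    """Likelihood-ratio test of M7 (null) against M8 (alternative).

    Assumes a chi-square reference distribution with 2 degrees of
    freedom, a common convention rather than something stated in the talk.
    """
    # Test statistic: twice the log-likelihood gain of the richer model.
    # Clamp at zero in case optimisation noise makes M8 look slightly worse.
    stat = max(0.0, 2.0 * (lnl_m8 - lnl_m7))
    # Chi-square survival function for df = 2 is exactly exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value, p_value < alpha

# Hypothetical log-likelihoods for one gene under one alignment.
stat, p_value, significant = lrt_m7_vs_m8(-1000.0, -995.0)
print(f"2*dlnL = {stat:.2f}, p = {p_value:.4f}, significant = {significant}")
```

Running the same test on alignments of the same sequences produced by different aligners would show whether the conclusion about positive selection depends on aligner choice.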

The algorithmic basis of an aligner tends to determine the outcome of the phylogenetic analysis. For example, trees estimated from progressive aligners tend to be more similar to one another than to those estimated from phylogenetically aware (Prank) or consensus (ProbCons) aligners. Moreover, the spread of phylogenetic parameter estimates inferred from BAli-Phy's posterior distribution of alignments is much smaller than the differences between other aligners, suggesting that the differences between aligners are larger than could be expected by chance. Of the aligners examined, our results suggest that the phylogenetically informed Prank provides the closest approximation to full statistical alignment.
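One common way to quantify how similar two estimated trees are is the Robinson-Foulds distance, which counts the bipartitions (splits) present in one tree but not the other. The abstract does not say which tree-comparison measure was used, so the sketch below is only an assumed illustration: trees are written as nested tuples of leaf names, and the function names and example trees are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def splits(tree):
    """Return the non-trivial bipartitions of a tree, ignoring the root.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the result does not depend on root placement.
    """
    all_leaves = leaves(tree)
    ref = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if ref in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found

def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: number of splits found in only one tree."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))

# Hypothetical trees, e.g. inferred from two different alignments.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(tree_1, tree_2))
```

Computing this distance for every pair of aligners would give a matrix in which aligners sharing an algorithmic basis could be checked for clustering.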

This talk is part of the Isaac Newton Institute Seminar Series.
