Simultaneous statistical multiple alignment and phylogeny reconstruction

Cited by: 61
Authors
Fleissner, R
Metzler, D
Von Haeseler, A
Affiliations
[1] Univ Idaho, Dept Math, IBEST, Moscow, ID 83844 USA
[2] Goethe Univ Frankfurt, D-60054 Frankfurt, Germany
[3] Univ Dusseldorf, Inst Bioinformat, D-40225 Dusseldorf, Germany
[4] John von Neumann Inst Comp, Forsch Grp Bioinformat, D-52425 Julich, Germany
Keywords
multiple sequence alignment; statistical alignment; TKF model; tree reconstruction
DOI
10.1080/10635150590950371
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Although the reconstruction of phylogenetic trees and the computation of multiple sequence alignments are highly interdependent, these two areas of research lead quite separate lives, the former often making use of stochastic modeling, whereas the latter normally does not. Despite the fact that reasonable insertion and deletion models for sequence pairs were already introduced more than 10 years ago, they have only recently been applied to multiple alignment and only in their simplest version. In this paper we present and discuss a strategy based on simulated annealing, which makes use of these models to infer a phylogenetic tree for a set of DNA or protein sequences together with the sequences' indel history, i.e., their multiple alignment augmented with information about the positioning of insertion and deletion events in the tree. Our method is also the first application of the TKF2 model in the context of multiple sequence alignment. We validate the method via simulations and illustrate it using a data set of primate mtDNA.
Pages: 548-561
Number of pages: 14