Statistical alignment based on fragment insertion and deletion models

Cited by: 39
Author
Metzler, D [1]
Affiliation
[1] Goethe Univ Frankfurt, Fachbereich Math, D-6000 Frankfurt, Germany
Keywords
DOI
10.1093/bioinformatics/btg026
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: This paper addresses the estimation of alignments and mutation rates based on stochastic sequence-evolution models that allow insertions and deletions of subsequences ('fragments') and not just of single bases. The proposed model is a variant of a model introduced by Thorne et al. (J. Mol. Evol., 34, 3-16, 1992). The computational tractability of the model depends on certain restrictions on the insertion/deletion process; the possible effects of these restrictions are discussed.

Results: The process of fragment insertion and deletion in the sequence-evolution model induces a hidden Markov structure at the level of alignments and thus makes efficient statistical alignment algorithms possible. As an example, a sampling procedure is applied to assess the variability in alignment and mutation parameter estimates for HVR1 sequences of human and orangutan, improving on the results of previous work. Simulation studies give evidence that estimation methods based on the proposed model also give satisfactory results when applied to data for which the restrictions on the insertion/deletion process do not hold.
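
To illustrate why a hidden Markov structure at the level of alignments leads to efficient algorithms, the following minimal sketch implements the forward algorithm for a generic three-state pair HMM (match, gap in the second sequence, gap in the first sequence), in the textbook style of Durbin et al. (1998). It is not the fragment insertion/deletion model of the paper. The function names, the transition parameters delta, eps and tau, and the emission probabilities are all illustrative assumptions.

```python
def make_emissions(alphabet="ACGT", p_same=0.22):
    """Hypothetical emission tables: joint pair probabilities and background."""
    k = len(alphabet)
    # Spread the remaining mass evenly over the k*(k-1) mismatching pairs.
    p_diff = (1.0 - k * p_same) / (k * (k - 1))
    p_match = {(a, b): (p_same if a == b else p_diff)
               for a in alphabet for b in alphabet}
    q = {a: 1.0 / k for a in alphabet}
    return p_match, q


def pair_hmm_forward(x, y, p_match, q, delta=0.1, eps=0.3, tau=0.05):
    """Probability of the pair (x, y), summed over all alignments.

    delta: gap-open probability, eps: gap-extend probability,
    tau: probability of ending (all three are assumed values).
    Runs in O(len(x) * len(y)) time.
    """
    n, m = len(x), len(y)
    fM = [[0.0] * (m + 1) for _ in range(n + 1)]  # ends in a match column
    fX = [[0.0] * (m + 1) for _ in range(n + 1)]  # ends with x[i] against a gap
    fY = [[0.0] * (m + 1) for _ in range(n + 1)]  # ends with y[j] against a gap
    fM[0][0] = 1.0  # the begin state behaves like the match state

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0 and j > 0:
                fM[i][j] = p_match[(x[i - 1], y[j - 1])] * (
                    (1 - 2 * delta - tau) * fM[i - 1][j - 1]
                    + (1 - eps - tau) * (fX[i - 1][j - 1] + fY[i - 1][j - 1])
                )
            if i > 0:
                fX[i][j] = q[x[i - 1]] * (delta * fM[i - 1][j] + eps * fX[i - 1][j])
            if j > 0:
                fY[i][j] = q[y[j - 1]] * (delta * fM[i][j - 1] + eps * fY[i][j - 1])

    return tau * (fM[n][m] + fX[n][m] + fY[n][m])


if __name__ == "__main__":
    p_match, q = make_emissions()
    print(pair_hmm_forward("ACGT", "AGT", p_match, q))
```

The dynamic program sums over exponentially many alignments in quadratic time, which is the kind of efficiency the abstract attributes to the hidden Markov structure. Gap lengths in this sketch are geometric; the paper's own model and parameterization differ.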
Pages: 490-499
Page count: 10