Gene finding with a hidden Markov model of genome structure and evolution

Cited by: 61
Authors
Pedersen, JS
Hein, J
Affiliations
[1] Aarhus Univ, Inst Biol Sci, Dept Genet & Ecol, Bioinformat Res Ctr, DK-8000 Aarhus C, Denmark
[2] Univ Oxford, Dept Stat, Oxford OX1 3SY, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
DOI
10.1093/bioinformatics/19.2.219
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: A growing number of genomes are being sequenced, so differences in evolutionary pattern between functional regions can now be observed genome-wide across a whole set of organisms. These region-specific evolutionary patterns can be exploited in genomic annotation. The modelling of evolution in existing comparative gene finders leaves room for improvement.
Results: A probabilistic model of both genome structure and evolution is designed. This type of model is called an Evolutionary Hidden Markov Model (EHMM) and is composed of an HMM and a set of region-specific evolutionary models based on a phylogenetic tree. All parameters, including the phylogenetic tree, can be estimated by maximum likelihood. The model can handle any number of aligned genomes, using their phylogenetic tree to model the evolutionary correlations. The time complexity of all algorithms used for handling the model is linear in alignment length and in the number of genomes. The model is applied to the problem of gene finding. The benefit of modelling sequence evolution is demonstrated both in a range of simulations and on a set of orthologous human/mouse gene pairs.
Availability: Freely available over the Internet at http://www.birc.dk/Software/evogene
Contact: jsp@daimi.au.dk
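The idea of pairing an HMM with region-specific evolutionary models on a phylogenetic tree can be illustrated with a toy sketch. This is not the authors' software or parameterisation. The two state names ("slow", "fast"), the rate values, the two-leaf tree, the fixed transition probabilities and the Jukes-Cantor substitution model are all assumptions chosen for illustration, and gaps are not handled. Each alignment column is scored per state with Felsenstein's pruning algorithm, and a Viterbi pass then labels the columns.

```python
import math

BASES = "ACGT"

def jc69(a, b, t):
    """Jukes-Cantor probability that base a becomes base b over branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def partial(node, column, rate):
    """Felsenstein pruning: P(observed leaves below node | base at node)."""
    if isinstance(node, str):  # leaf: name of an aligned genome
        return {b: 1.0 if column[node] == b else 0.0 for b in BASES}
    result = {b: 1.0 for b in BASES}
    for child, length in node:  # internal node: tuple of (child, branch length)
        below = partial(child, column, rate)
        for b in BASES:
            result[b] *= sum(jc69(b, c, rate * length) * below[c] for c in BASES)
    return result

def column_likelihood(tree, column, rate):
    """Likelihood of one alignment column under a state-specific rate."""
    p = partial(tree, column, rate)
    return sum(0.25 * p[b] for b in BASES)  # uniform base frequencies at the root

def viterbi(tree, columns, rates, stay=0.9):
    """Most probable state path; one pass over the alignment columns."""
    states = list(rates)
    switch = (1.0 - stay) / (len(states) - 1)

    def log_trans(s, t):
        return math.log(stay if s == t else switch)

    def log_emit(s, col):
        return math.log(column_likelihood(tree, col, rates[s]))

    score = {s: math.log(1.0 / len(states)) + log_emit(s, columns[0]) for s in states}
    back = []
    for col in columns[1:]:
        prev, score, pointers = score, {}, {}
        for t in states:
            best = max(states, key=lambda s: prev[s] + log_trans(s, t))
            pointers[t] = best
            score[t] = prev[best] + log_trans(best, t) + log_emit(t, col)
        back.append(pointers)
    state = max(states, key=score.get)
    path = [state]
    for pointers in reversed(back):  # trace back from the last column
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical two-genome example: a conserved block followed by a diverged block.
tree = (("human", 0.2), ("mouse", 0.2))
rates = {"slow": 0.2, "fast": 2.0}
human = "ACGTACGTAC" + "ACGTACGTAC"
mouse = "ACGTACGTAC" + "CATGCATGCA"
columns = [{"human": h, "mouse": m} for h, m in zip(human, mouse)]
path = viterbi(tree, columns, rates)
```

In this sketch the pruning step visits each branch once per column and the Viterbi pass visits each column once, which mirrors the linear scaling in alignment length and genome number described in the abstract. The abstract states that all parameters can be estimated by maximum likelihood; here they are simply fixed by hand.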
Pages: 219-227
Page count: 9