Current methods of gene prediction, their strengths and weaknesses

Cited by: 286
Authors
Mathé, C
Sagot, MF
Schiex, T
Rouzé, P
Affiliations
[1] Inst Pharmacol & Biol Struct, UMR 5089, F-31077 Toulouse, France
[2] Univ Lyon 1, INRIA Rhône-Alpes, UMR Biometrie & Biol Evolut 5558, F-69622 Villeurbanne, France
[3] INRA Toulouse, Dept Biometrie & Intelligence Artificielle, F-31326 Castanet Tolosan, France
[4] Univ Ghent, Lab INRA France, B-9000 Ghent, Belgium
DOI
10.1093/nar/gkf543
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
While the genomes of many organisms have been sequenced over the last few years, transforming such raw sequence data into knowledge remains a hard task. A great number of prediction programs have been developed that try to address one part of this problem, which consists of locating the genes along a genome. This paper reviews the existing approaches to predicting genes in eukaryotic genomes and underlines their intrinsic advantages and limitations. The main mathematical models and computational algorithms adopted are also briefly described and the resulting software classified according to both the method and the type of evidence used. Finally, the several difficulties and pitfalls encountered by the programs are detailed, showing that improvements are needed and that new directions must be considered.
Pages: 4103-4117
Page count: 15