Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments

Cited by: 2585
Authors
Haas, Brian J. [1 ,2 ]
Salzberg, Steven L. [3 ]
Zhu, Wei [1 ]
Pertea, Mihaela [3 ]
Allen, Jonathan E. [3 ,4 ]
Orvis, Joshua [1 ,5 ]
White, Owen [1 ]
Buell, C. Robin [1 ,6 ]
Wortman, Jennifer R. [1 ,5 ]
Affiliations
[1] J Craig Venter Inst, Inst Genom Res, Rockville, MD 20850 USA
[2] MIT, Broad Inst, Cambridge, MA 02142 USA
[3] Univ Maryland, Dept Comp Sci, Ctr Bioinformat & Computat Biol, College Pk, MD 20742 USA
[4] Lawrence Livermore Natl Lab, Computat Directorate, Livermore, CA 94550 USA
[5] Univ Maryland, Sch Med, Inst Genom Sci, Baltimore, MD 21201 USA
[6] Michigan State Univ, Dept Plant Biol, E Lansing, MI 48824 USA
Funding
US National Science Foundation
DOI
10.1186/gb-2008-9-1-r7
Chinese Library Classification
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.
Pages: 22