Ab initio gene identification:: prokaryote genome annotation with GeneScan and GLIMMER

被引:61
作者
Aggarwal, G [1 ]
Ramaswamy, R [1 ]
机构
[1] Jawaharlal Nehru Univ, Sch Phys Sci, New Delhi 110067, India
关键词
GLIMMER; GeneScan;
D O I
10.1007/BF02703679
中图分类号
Q [生物科学];
学科分类号
07 ; 0710 ; 09 ;
摘要
We compare the annotation of three complete genomes using the ab initio methods of gene identification GeneScan and GLIMMER. The annotation given in GenBank, the standard against which these are compared, has been made using GeneMark. We find a number of novel genes which are predicted by both methods used here, as well as a number of genes that are predicted by GeneMark, but are not identified by either of the nonconsensus methods that we have used. The three organisms studied here are all prokaryotic species with fairly compact genomes. The Fourier measure forms the basis for an efficient non-consensus method for gene prediction, and the algorithm GencScan exploits this measure. We have bench-marked this program as well as GLIMMER using 3 complete prokaryotic genomes. An effort has also been made to study the limitations of these techniques for complete genome analysis. GeneScan and GLIMMER are of comparable accuracy insofar as gene-identification is concerned, with sensitivities and specificities typically greater than 0.9. The number of false predictions (both positive and negative) is higher for GeneScan as compared to GLIMMER, but in a significant number of cases, similar results are provided by the two techniques. This suggests that there could be some as-yet unidentified additional genes in these three genomes, and also that some of the putative identifications made hitherto might require re-evaluation. All these cases are discussed in detail.
引用
收藏
页码:7 / 14
页数:8
相关论文
共 21 条
[1]   Self-identification of protein-coding regions in microbial genomes [J].
Audic, S ;
Claverie, JM .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (17) :10026-10031
[2]  
Bhattacharya A, 1999, CURR SCI INDIA, V77, P564
[3]   Identification of parasitic genes by computational methods [J].
Bhattacharya, A ;
Bhattacharya, S ;
Joshi, A ;
Ramachandran, S ;
Ramaswamy, R .
PARASITOLOGY TODAY, 2000, 16 (03) :127-131
[4]   GENMARK - PARALLEL GENE RECOGNITION FOR BOTH DNA STRANDS [J].
BORODOVSKY, M ;
MCININCH, J .
COMPUTERS & CHEMISTRY, 1993, 17 (02) :123-133
[5]   Prediction of complete gene structures in human genomic DNA [J].
Burge, C ;
Karlin, S .
JOURNAL OF MOLECULAR BIOLOGY, 1997, 268 (01) :78-94
[6]   Evaluation of gene structure prediction programs [J].
Burset, M ;
Guigo, R .
GENOMICS, 1996, 34 (03) :353-367
[7]   Computational methods for the identification of genes in vertebrate genomic sequences [J].
Claverie, JM .
HUMAN MOLECULAR GENETICS, 1997, 6 (10) :1735-1744
[8]   Improved microbial gene identification with GLIMMER [J].
Delcher, AL ;
Harmon, D ;
Kasif, S ;
White, O ;
Salzberg, SL .
NUCLEIC ACIDS RESEARCH, 1999, 27 (23) :4636-4641
[9]   The DNA sequence of human chromosome 22 [J].
Dunham, I ;
Shimizu, N ;
Roe, BA ;
Chissoe, S ;
Dunham, I ;
Hunt, AR ;
Collins, JE ;
Bruskiewich, R ;
Beare, DM ;
Clamp, M ;
Smink, LJ ;
Ainscough, R ;
Almeida, JP ;
Babbage, A ;
Bagguley, C ;
Balley, J ;
Barlow, K ;
Bates, KN ;
Beasley, O ;
Bird, CP ;
Blakey, S ;
Bridgeman, AM ;
Buck, D ;
Burgess, J ;
Burrill, WD ;
Burton, J ;
Carder, C ;
Carter, NP ;
Chen, Y ;
Clark, G ;
Clegg, SM ;
Cobley, V ;
Cole, CG ;
Collier, RE ;
Connor, RE ;
Conroy, D ;
Corby, N ;
Coville, GJ ;
Cox, AV ;
Davis, J ;
Dawson, E ;
Dhami, PD ;
Dockree, C ;
Dodsworth, SJ ;
Durbin, RM ;
Ellington, A ;
Evans, KL ;
Fey, JM ;
Fleming, K ;
French, L .
NATURE, 1999, 402 (6761) :489-495
[10]   The gene identification problem: An overview for developers [J].
Fickett, JW .
COMPUTERS & CHEMISTRY, 1996, 20 (01) :103-118