Automated annotation of microbial proteomes in SWISS-PROT

Cited by: 96
Authors
Gattiker, A
Michoud, K
Rivoire, C
Auchincloss, AH
Coudert, E
Lima, T
Kersey, P
Pagni, M
Sigrist, CJA
Lachaize, C
Veuthey, AL
Gasteiger, E
Bairoch, A
Affiliations
[1] Swiss Inst Bioinformat, SWISS PROT Grp, CH-1211 Geneva 4, Switzerland
[2] EMBL, European Bioinformat Inst, Cambridge CB10 1SD, England
[3] Swiss Inst Expt Canc Res ISREC, Swiss Inst Bioinformat, CH-1066 Epalinges, Switzerland
Keywords
protein sequence database; automatic annotation; complete genome; feature propagation; protein family; ORFans;
DOI
10.1016/S1476-9271(02)00094-4
Chinese Library Classification number
Q [Biological sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Large-scale sequencing of prokaryotic genomes demands the automation of certain annotation tasks currently manually performed in the production of the SWISS-PROT protein knowledgebase. The HAMAP project, or 'High-quality Automated and Manual Annotation of microbial Proteomes', aims to integrate manual and automatic annotation methods in order to enhance the speed of the curation process while preserving the quality of the database annotation. Automatic annotation is only applied to entries that belong to manually defined orthologous families and to entries with no identifiable similarities (ORFans). Many checks are enforced in order to prevent the propagation of wrong annotation and to spot problematic cases, which are channelled to manual curation. The results of this annotation are integrated in SWISS-PROT, and a website is provided at http://www.expasy.org/sprot/hamap/. © 2002 Elsevier Science Ltd. All rights reserved.
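The routing rule stated in the abstract can be illustrated with a minimal Python sketch. It is not the HAMAP implementation: the `Entry` fields, the `Route` names, and the single `checks_passed` flag are hypothetical simplifications introduced here, and the actual checks are not described in this record.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Route(Enum):
    AUTO_FAMILY = "automatic annotation (family member)"
    AUTO_ORFAN = "automatic annotation (ORFan)"
    MANUAL = "manual curation"


@dataclass
class Entry:
    # All fields are illustrative assumptions, not SWISS-PROT fields.
    accession: str
    family: Optional[str]   # manually defined orthologous family, if any
    has_similarity: bool    # any identifiable similarity to other proteins
    checks_passed: bool     # stand-in for the many checks the abstract mentions


def route(entry: Entry) -> Route:
    """Decide how an entry is handled, following the abstract's rule."""
    if not entry.checks_passed:
        # Problematic cases are channelled to manual curation.
        return Route.MANUAL
    if entry.family is not None:
        # Member of a manually defined orthologous family.
        return Route.AUTO_FAMILY
    if not entry.has_similarity:
        # No identifiable similarities: an ORFan.
        return Route.AUTO_ORFAN
    # Similar to something, but not in a curated family: not auto-annotated.
    return Route.MANUAL
```

In this sketch an entry that resembles other proteins but belongs to no curated family falls through to manual curation, which matches the statement that automatic annotation is applied only to family members and ORFans.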
Pages: 49-58
Number of pages: 10