Automated system for gene annotation and metabolic pathway reconstruction using general sequence databases

Cited by: 25
Authors
Alves, Joao M. P. [1 ]
Buck, Gregory A. [1 ]
Affiliations
[1] Virginia Commonwealth Univ, Ctr Study Biol Complexity, Richmond, VA 23284 USA
DOI
10.1002/cbdv.200790212
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010; 081704;
Abstract
Despite the growing number of genomes published or currently being sequenced, there is a relative paucity of software for functional classification of newly discovered genes and their assignment to metabolic pathways. Available software for such analyses has a very steep learning curve and requires the installation, configuration, and maintenance of large amounts of complex infrastructure, including complementary software and databases. Many such tools are restricted to one or a few data sources and classification schemes. In this work, we report an automated system for gene annotation and metabolic pathway reconstruction (ASGARD), which was designed to be powerful and generalizable, yet simple for the biologist to install and run on centralized, commonly available computers. It avoids the requirement for complex resources such as relational databases and web servers, as well as the need for administrator access to the operating system. Our methodology contributes to a more rapid investigation of the potential biochemical capabilities of genes and genomes by the biological researcher, and is useful in biochemical as well as comparative and evolutionary studies of pathways and networks.
Pages: 2593-2602
Page count: 10