AutoFACT: An Automatic Functional Annotation and Classification Tool

Cited by: 159
Authors
Koski, LB [1]
Gray, MW
Lang, BF
Burger, G
Affiliations
[1] Univ Montreal, Robert Cedergren Ctr Bioinformat & Genom, Montreal, PQ, Canada
[2] Dalhousie Univ, Dept Biochem & Mol Biol, Halifax, NS, Canada
DOI
10.1186/1471-2105-6-151
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Assignment of function to new molecular sequence data is an essential step in genomics projects. The usual process involves similarity searches of a given sequence against one or more databases, an arduous process for large datasets. Results: We present AutoFACT, a fully automated and customizable annotation tool that assigns biologically informative functions to a sequence. Key features of this tool are that it (1) analyzes nucleotide and protein sequence data; (2) determines the most informative functional description by combining multiple BLAST reports from several user-selected databases; (3) assigns putative metabolic pathways, functional classes, enzyme classes, GeneOntology terms and locus names; and (4) generates output in HTML, text and GFF formats for the user's convenience. We have compared AutoFACT to four well-established annotation pipelines. The error rate of functional annotation is estimated to be only 1-2%. Comparison of AutoFACT to the traditional top-BLAST-hit annotation method shows that our procedure increases the number of functionally informative annotations by approximately 50%. Conclusion: AutoFACT will serve as a useful annotation tool for smaller sequencing groups lacking dedicated bioinformatics staff. It is implemented in Perl and runs on Linux/UNIX platforms. AutoFACT is available at http://megasun.bch.umontreal.ca/Software/AutoFACT.htm.
Pages: 11
Related papers
27 in total
[1]   Automatic annotation of protein function based on family identification [J].
Abascal, F ;
Valencia, A .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2003, 53 (03) :683-692
[2]   A System for Automated Bacterial (genome) Integrated Annotation - SABIA [J].
Almeida, LGP ;
Paixao, R ;
Souza, RC ;
da Costa, GC ;
Barrientos, FJA ;
dos Santos, MT ;
de Almeida, DF ;
Vasconcelos, ATR .
BIOINFORMATICS, 2004, 20 (16) :2832-2833
[3]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[4]   The genome sequence of Rickettsia prowazekii and the origin of mitochondria [J].
Andersson, SGE ;
Zomorodipour, A ;
Andersson, JO ;
Sicheritz-Pontén, T ;
Alsmark, UCM ;
Podowski, RM ;
Näslund, AK ;
Eriksson, AS ;
Winkler, HH ;
Kurland, CG .
NATURE, 1998, 396 (6707) :133-140
[5]   Automated genome sequence analysis and annotation [J].
Andrade, MA ;
Brown, NP ;
Leroy, C ;
Hoersch, S ;
de Daruvar, A ;
Reich, C ;
Franchini, A ;
Tamames, J ;
Valencia, A ;
Ouzounis, C ;
Sander, C .
BIOINFORMATICS, 1999, 15 (05) :391-412
[6]  
Apweiler R, 2004, NUCLEIC ACIDS RES, V32, pD115, DOI [10.1093/nar/gkw1099, 10.1093/nar/gkh131]
[7]   PipeOnline 2.0: automated EST processing and functional data sorting [J].
Ayoubi, P ;
Jin, XJ ;
Leite, S ;
Liu, XH ;
Martajaja, J ;
Abduraham, A ;
Wan, QL ;
Yan, W ;
Misawa, E ;
Prade, RA .
NUCLEIC ACIDS RESEARCH, 2002, 30 (21) :4761-4769
[8]  
Barrett Alan J., 1997, European Journal of Biochemistry, V250, P1
[9]  
Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI [10.1093/nar/gkp985, 10.1093/nar/gkr1065, 10.1093/nar/gkh121]
[10]   GeneWise and genomewise [J].
Birney, E ;
Clamp, M ;
Durbin, R .
GENOME RESEARCH, 2004, 14 (05) :988-995