MIPS bacterial genomes functional annotation benchmark dataset

Cited by: 10
Authors
Tetko, IV [1]
Brauner, B [1]
Dunger-Kaltenbach, I [1]
Frishman, G [1]
Montrone, C [1]
Fobo, G [1]
Ruepp, A [1]
Antonov, AV [1]
Surmeli, D [1]
Mewes, HW [1]
Affiliations
[1] GSF, Natl Res Ctr Environm & Hlth, Inst Bioinformat MIPS, D-85764 Neuherberg, Germany
DOI
10.1093/bioinformatics/bti380
CLC number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as benchmark) as well as tedious preparatory work to generate sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult.

Results: The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation.
Pages: 2520-2521
Page count: 2
Related papers
10 items in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] Prediction of Saccharomyces cerevisiae protein functional class from functional domain composition
    Cai, YD
    Doig, AJ
    [J]. BIOINFORMATICS, 2004, 20 (08) : 1292 - 1300
  • [3] Predicting gene function in Saccharomyces cerevisiae
    Clare, A.
    King, R. D.
    [J]. BIOINFORMATICS, 2003, 19 : II42 - II49
  • [4] Systematic learning of gene functional classes from DNA array expression data by using multilayer perceptrons
    Mateos, A
    Dopazo, J
    Jansen, R
    Tu, YH
    Gerstein, M
    Stolovitzky, G
    [J]. GENOME RESEARCH, 2002, 12 (11) : 1703 - 1715
  • [5] MIPS: analysis and annotation of proteins from whole genomes
    Mewes, HW
    Amid, C
    Arnold, R
    Frishman, D
    Güldener, U
    Mannhaupt, G
    Münsterkötter, M
    Pagel, P
    Strack, N
    Stümpflen, V
    Warfsmann, J
    Ruepp, A
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : D41 - D44
  • [6] Assessment of genome-wide protein function classification for Drosophila melanogaster
    Mi, HY
    Vandergriff, J
    Campbell, M
    Narechania, A
    Majoros, W
    Lewis, S
    Thomas, PD
    Ashburner, M
    [J]. GENOME RESEARCH, 2003, 13 (09) : 2118 - 2128
  • [7] InterPro, progress and status in 2005
    Mulder, NJ
    Apweiler, R
    Attwood, TK
    Bairoch, A
    Bateman, A
    Binns, D
    Bradley, P
    Bork, P
    Bucher, P
    Cerutti, L
    Copley, R
    Courcelle, E
    Das, U
    Durbin, R
    Fleischmann, W
    Gough, J
    Haft, D
    Harte, N
    Hulo, N
    Kahn, D
    Kanapin, A
    Krestyaninova, M
    Lonsdale, D
    Lopez, R
    Letunic, I
    Madera, M
    Maslen, J
    McDowall, J
    Mitchell, A
    Nikolskaya, AN
    Orchard, S
    Pagni, M
    Ponting, CP
    Quevillon, E
    Selengut, J
    Sigrist, CJA
    Silventoinen, V
    Studholme, DJ
    Vaughan, R
    Wu, CH
    [J]. NUCLEIC ACIDS RESEARCH, 2005, 33 : D201 - D205
  • [8] EXPERT SYSTEM FOR PREDICTING PROTEIN LOCALIZATION SITES IN GRAM-NEGATIVE BACTERIA
    NAKAI, K
    KANEHISA, M
    [J]. PROTEINS-STRUCTURE FUNCTION AND GENETICS, 1991, 11 (02) : 95 - 110
  • [9] Pearson WR, 1996, METHOD ENZYMOL, V266, P227
  • [10] The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes
    Ruepp, A
    Zollner, A
    Maier, D
    Albermann, K
    Hani, J
    Mokrejs, M
    Tetko, I
    Güldener, U
    Mannhaupt, G
    Münsterkötter, M
    Mewes, HW
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 (18) : 5539 - 5545