Intrinsic errors in genome annotation

Cited by: 190
Authors
Devos, D [1]
Valencia, A [1]
Affiliations
[1] CSIC, CNB, Prot Design Grp, E-28049 Madrid, Spain
DOI
10.1016/S0168-9525(01)02348-4
Chinese Library Classification number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Genome sequencing is usually followed by routine annotation of protein function based on the assumption that similar sequences will have similar functions. Here, we introduce a simple calculation to estimate the magnitude of any possible annotation errors. We counted the number of discrepancies in the annotation of well-established sets of similar proteins and extrapolated these values to the pairs of similar sequences used for the annotation of different microbial genomes. We conclude that the number of potential errors in the prediction of detailed functions is higher than is usually believed.
Pages: 429-431
Number of pages: 3
Related papers
22 in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] Automated genome sequence analysis and annotation
    Andrade, MA
    Brown, NP
    Leroy, C
    Hoersch, S
    de Daruvar, A
    Reich, C
    Franchini, A
    Tamames, J
    Valencia, A
    Ouzounis, C
    Sander, C
    [J]. BIOINFORMATICS, 1999, 15 (05) : 391 - 412
  • [3] [Anonymous], ENZ NOM
  • [4] Errors in genome annotation
    Brenner, SE
    [J]. TRENDS IN GENETICS, 1999, 15 (04) : 132 - 133
  • [5] Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii
    Bult, CJ
    White, O
    Olsen, GJ
    Zhou, LX
    Fleischmann, RD
    Sutton, GG
    Blake, JA
    FitzGerald, LM
    Clayton, RA
    Gocayne, JD
    Kerlavage, AR
    Dougherty, BA
    Tomb, JF
    Adams, MD
    Reich, CI
    Overbeek, R
    Kirkness, EF
    Weinstock, KG
    Merrick, JM
    Glodek, A
    Scott, JL
    Geoghagen, NSM
    Weidman, JF
    Fuhrmann, JL
    Nguyen, D
    Utterback, TR
    Kelley, JM
    Peterson, JD
    Sadow, PW
    Hanna, MC
    Cotton, MD
    Roberts, KM
    Hurst, MA
    Kaine, BP
    Borodovsky, M
    Klenk, HP
    Fraser, CM
    Smith, HO
    Woese, CR
    Venter, JC
    [J]. SCIENCE, 1996, 273 (5278) : 1058 - 1073
  • [6] Challenging times for bioinformatics
    Casari, G
    Andrade, MA
    Bork, P
    Boyle, J
    Daruvar, A
    Ouzounis, C
    Schneider, R
    Tamames, J
    Valencia, A
    Sander, C
    [J]. NATURE, 1995, 376 (6542) : 647 - 648
  • [7] Re-annotating the Mycoplasma pneumoniae genome sequence: adding value, function and reading frames
    Dandekar, T
    Huynen, M
    Regula, JT
    Ueberle, B
    Zimmermann, CU
    Andrade, MA
    Doerks, T
    Sánchez-Pulido, L
    Snel, B
    Suyama, M
    Yuan, YP
    Herrmann, R
    Bork, P
    [J]. NUCLEIC ACIDS RESEARCH, 2000, 28 (17) : 3278 - 3288
  • [8] Devos D, 2000, PROTEINS, V41, P98, DOI 10.1002/1097-0134(20001001)41:1<98::AID-PROT120>3.0.CO;2-S
  • [9] Whole-genome random sequencing and assembly of Haemophilus influenzae Rd
    Fleischmann, RD
    Adams, MD
    White, O
    Clayton, RA
    Kirkness, EF
    Kerlavage, AR
    Bult, CJ
    Tomb, JF
    Dougherty, BA
    Merrick, JM
    McKenney, K
    Sutton, G
    FitzHugh, W
    Fields, C
    Gocayne, JD
    Scott, J
    Shirley, R
    Liu, LI
    Glodek, A
    Kelley, JM
    Weidman, JF
    Phillips, CA
    Spriggs, T
    Hedblom, E
    Cotton, MD
    Utterback, TR
    Hanna, MC
    Nguyen, DT
    Saudek, DM
    Brandon, RC
    Fine, LD
    Fritchman, JL
    Fuhrmann, JL
    Geoghagen, NSM
    Gnehm, CL
    McDonald, LA
    Small, KV
    Fraser, CM
    Smith, HO
    Venter, JC
    [J]. SCIENCE, 1995, 269 (5223) : 496 - 512