Using computational predictions to improve literature-based Gene Ontology annotations: a feasibility study

Cited by: 9
Authors
Costanzo, Maria C. [1 ]
Park, Julie [1 ]
Balakrishnan, Rama [1 ]
Cherry, J. Michael [1 ]
Hong, Eurie L. [1 ]
Affiliations
[1] Stanford Univ, Dept Genet, Sch Med, Stanford, CA 94305 USA
Source
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION | 2011
Funding
National Institutes of Health (USA);
Keywords
MANUAL CURATION; INTERPRO; GENOME;
DOI
10.1093/database/bar004
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Annotation using Gene Ontology (GO) terms is one of the most important ways in which biological information about specific gene products can be expressed in a searchable, computable form that may be compared across genomes and organisms. Because literature-based GO annotations are often used to propagate functional predictions between related proteins, their accuracy is critically important. We present a strategy that employs a comparison of literature-based annotations with computational predictions to identify and prioritize genes whose annotations need review. Using this method, we show that comparison of manually assigned 'unknown' annotations in the Saccharomyces Genome Database (SGD) with InterPro-based predictions can identify annotations that need to be updated. A survey of literature-based annotations and computational predictions made by the Gene Ontology Annotation (GOA) project at the European Bioinformatics Institute (EBI) across several other databases shows that this comparison strategy could be used to maintain and improve the quality of GO annotations for other organisms besides yeast. The survey also shows that although GOA-assigned predictions are the most comprehensive source of functional information for many genomes, a large proportion of genes in a variety of different organisms entirely lack these predictions but do have manual annotations. This underscores the critical need for manually performed, literature-based curation to provide functional information about genes that are outside the scope of widely used computational methods. Thus, the combination of manual and computational methods is essential to provide the most accurate and complete functional annotation of a genome.
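The comparison strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a manual 'unknown' annotation is recorded as an annotation to the root term of a GO aspect, that annotations are available as (gene, aspect, GO ID) tuples, and that genes are prioritized by how many specific predictions they have. The function name `flag_for_review`, the input layout, and the ranking rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Root terms of the three GO aspects (molecular function, biological
# process, cellular component). Assumption of this sketch: a manual
# annotation directly to a root term means "unknown".
ROOT_TERMS = {"F": "GO:0003674", "P": "GO:0008150", "C": "GO:0005575"}


def flag_for_review(manual, predicted):
    """Find 'unknown' manual annotations contradicted by predictions.

    manual, predicted: iterables of (gene, aspect, go_id) tuples
    (a hypothetical layout). Returns a list of ((gene, aspect), terms)
    pairs, with the most-predicted genes first.
    """
    # Collect predictions that are more specific than the root term.
    specific = defaultdict(set)
    for gene, aspect, go_id in predicted:
        if go_id != ROOT_TERMS[aspect]:
            specific[(gene, aspect)].add(go_id)

    # Keep only gene/aspect pairs that are manually 'unknown' but have
    # at least one specific computational prediction.
    flagged = {}
    for gene, aspect, go_id in manual:
        key = (gene, aspect)
        if go_id == ROOT_TERMS[aspect] and key in specific:
            flagged[key] = specific[key]

    # Hypothetical prioritization: more predictions first, then by name.
    return sorted(flagged.items(), key=lambda kv: (-len(kv[1]), kv[0]))
```

Each flagged pair is a candidate for curator review rather than an automatic update, since a prediction may itself be wrong or too general.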
Pages: 8