Finding function: evaluation methods for functional genomic data

Cited by: 154
Authors
Myers, Chad L.
Barrett, Daniel R.
Hibbs, Matthew A.
Huttenhower, Curtis
Troyanskaya, Olga G. [1]
Affiliations
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
[2] Princeton Univ, Lewis Sigler Inst Integrat Genom, Princeton, NJ 08544 USA
DOI
10.1186/1471-2164-7-187
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Accurate evaluation of the quality of genomic or proteomic data and computational methods is vital to our ability to use them for formulating novel biological hypotheses and directing further experiments. There is currently no standard approach to evaluation in functional genomics. Our analysis of existing approaches shows that they are inconsistent and contain substantial functional biases that render the resulting evaluations misleading both quantitatively and qualitatively. These problems make it essentially impossible to compare computational methods or large-scale experimental datasets and also result in conclusions that generalize poorly in most biological applications.
Results: We reveal issues with current evaluation methods here and suggest new approaches to evaluation that facilitate accurate and representative characterization of genomic methods and data. Specifically, we describe a functional genomics gold standard based on curation by expert biologists and demonstrate its use as an effective means of evaluation of genomic approaches. Our evaluation framework and gold standard are freely available to the community through our website.
Conclusion: Proper methods for evaluating genomic data and computational approaches will determine how much we, as a community, are able to learn from the wealth of available data. We propose one possible solution to this problem here but emphasize that this topic warrants broader community discussion.
Pages: 15