Predicting phenotype from patterns of annotation

Cited by: 20
Authors
King, Oliver D. [1]
Lee, Jeffrey C. [1]
Dudley, Aimee M. [2]
Janse, Daniel M. [2]
Church, George M. [2]
Roth, Frederick P. [1]
Affiliations
[1] Harvard Univ, Sch Med, Dept Biol Chem & Mol Pharmacol, Boston, MA 02115 USA
[2] Harvard Univ, Sch Med, Dept Genet, Boston, MA 02115 USA
Keywords
decision trees; phenotype; gene function;
DOI
10.1093/bioinformatics/btg1024
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Predicting the outcome of specific experiments (such as the growth of a particular mutant strain in a particular medium) could allow researchers to devote resources to experiments with higher expected numbers of 'hits'.
Results: We use decision trees to predict phenotypes associated with Saccharomyces cerevisiae genes on the basis of Gene Ontology (GO) functional annotations from the Saccharomyces Genome Database (SGD) and other phenotypic annotations from the Yeast Phenotype Catalog at the Munich Information Center for Protein Sequences (MIPS). We assess the methodology in three ways: (1) we use cross-validation on the phenotypic annotations listed in MIPS and show ROC curves indicating the tradeoff between true-positive rate and false-positive rate; (2) we perform a literature search for 100 of the predicted gene-phenotype associations that are not listed in MIPS and find evidence for 43 of them; (3) we use deletion strains to experimentally assess 61 predicted gene-phenotype associations not listed in MIPS; significantly more of these deletion strains show abnormal growth than would be expected by chance.
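The ROC assessment mentioned in step (1) can be illustrated with a minimal sketch. It assumes each held-out gene has a binary label (phenotype annotated or not) and a predicted score from some classifier; the function names `roc_points` and `auc`, and the toy labels and scores, are hypothetical and are not taken from the paper.

```python
def roc_points(labels, scores):
    """Return (false-positive rate, true-positive rate) points obtained by
    sweeping a decision threshold from the highest score to the lowest."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Sort examples by descending score so each step lowers the threshold.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all examples tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (hypothetical): 1 = phenotype annotated, 0 = not annotated.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_curve = roc_points(toy_labels, toy_scores)
toy_auc = auc(toy_curve)
```

Each point on the curve corresponds to one score threshold, so moving along the curve trades a higher true-positive rate for a higher false-positive rate, which is the tradeoff the abstract refers to.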
Pages: i183-i189
Page count: 7