Knowledge-based analysis of microarray gene expression data by using support vector machines

Cited by: 1495
Authors
Brown, MPS
Grundy, WN
Lin, D
Cristianini, N
Sugnet, CW
Furey, TS
Ares, M
Haussler, D
Affiliations
[1] Univ Calif Santa Cruz, Dept Comp Sci, Santa Cruz, CA 95064 USA
[2] Univ Calif Santa Cruz, Ctr Mol Biol RNA, Dept Biol, Santa Cruz, CA 95064 USA
[3] Columbia Univ, Dept Comp Sci, New York, NY 10025 USA
[4] Univ Bristol, Dept Engn Math, Bristol BS8 1TR, Avon, England
DOI
10.1073/pnas.97.1.262
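Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract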
We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.
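The abstract's point about flexibility in choosing a similarity function can be illustrated with a minimal sketch. The three kernels below (dot product, polynomial, Gaussian radial basis) are common choices assumed here for illustration; the abstract does not name the specific metrics tested. The expression profiles and the function names `dot_kernel`, `poly_kernel`, and `rbf_kernel` are hypothetical.

```python
import math

def dot_kernel(x, y):
    """Plain dot product between two expression vectors."""
    return sum(a * b for a, b in zip(x, y))

def poly_kernel(x, y, degree=2):
    """Polynomial similarity: (x . y + 1) ** degree."""
    return (dot_kernel(x, y) + 1.0) ** degree

def rbf_kernel(x, y, width=1.0):
    """Gaussian (radial basis) similarity based on squared distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))

# Hypothetical log-ratio expression profiles across four conditions
# (made-up values, not data from the paper).
gene_a = [1.0, 0.5, -0.5, -1.0]
gene_b = [0.9, 0.6, -0.4, -1.1]   # profile similar to gene_a
gene_c = [-1.0, -0.5, 0.5, 1.0]   # profile opposite to gene_a

# Compare how each similarity function scores a similar pair
# versus an opposite pair of profiles.
for name, kernel in [("dot", dot_kernel), ("poly", poly_kernel), ("rbf", rbf_kernel)]:
    similar = kernel(gene_a, gene_b)
    opposite = kernel(gene_a, gene_c)
    print(name, round(similar, 3), round(opposite, 3))
```

Any such function could be passed to a kernel-based classifier in place of another, which is the sense in which the choice of similarity measure is flexible. Note that the even-degree polynomial kernel scores the opposite profile as positive, so the choice of kernel affects which genes count as alike.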
Pages: 262-267
Number of pages: 6