Gene classification using expression profiles: A feasibility study

Cited by: 9
Authors
Kuramochi, M [1]
Karypis, G [1]
Affiliations
[1] Univ Minnesota, Dept Comp Sci, Army HPS Res Ctr, Minneapolis, MN 55455 USA
Source
2ND ANNUAL IEEE INTERNATIONAL SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, PROCEEDINGS | 2001
DOI
10.1109/BIBE.2001.974429
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
As various genome sequencing projects have already been completed or are near completion, genome researchers are shifting their focus to functional genomics. Functional genomics represents the next phase, which expands biological investigation to studying the functionality of genes within a single organism as well as studying and correlating the functionality of genes across many different organisms. Recently developed methods for monitoring genome-wide mRNA expression changes hold the promise of allowing us to inexpensively gain insights into the function of unknown genes. In this paper we focus on evaluating the feasibility of using supervised machine learning methods for determining the function of genes based solely on their expression profiles. We experimentally evaluate the performance of traditional classification algorithms such as support vector machines and k-nearest neighbors on the yeast genome, and present new approaches for classification that improve the overall recall with moderate reductions in precision. Our experiments show that the accuracies achieved for different classes vary dramatically. In analyzing these results we show that the accuracy achieved for a class is highly dependent on whether or not the genes of that class were significantly active during the various experimental conditions, suggesting that gene expression profiles can become a viable alternative to sequence similarity searches provided that the genes are observed under a wide range of experimental conditions.
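As a rough illustration of the kind of supervised setup the abstract describes, the sketch below classifies a gene from its expression profile with a k-nearest-neighbor vote and scores predictions by precision and recall. It is not the authors' implementation: the use of Pearson correlation as the similarity measure, the function names, the class labels, and the toy profiles are all assumptions made up for this example, and no data from the paper is used.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a flat (inactive) profile carries no usable signal
    return cov / (norm_x * norm_y)


def knn_predict(train, query, k=3):
    """Majority vote among the k training profiles most correlated with query.

    `train` is a list of (profile, label) pairs.
    """
    ranked = sorted(train, key=lambda item: pearson(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall(true_labels, predicted, positive):
    """Precision and recall for one class, treated as the positive class."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: expression levels over four conditions.
TRAIN = [
    ([0.1, 0.5, 1.0, 1.5], "class_A"),  # rising pattern
    ([0.0, 0.4, 0.9, 1.6], "class_A"),
    ([0.2, 0.6, 1.1, 1.4], "class_A"),
    ([1.5, 1.0, 0.4, 0.1], "class_B"),  # falling pattern
    ([1.4, 0.9, 0.5, 0.0], "class_B"),
    ([1.6, 1.1, 0.3, 0.2], "class_B"),
]

QUERIES = [
    ([0.0, 0.3, 0.8, 1.2], "class_A"),
    ([1.2, 0.8, 0.3, 0.0], "class_B"),
]

predictions = [knn_predict(TRAIN, profile) for profile, _ in QUERIES]
truth = [label for _, label in QUERIES]
print(predictions, precision_recall(truth, predictions, "class_A"))
```

In this sketch a gene whose profile is flat across all conditions has zero correlation with every training profile, so its neighbors are effectively arbitrary. That is one simple way to see why, as the abstract notes, accuracy depends on whether the genes of a class were active under the measured conditions.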
Pages: 191-200
Page count: 10
Related papers
21 in total
[1] [Anonymous], 1999, REPOSIT TU DORTMUND, DOI 10.17877/DE290R-5098
[2] Ashburner, M; Ball, CA; Blake, JA; Botstein, D; Butler, H; Cherry, JM; Davis, AP; Dolinski, K; Dwight, SS; Eppig, JT; Harris, MA; Hill, DP; Issel-Tarver, L; Kasarskis, A; Lewis, S; Matese, JC; Richardson, JE; Ringwald, M; Rubin, GM; Sherlock, G. Gene Ontology: tool for the unification of biology [J]. Nature Genetics, 2000, 25 (01): 25-29
[3] Brown, MPS; Grundy, WN; Lin, D; Cristianini, N; Sugnet, CW; Furey, TS; Ares, M; Haussler, D. Knowledge-based analysis of microarray gene expression data by using support vector machines [J]. Proceedings of the National Academy of Sciences of the United States of America, 2000, 97 (01): 262-267
[4] CORTES C, 1995, MACH LEARN, V20, P273, DOI 10.1023/A:1022627411411
[5] Dasarathy B.V., 1991, IEEE COMPUTER SOC TU
[6] Eisen M, CLUSTER ANAL DISPLAY
[7] Eisen, MB; Spellman, PT; Brown, PO; Botstein, D. Cluster analysis and display of genome-wide expression patterns [J]. Proceedings of the National Academy of Sciences of the United States of America, 1998, 95 (25): 14863-14868
[8] Fodor, SPA; Rava, RP; Huang, XHC; Pease, AC; Holmes, CP; Adams, CL. Multiplexed biochemical assays with biological chips [J]. Nature, 1993, 364 (6437): 555-556
[9] Golub, TR; Slonim, DK; Tamayo, P; Huard, C; Gaasenbeek, M; Mesirov, JP; Coller, H; Loh, ML; Downing, JR; Caligiuri, MA; Bloomfield, CD; Lander, ES. Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring [J]. Science, 1999, 286 (5439): 531-537
[10] HVIDSTEN TR, 2001, P PAC S BIOC HAW JAN