Gene mining: a novel and powerful ensemble decision approach to hunting for disease genes using microarray expression profiling

Cited by: 76
Authors
Li, X
Rao, SQ
Wang, YD
Gong, BS
Affiliations
[1] Cleveland Clin Fdn, Dept Mol Cardiol, Cleveland, OH 44195 USA
[2] Harbin Med Univ, Dept Biomed Engn Biomath & Bioinformat, Harbin 150086, Peoples R China
[3] Harbin Inst Technol, Dept Comp Sci, Harbin 150001, Peoples R China
[4] Cleveland Clin Fdn, Dept Cardiovasc Med, Cleveland, OH 44195 USA
Funding
National Natural Science Foundation of China
DOI
10.1093/nar/gkh563
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Current applications of microarrays focus on precise classification or discovery of biological types, for example tumor versus normal phenotypes in cancer research. Several challenging scientific tasks in the post-genomic epoch, like hunting for the genes underlying complex diseases from genome-wide gene expression profiles and thereby building the corresponding gene networks, are largely overlooked because of the lack of an efficient analysis approach. We have thus developed an innovative ensemble decision approach, which can efficiently perform multiple gene mining tasks. An application of this approach to analyze two publicly available data sets (colon data and leukemia data) identified 20 highly significant colon cancer genes and 23 highly significant molecular signatures for refining the acute leukemia phenotype, most of which have been verified either by biological experiments or by alternative analysis approaches. Furthermore, the globally optimal gene subsets identified by the novel approach have so far achieved the highest accuracy for classification of colon cancer tissue types. Establishment of this analysis strategy has offered the promise of advancing microarray technology as a means of deciphering the involved genetic complexities of complex diseases.
Pages: 2685-2694
Page count: 10