Gene mining: a novel and powerful ensemble decision approach to hunting for disease genes using microarray expression profiling

Cited by: 76
Authors
Li, X
Rao, SQ
Wang, YD
Gong, BS
Affiliations
[1] Cleveland Clin Fdn, Dept Mol Cardiol, Cleveland, OH 44195 USA
[2] Harbin Med Univ, Dept Biomed Engn Biomath & Bioinformat, Harbin 150086, Peoples R China
[3] Harbin Inst Technol, Dept Comp Sci, Harbin 150001, Peoples R China
[4] Cleveland Clin Fdn, Dept Cardiovasc Med, Cleveland, OH 44195 USA
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1093/nar/gkh563
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010 ; 081704 ;
Abstract
Current applications of microarrays focus on precise classification or discovery of biological types, for example tumor versus normal phenotypes in cancer research. Several challenging scientific tasks in the post-genomic epoch, like hunting for the genes underlying complex diseases from genome-wide gene expression profiles and thereby building the corresponding gene networks, are largely overlooked because of the lack of an efficient analysis approach. We have thus developed an innovative ensemble decision approach, which can efficiently perform multiple gene mining tasks. An application of this approach to analyze two publicly available data sets (colon data and leukemia data) identified 20 highly significant colon cancer genes and 23 highly significant molecular signatures for refining the acute leukemia phenotype, most of which have been verified either by biological experiments or by alternative analysis approaches. Furthermore, the globally optimal gene subsets identified by the novel approach have so far achieved the highest accuracy for classification of colon cancer tissue types. Establishment of this analysis strategy has offered the promise of advancing microarray technology as a means of deciphering the involved genetic complexities of complex diseases.
Pages: 2685-2694
Number of pages: 10