Independent component analysis-based penalized discriminant method for tumor classification using gene expression data

Cited by: 239
Authors
Huang, De-Shuang [1]
Zheng, Chun-Hou [1]
Affiliations
[1] Chinese Acad Sci, Inst Intelligent Machines, Intelligent Comp Lab, Hefei 230031, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1093/bioinformatics/btl190
Chinese Library Classification (CLC)
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Microarrays can measure the expression levels of thousands of genes simultaneously. One important application of gene expression data is the classification of samples into categories. Combined with classification methods, this technology can support clinical management decisions for individual patients, e.g. in oncology. Standard statistical methodologies for classification or prediction do not work well when the number of variables p (genes) far exceeds the number of samples n. Existing statistical methodologies must therefore be modified, or new ones developed, for the analysis of microarray data.

Results: This paper proposes a new method for tumor classification using gene expression data. The method first employs independent component analysis to model the gene expression data and then applies an optimal scoring algorithm to classify the samples. The approach has three features. First, it makes full use of the high-order statistical information contained in the gene expression data. Second, it employs regularized regression models to handle large numbers of correlated predictor variables. Finally, the predictive models for classifying tumors are built on the entire gene expression profile. To show the validity of the proposed method, we apply it to classify four DNA microarray datasets involving various human normal and tumor tissue samples. The experimental results show that the method is efficient and feasible.
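
The two-stage idea in the abstract (ICA features first, then a regularized regression on class indicators) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes samples are treated as observations, uses a plain symmetric FastICA with a tanh nonlinearity, and substitutes ridge regression on a class indicator matrix for the paper's penalized optimal scoring step. The synthetic data, the function names (`fit_ica`, `fit_ridge_indicator`, `predict`), and the parameters `k` and `lam` are all hypothetical.

```python
import numpy as np

def fit_ica(X, k, n_iter=200, tol=1e-6, seed=0):
    """Return (mean, K, W): centering vector, whitening map, ICA rotation."""
    n = X.shape[0]
    mean = X.mean(axis=0)
    # Thin SVD of the centered data; keep k leading directions and whiten them.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    K = Vt[:k].T / s[:k] * np.sqrt(n)           # p x k whitening map
    Z = (X - mean) @ K                          # n x k, identity covariance
    # Random orthogonal starting point for the unmixing rotation.
    rng = np.random.default_rng(seed)
    U, _, Vh = np.linalg.svd(rng.standard_normal((k, k)))
    W = U @ Vh
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)
        # FastICA fixed-point update, one row of W per component.
        W_new = G.T @ Z / n - (1.0 - G**2).mean(axis=0)[:, None] * W
        # Symmetric decorrelation: project back onto orthogonal matrices.
        U, _, Vh = np.linalg.svd(W_new)
        W_new = U @ Vh
        delta = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return mean, K, W

def fit_ridge_indicator(S, y, n_classes, lam):
    """Ridge regression of a one-hot class indicator matrix on features S."""
    Y = np.eye(n_classes)[y]
    y_mean, s_mean = Y.mean(axis=0), S.mean(axis=0)
    Sc = S - s_mean
    B = np.linalg.solve(Sc.T @ Sc + lam * np.eye(S.shape[1]),
                        Sc.T @ (Y - y_mean))
    return s_mean, y_mean, B

def predict(X, ica, ridge):
    """Project samples onto the ICA features and pick the highest score."""
    mean, K, W = ica
    s_mean, y_mean, B = ridge
    S = (X - mean) @ K @ W.T
    return np.argmax(y_mean + (S - s_mean) @ B, axis=1)

# Synthetic stand-in for expression data: n samples, p genes, with p >> n.
rng = np.random.default_rng(0)
n, p, k = 40, 500, 3
y = np.repeat([0, 1], n // 2)
sources = rng.laplace(scale=0.5, size=(n, k))   # non-Gaussian latent factors
sources[:, 0] += np.where(y == 1, 3.0, -3.0)    # one factor carries the class
X = sources @ rng.standard_normal((k, p)) + 0.1 * rng.standard_normal((n, p))

ica = fit_ica(X, k)
S = (X - ica[0]) @ ica[1] @ ica[2].T            # n x k ICA features
ridge = fit_ridge_indicator(S, y, n_classes=2, lam=1.0)
pred = predict(X, ica, ridge)
train_acc = (pred == y).mean()
```

Here `train_acc` is a training-set figure on toy data and says nothing about performance on real microarray data; a held-out or cross-validated estimate would be needed for that. Reducing to `k` components before the regression is what keeps the linear system small when p is far larger than n.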
Pages: 1855-1862
Page count: 8