Multi-class cancer classification using multinomial probit regression with Bayesian gene selection

Cited by: 18
Authors
Zhou, X
Wang, X [1]
Dougherty, ER
Affiliations
[1] Texas A&M Univ, Dept Elect Engn, College Stn, TX 77843 USA
[2] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
[3] Univ Texas, MD Anderson Canc Ctr, Dept Pathol, Houston, TX 77030 USA
Source
IEE PROCEEDINGS SYSTEMS BIOLOGY | 2006 / Vol. 153 / No. 02
DOI
10.1049/ip-syb:20050015
Chinese Library Classification number
Q2 [Cell biology];
Discipline classification code
071009 [Cell biology]; 090102 [Crop genetics and breeding];
Abstract
We consider the problem of multi-class cancer classification from gene expression data. After discussing the multinomial probit regression model with Bayesian gene selection, we propose two Bayesian gene selection schemes: one employs different strongest genes for different probit regressions; the other employs the same strongest genes for all regressions. Some fast implementation issues for Bayesian gene selection are discussed, including preselection of the strongest genes and recursive computation of the estimation errors using QR decomposition. The proposed gene selection techniques are applied to analyse real breast cancer data, small round blue-cell tumours, the National Cancer Institute's anti-cancer drug-screen data and acute leukaemia data. Compared with existing multi-class cancer classification methods, our proposed methods can identify which genes are most important for which kind of cancer. The strongest genes selected using our methods are also consistent with their biological significance, and the proposed methods achieve very high recognition accuracies.
Pages: 70-78
Page count: 9
Related papers
21 in total
[1]
BAYESIAN-ANALYSIS OF BINARY AND POLYCHOTOMOUS RESPONSE DATA [J].
ALBERT, JH ;
CHIB, S .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1993, 88 (422) :669-679
[2]
Knowledge-based analysis of microarray gene expression data by using support vector machines [J].
Brown, MPS ;
Grundy, WN ;
Lin, D ;
Cristianini, N ;
Sugnet, CW ;
Furey, TS ;
Ares, M ;
Haussler, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (01) :262-267
[3]
CHIPMAN H, 2001, IMS LECT NOTES MONOG, V38, P117
[4]
Comparison of discrimination methods for the classification of tumors using gene expression data [J].
Dudoit, S ;
Fridlyand, J ;
Speed, TP .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2002, 97 (457) :77-87
[5]
Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring [J].
Golub, TR ;
Slonim, DK ;
Tamayo, P ;
Huard, C ;
Gaasenbeek, M ;
Mesirov, JP ;
Coller, H ;
Loh, ML ;
Downing, JR ;
Caligiuri, MA ;
Bloomfield, CD ;
Lander, ES .
SCIENCE, 1999, 286 (5439) :531-537
[6]
Gene selection for cancer classification using support vector machines [J].
Guyon, I ;
Weston, J ;
Barnhill, S ;
Vapnik, V .
MACHINE LEARNING, 2002, 46 (1-3) :389-422
[7]
Gene-expression profiles in hereditary breast cancer [J].
Hedenfalk, I ;
Duggan, D ;
Chen, YD ;
Radmacher, M ;
Bittner, M ;
Simon, R ;
Meltzer, P ;
Gusterson, B ;
Esteller, M ;
Kallioniemi, OP ;
Wilfond, B ;
Borg, Å ;
Trent, J ;
Raffeld, M ;
Yakhini, Z ;
Ben-Dor, A ;
Dougherty, E ;
Kononen, J ;
Bubendorf, L ;
Fehrle, W ;
Pittaluga, S ;
Gruvberger, S ;
Loman, N ;
Johannsson, O ;
Olsson, H ;
Sauter, G .
NEW ENGLAND JOURNAL OF MEDICINE, 2001, 344 (08) :539-548
[8]
Exploring expression data: Identification and analysis of coexpressed genes [J].
Heyer, LJ ;
Kruglyak, S ;
Yooseph, S .
GENOME RESEARCH, 1999, 9 (11) :1106-1115
[9]
A Bayesian analysis of the multinomial probit model using marginal data augmentation [J].
Imai, K ;
van Dyk, DA .
JOURNAL OF ECONOMETRICS, 2005, 124 (02) :311-334
[10]
JORNSTEN R, 2003, BIOINFORMATICS