On Bayesian classification with Laplace priors

Cited by: 36
Authors
Kaban, Ata [1]
Affiliation
[1] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
Keywords
Laplace prior; variational Bayes; sparsity; shrinkage effect; predictive features; microarray gene expressions
DOI
10.1016/j.patrec.2007.02.010
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a new classification approach, using a variational Bayesian estimation of probit regression with Laplace priors. Laplace priors have previously been used extensively as a sparsity-inducing mechanism to perform feature selection simultaneously with classification or regression. However, contrary to the 'myth' of sparse Bayesian learning with Laplace priors, we find that the sparsity effect is due to a property of the maximum a posteriori (MAP) parameter estimates only. The Bayesian estimates, in turn, induce a posterior weighting rather than a hard selection of features, which has different advantageous properties: (1) it provides better estimates of the prediction uncertainty; (2) it is able to retain correlated features favouring generalisation; (3) it is more stable with respect to the hyperparameter choice; and (4) it produces a weight-based ranking of the features, suited for interpretation. We analyse the behaviour of the Bayesian estimate in comparison with its MAP counterpart, as well as other related models, (a) through a graphical interpretation of the associated shrinkage and (b) by controlled numerical simulations in a range of testing conditions. The results pinpoint the situations when the advantages of Bayesian estimates are feasible to exploit. Finally, we demonstrate the working of our method in a gene expression classification task. (C) 2007 Elsevier B.V. All rights reserved.
Pages: 1271-1282
Number of pages: 12