Sparse Bayesian learning and the relevance vector machine

Cited by: 4873
Authors
Tipping, ME [1]
Affiliations
[1] Microsoft Res, Cambridge CB2 3NH, England
DOI
10.1162/15324430152748236
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the 'relevance vector machine' (RVM), a model of identical functional form to the popular and state-of-the-art 'support vector machine' (SVM). We demonstrate that by exploiting a probabilistic Bayesian learning framework, we can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages. These include the benefits of probabilistic predictions, automatic estimation of 'nuisance' parameters, and the facility to utilise arbitrary basis functions (e.g. non-'Mercer' kernels). We detail the Bayesian framework and associated learning algorithm for the RVM, and give some illustrative examples of its application along with some comparative benchmarks. We offer some explanation for the exceptional degree of sparsity obtained, and discuss and demonstrate some of the advantageous features, and potential extensions, of Bayesian relevance learning.
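As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of sparse Bayesian regression with a model linear in its parameters. It assumes a Gaussian noise model, one prior precision per weight, and the re-estimation updates commonly given for the RVM (alpha_i = gamma_i / mu_i^2 with gamma_i = 1 - alpha_i * Sigma_ii). The function names `gaussian_design` and `fit_rvm`, the Gaussian basis and its width, the pruning threshold, the initial values, and the omission of a bias term are all assumptions made for this example, not details taken from this record.

```python
import numpy as np

def gaussian_design(X, centers, width):
    """Design matrix of Gaussian basis functions, one column per centre (1-D inputs)."""
    d2 = (X[:, None] - centers[None, :]) ** 2
    return np.exp(-d2 / width**2)

def fit_rvm(Phi, t, n_iter=500, alpha_cap=1e9):
    """Sketch of sparse Bayesian regression by iterative re-estimation.

    Returns the indices of retained basis functions, the posterior mean and
    covariance of their weights, and the estimated noise precision.
    """
    N, M = Phi.shape
    alpha = np.ones(M)                # one prior precision per weight (assumed start)
    beta = 1.0 / (0.1 * np.var(t))    # noise precision (assumed initial guess)
    keep = np.arange(M)               # basis functions still in the model
    for _ in range(n_iter):
        P = Phi[:, keep]
        # Gaussian posterior over the retained weights
        Sigma = np.linalg.inv(np.diag(alpha[keep]) + beta * P.T @ P)
        mu = beta * Sigma @ P.T @ t
        # gamma_i measures how well-determined weight i is by the data
        gamma = np.clip(1.0 - alpha[keep] * np.diag(Sigma), 1e-12, 1.0)
        alpha[keep] = gamma / np.maximum(mu**2, 1e-30)
        resid = t - P @ mu
        beta = max(N - gamma.sum(), 1e-6) / max(resid @ resid, 1e-12)
        # Prune basis functions whose prior precision has effectively diverged
        survivors = keep[alpha[keep] < alpha_cap]
        if survivors.size == 0:
            break
        keep = survivors
    # Recompute the posterior for the final set of retained basis functions
    P = Phi[:, keep]
    Sigma = np.linalg.inv(np.diag(alpha[keep]) + beta * P.T @ P)
    mu = beta * Sigma @ P.T @ t
    return keep, mu, Sigma, beta

# Toy data: noisy samples of sin(x)/x, one basis function centred on each input
rng = np.random.default_rng(0)
X = np.linspace(-10.0, 10.0, 100)
f_true = np.sinc(X / np.pi)           # np.sinc(z) = sin(pi z)/(pi z)
t = f_true + 0.1 * rng.standard_normal(X.size)

Phi = gaussian_design(X, X, width=3.0)
keep, mu, Sigma, beta = fit_rvm(Phi, t)

# Probabilistic prediction: mean and variance at each training input
pred_mean = Phi[:, keep] @ mu
pred_var = 1.0 / beta + np.sum((Phi[:, keep] @ Sigma) * Phi[:, keep], axis=1)
rmse = np.sqrt(np.mean((pred_mean - f_true) ** 2))
print(f"{keep.size} of {X.size} basis functions kept, RMSE {rmse:.3f}")
```

The basis functions that survive pruning play the role of the "relevance vectors". The noise precision `beta` is estimated alongside the weights rather than set by hand, and `pred_var` gives a predictive variance in addition to the point prediction.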
Pages: 211-244
Page count: 34