On the rate of convergence of regularized boosting classifiers

Cited by: 36
Authors
Blanchard, G
Lugosi, G
Vayatis, N
Affiliations
[1] Univ Paris 11, CNRS, Math Lab, F-91405 Orsay, France
[2] Pompeu Fabra Univ, Dept Econ, Barcelona 08005, Spain
[3] Univ Paris 06, Lab Probabilities Modeles Aleatoires, F-75252 Paris 05, France
Keywords
classification; boosting; consistency; rates of convergence; decision stumps
DOI
10.1162/1532443041424319
Chinese Library Classification
TP [automation technology; computer technology]
Subject Classification Number
0812
Abstract
A regularized boosting method is introduced, for which regularization is obtained through a penalization function. It is shown through oracle inequalities that this method is model adaptive. The rate of convergence of the probability of misclassification is investigated. It is shown that for quite a large class of distributions, the probability of error converges to the Bayes risk at a rate faster than n^{-(V+2)/(4(V+1))}, where V is the VC dimension of the "base" class whose elements are combined by boosting methods to obtain an aggregated classifier. The dimension-independent nature of the rates may partially explain the good behavior of these methods in practical problems. Under Tsybakov's noise condition the rate of convergence is even faster. We investigate the conditions necessary to obtain such rates for different base classes. The special case of boosting using decision stumps is studied in detail. We characterize the class of classifiers realizable by aggregating decision stumps. It is shown that some versions of boosting work especially well in high-dimensional logistic additive models. It appears that adding limited labelling noise to the training data may in certain cases improve the convergence, as has also been suggested by other authors.
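The abstract's "aggregated classifier" is a weighted combination of base classifiers, here decision stumps. A minimal sketch of that aggregation, using plain AdaBoost-style reweighting rather than the paper's penalized estimator (all function and variable names below are illustrative, not from the paper):

```python
import numpy as np

def fit_stump(X, y, w):
    """Find the decision stump h(x) = s * sign(x_j > t) minimizing weighted error."""
    best = (np.inf, 0, 0.0, 1)  # (error, feature j, threshold t, sign s)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            pred = np.where(X[:, j] > t, 1, -1)
            for s in (1, -1):
                err = np.sum(w[(s * pred) != y])
                if err < best[0]:
                    best = (err, j, t, s)
    return best

def boost_stumps(X, y, rounds=20):
    """Aggregate stumps with exponential reweighting (AdaBoost-style sketch,
    not the regularized method analyzed in the paper)."""
    n = X.shape[0]
    w = np.full(n, 1.0 / n)
    stumps, alphas = [], []
    for _ in range(rounds):
        err, j, t, s = fit_stump(X, y, w)
        err = max(err, 1e-12)          # guard against log(0) on separable data
        if err >= 0.5:                 # base class no longer beats chance
            break
        alpha = 0.5 * np.log((1 - err) / err)
        pred = s * np.where(X[:, j] > t, 1, -1)
        w *= np.exp(-alpha * y * pred)  # upweight misclassified points
        w /= w.sum()
        stumps.append((j, t, s))
        alphas.append(alpha)
    return stumps, alphas

def predict(stumps, alphas, X):
    """Sign of the weighted vote over the aggregated stumps."""
    agg = np.zeros(X.shape[0])
    for (j, t, s), a in zip(stumps, alphas):
        agg += a * s * np.where(X[:, j] > t, 1, -1)
    return np.where(agg >= 0, 1, -1)
```

Because each stump thresholds a single coordinate, the aggregate is a sum of one-dimensional functions, which is why such combinations fit additive models well, consistent with the abstract's remark on high-dimensional logistic additive models.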
Pages: 861-894
Page count: 34