Boosting the margin: A new explanation for the effectiveness of voting methods

被引:9
作者
Schapire, RE
Freund, Y
Bartlett, P
Lee, WS
机构
[1] AT&T Bell Labs, Florham Pk, NJ 07932 USA
[2] Australian Natl Univ, Dept Syst Engn, RSISE, Canberra, ACT 0200, Australia
[3] Australian Def Force Acad, Univ Coll UNSW, Sch Elect Engn, Canberra, ACT 2600, Australia
关键词
ensemble methods; decision trees; neural networks; bagging; boosting; error-correcting; output coding; Markov chain; Monte Carlo;
D O I
暂无
中图分类号
O21 [概率论与数理统计]; C8 [统计学];
学科分类号
020208 ; 070103 ; 0714 ;
摘要
One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero. In this paper, we show that this phenomenon is related to the distribution of margins of the training examples with respect to the generated voting classification rule, where the margin of an example is simply the difference between the number of correct votes and the maximum number of votes received by any incorrect label. We show that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error. We also show theoretically and experimentally that boosting is especially effective at increasing the margins of the training examples. Finally, we compare our explanation to those based on the bias-variance decomposition.
引用
收藏
页码:1651 / 1686
页数:36
相关论文
共 42 条
[11]  
CORTES C, 1995, MACH LEARN, V20, P273, DOI 10.1023/A:1022627411411
[12]   BOUNDS FOR THE UNIFORM DEVIATION OF EMPIRICAL MEASURES [J].
DEVROYE, L .
JOURNAL OF MULTIVARIATE ANALYSIS, 1982, 12 (01) :72-79
[13]  
Dietterich T. G., 1995, Journal of Artificial Intelligence Research, V2, P263
[14]  
DIETTERICH TG, 1998, UNPUB EXPT COMP 3 ME
[15]  
Donahue MJ, 1997, CONSTR APPROX, V13, P187
[16]  
Drucker H, 1996, ADV NEUR IN, V8, P479
[17]  
DRUCKER H, 1997, MACHINE LEARNING INT, P107
[18]  
Freund Y., 1996, Machine Learning. Proceedings of the Thirteenth International Conference (ICML '96), P148
[19]   BOOSTING A WEAK LEARNING ALGORITHM BY MAJORITY [J].
FREUND, Y .
INFORMATION AND COMPUTATION, 1995, 121 (02) :256-285
[20]  
Freund Y., 1996, Proceedings of the Ninth Annual Conference on Computational Learning Theory, P325, DOI 10.1145/238061.238163