Improved boosting algorithms using confidence-rated predictions

Cited by: 2051
Authors
Schapire, RE [1]
Singer, Y [1]
Affiliations
[1] AT&T Labs Res, Shannon Lab, Florham Pk, NJ 07932 USA
Keywords
boosting algorithms; multiclass classification; output coding; decision trees
DOI
10.1023/A:1007614523901
CLC number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a simplified analysis of AdaBoost in this setting, and we show how this analysis can be used to find improved parameter settings as well as a refined criterion for training weak hypotheses. We give a specific method for assigning confidences to the predictions of decision trees, a method closely related to one used by Quinlan. This method also suggests a technique for growing decision trees which turns out to be identical to one proposed by Kearns and Mansour. We focus next on how to apply the new boosting algorithms to multiclass classification problems, particularly to the multi-label case in which each example may belong to more than one class. We give two boosting methods for this problem, plus a third method based on output coding. One of these leads to a new method for handling the single-label case which is simpler but as effective as techniques suggested by Freund and Schapire. Finally, we give some experimental results comparing a few of the algorithms discussed in this paper.
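The core idea the abstract describes, weak hypotheses that output real-valued confidences rather than hard labels, can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the authors' code): threshold stumps whose two outputs are the smoothed confidence values (1/2)·ln(W₊/W₋), chosen each round to minimize the normalization factor Z, with the standard exponential weight update. Function names and the ε-smoothing constant are this sketch's own choices.

```python
import math

def train_confidence_rated_adaboost(X, y, n_rounds=10, eps=1e-6):
    """Boosting with confidence-rated threshold stumps.

    X: list of feature vectors; y: labels in {-1, +1}.
    Each round picks the stump minimizing the normalizer Z and
    reweights examples by D(i) * exp(-y_i * h(x_i)) / Z.
    """
    m = len(X)
    D = [1.0 / m] * m
    stumps = []
    for _ in range(n_rounds):
        best = None
        for f in range(len(X[0])):
            for thr in sorted({x[f] for x in X}):
                # w[side][0] = weight of positives, w[side][1] = negatives,
                # smoothed by eps so the log-ratio stays finite.
                w = [[eps, eps], [eps, eps]]
                for xi, yi, di in zip(X, y, D):
                    side = 0 if xi[f] <= thr else 1
                    w[side][0 if yi > 0 else 1] += di
                # Confidence-rated outputs for the two blocks of the partition.
                c0 = 0.5 * math.log(w[0][0] / w[0][1])
                c1 = 0.5 * math.log(w[1][0] / w[1][1])
                # Z is the quantity the weak learner should minimize.
                Z = sum(di * math.exp(-yi * (c0 if xi[f] <= thr else c1))
                        for xi, yi, di in zip(X, y, D))
                if best is None or Z < best[0]:
                    best = (Z, f, thr, c0, c1)
        Z, f, thr, c0, c1 = best
        stumps.append((f, thr, c0, c1))
        D = [di * math.exp(-yi * (c0 if xi[f] <= thr else c1)) / Z
             for xi, yi, di in zip(X, y, D)]
    return stumps

def predict(stumps, x):
    # Final hypothesis: sign of the summed confidences.
    s = sum(c0 if x[f] <= thr else c1 for f, thr, c0, c1 in stumps)
    return 1 if s >= 0 else -1
```

On a toy 1-D dataset this behaves as expected: `train_confidence_rated_adaboost([[0.], [1.], [2.], [3.]], [1, 1, -1, -1])` finds the split at 1 and classifies both sides correctly.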
Pages: 297-336
Page count: 40