Trimmed bagging

Cited by: 40
Authors
Croux, Christophe
Joossens, Kristel
Lemmens, Aurelie
Affiliations
[1] Katholieke Univ Leuven, Ctr Univ Stat, Fac Econ & Management, B-3000 Louvain, Belgium
[2] Erasmus Univ, Sch Econ, NL-3000 DR Rotterdam, Netherlands
Keywords
aggregation; bagging; decision trees; error rate; support vector machine; trimmed means
DOI
10.1016/j.csda.2007.06.012
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Bagging has been found to be successful in increasing the predictive performance of unstable classifiers. Bagging draws bootstrap samples from the training sample, applies the classifier to each bootstrap sample, and then averages over all obtained classification rules. The idea of trimmed bagging is to exclude the bootstrapped classification rules that yield the highest error rates, as estimated by the out-of-bag error rate, and to aggregate over the remaining ones. In this note we explore the potential benefits of trimmed bagging. On the basis of numerical experiments, we conclude that trimmed bagging performs comparably to standard bagging when applied to unstable classifiers such as decision trees, but yields better results when applied to more stable base classifiers, like support vector machines. (c) 2007 Elsevier B.V. All rights reserved.
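The procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names (`fit_stump`, `trimmed_bagging`), the one-split decision stump used as base classifier, the binary 0/1 labels, the majority vote used as the aggregation step, and the default trimming fraction of 0.25 are all assumptions introduced here for the example.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split decision stump on 0/1 labels (hypothetical base learner)."""
    best = None
    for j in range(len(X[0])):                      # every feature
        for t in sorted({row[j] for row in X}):     # every observed threshold
            for left, right in ((0, 1), (1, 0)):    # both label orientations
                errors = sum((left if row[j] <= t else right) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    _, j, t, left, right = best
    return lambda row: left if row[j] <= t else right


def trimmed_bagging(X, y, fit=fit_stump, n_bags=50, trim=0.25, seed=0):
    """Bagging that discards the rules with the highest out-of-bag error.

    `trim` is the fraction of bootstrapped rules to exclude (assumed value).
    Returns the aggregated predictor and the list of rules that were kept.
    """
    rng = random.Random(seed)
    n = len(X)
    scored = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        oob = sorted(set(range(n)) - set(idx))       # out-of-bag observations
        if not oob:                                  # no OOB estimate possible
            continue
        rule = fit([X[i] for i in idx], [y[i] for i in idx])
        oob_error = sum(rule(X[i]) != y[i] for i in oob) / len(oob)
        scored.append((oob_error, rule))

    # Sort by OOB error and drop the worst `trim` fraction of the rules.
    scored.sort(key=lambda pair: pair[0])
    n_keep = max(1, len(scored) - int(trim * len(scored)))
    kept = [rule for _, rule in scored[:n_keep]]

    def predict(row):
        # Aggregate the remaining rules by majority vote (an assumption).
        votes = Counter(rule(row) for rule in kept)
        return votes.most_common(1)[0][0]

    return predict, kept


# Toy usage: one feature, class 0 for small values and class 1 for large ones.
X_demo = [[float(i)] for i in range(20)]
y_demo = [0] * 10 + [1] * 10
predict_demo, kept_demo = trimmed_bagging(X_demo, y_demo)
```

Setting `trim=0` in this sketch reduces it to ordinary bagging with majority voting, which makes it easy to compare the two on the same bootstrap samples by reusing the same `seed`.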
Pages: 362-368
Number of pages: 7