An introduction to the predictive technique AdaBoost with a comparison to generalized additive models

Cited by: 29
Authors
Kawakita, M [1]
Minami, M
Eguchi, S
Lennert-Cody, CE
Affiliations
[1] Grad Univ Adv Studies, Dept Stat Sci, Tokyo 1068569, Japan
[2] Inst Stat Math, Tokyo 1068569, Japan
[3] Interamer Trop Tuna Commiss, La Jolla, CA 92037 USA
Keywords
classification; boosting; decision stump; AsymBoost; logistic regression; shark bycatch
DOI
10.1016/j.fishres.2005.07.011
Chinese Library Classification
S9 [Aquaculture and Fisheries]
Discipline Code
0908
Abstract
The recently developed statistical learning method of boosting is introduced for use with fisheries data. Boosting is a predictive classification technique that has been shown to perform well with problematic data. The boosting algorithms AdaBoost and AsymBoost, used with decision stumps, are described in detail, and their use is demonstrated with shark bycatch data from the eastern Pacific Ocean tuna purse-seine fishery. In addition, results from AdaBoost are compared to those obtained from generalized additive models (GAMs). Compared to the logistic GAM, the prediction performance of AdaBoost was more stable, even with correlated predictors. Standard deviations of the test error were often considerably smaller for AdaBoost than for the logistic GAM. AdaBoost score plots, graphical displays of the contribution of each predictor to the discriminant function, were also more stable than the score plots of the logistic GAM, particularly in regions of sparse data. AsymBoost, a variant of AdaBoost developed for binary classification of a skewed response variable, was shown to be effective at reducing the false-negative ratio without substantially increasing the overall test error. Boosting shows promise for applications to fisheries data, both as a predictive technique and as a tool for exploratory data analysis. (c) 2005 Elsevier B.V. All rights reserved.
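The core technique described in the abstract, AdaBoost with decision stumps, can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: the function names, the exhaustive stump search, and the training loop structure are all assumptions for the sake of the example. (AsymBoost, the skewed-response variant mentioned above, differs mainly in applying an asymmetric factor to the weight update; it is not shown here.)

```python
import numpy as np

def adaboost_stumps(X, y, n_rounds=50):
    """AdaBoost with decision stumps; labels y must be in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                  # uniform example weights to start
    stumps, alphas = [], []
    for _ in range(n_rounds):
        best, best_err = None, np.inf
        # Exhaustive search for the best stump: (feature, threshold, sign).
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, j] <= thr, 1, -1)
                    err = np.sum(w[pred != y])   # weighted training error
                    if err < best_err:
                        best_err, best = err, (j, thr, sign)
        eps = max(best_err, 1e-10)               # clamp to avoid log(0)
        alpha = 0.5 * np.log((1 - eps) / eps)    # weight of this weak learner
        j, thr, sign = best
        pred = sign * np.where(X[:, j] <= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)           # up-weight misclassified examples
        w /= w.sum()
        stumps.append(best)
        alphas.append(alpha)
    return stumps, alphas

def predict(stumps, alphas, X):
    """Sign of the weighted vote of all stumps (the discriminant function F)."""
    F = np.zeros(X.shape[0])
    for (j, thr, sign), a in zip(stumps, alphas):
        F += a * sign * np.where(X[:, j] <= thr, 1, -1)
    return np.sign(F)
```

The accumulated score F is what the abstract's "score plots" display per predictor: each stump's contribution a * h(x) depends on a single feature, so the contributions can be grouped by feature and plotted against it.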
Pages: 328-343
Page count: 16