Boosting algorithms: Regularization, prediction and model fitting

Cited by: 690
Authors
Buehlmann, Peter [1]
Hothorn, Torsten [2]
Affiliations
[1] ETH, CH-8092 Zurich, Switzerland
[2] Univ Munich, Inst Stat, D-80539 Munich, Germany
Keywords
generalized linear models; generalized additive models; gradient boosting; survival analysis; variable selection; software; ADDITIVE LOGISTIC-REGRESSION; VARIABLE SELECTION; STATISTICAL VIEW; MONOTONIC REGRESSION; TUMOR CLASSIFICATION; GRADIENT DESCENT; CONSISTENCY; CONVERGENCE; MARGIN;
DOI
10.1214/07-STS242
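Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714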
Abstract
We present a statistical perspective on boosting. Special emphasis is given to estimating potentially complex parametric or nonparametric models, including generalized linear and additive models as well as regression models for survival analysis. Concepts of degrees of freedom and corresponding Akaike or Bayesian information criteria, particularly useful for regularization and variable selection in high-dimensional covariate spaces, are discussed as well. The practical aspects of boosting procedures for fitting statistical models are illustrated by means of the dedicated open-source software package mboost. This package implements functions which can be used for model fitting, prediction and variable selection. It is flexible, allowing for the implementation of new boosting algorithms optimizing user-specified loss functions.
Pages: 477-505
Number of pages: 29