Greedy function approximation: A gradient boosting machine

Cited by: 17262
Authors
Friedman, JH [1]
Affiliations
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Keywords
function estimation; boosting; decision trees; robust nonparametric regression
DOI
10.1214/aos/1013203451
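Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract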
Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient descent "boosting" paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are regression trees, and tools for interpreting such "TreeBoost" models are presented. Gradient boosting of regression trees produces competitive, highly robust, interpretable procedures for both regression and classification, especially appropriate for mining less than clean data. Connections between this approach and the boosting methods of Freund and Schapire and Friedman, Hastie and Tibshirani are discussed.
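The stagewise idea summarized in the abstract can be illustrated with a minimal sketch for the least-squares case, where the negative gradient of the loss with respect to the current fit is simply the residual. This is not the paper's reference algorithm: the one-split "stump" base learner, the function names `fit_stump` and `gradient_boost`, and the `n_stages` and `learning_rate` parameters are assumptions chosen for illustration, and the code handles only a single numeric predictor with at least two distinct values.

```python
from statistics import mean


def fit_stump(xs, targets):
    """Fit a one-split regression stump to targets by least squares.

    Hypothetical helper: a stand-in for a full regression tree.
    Assumes xs contains at least two distinct values.
    """
    best = None
    # Candidate thresholds exclude the largest x so the right side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((t - left_mean) ** 2 for t in left)
        sse += sum((t - right_mean) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_stages=50, learning_rate=0.1):
    """Stagewise additive fit under squared-error loss (illustrative sketch)."""
    f0 = mean(ys)  # initial constant fit minimizing squared error
    stumps = []
    preds = [f0] * len(xs)
    for _ in range(n_stages):
        # For loss 0.5 * (y - F)^2, the negative gradient w.r.t. F is y - F.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Take a shrunken step along the fitted descent direction.
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]

    def predict(x):
        return f0 + learning_rate * sum(s(x) for s in stumps)

    return predict


# Example usage on a noiseless step function.
xs = [float(i) for i in range(10)]
ys = [0.0 if x < 5 else 1.0 for x in xs]
model = gradient_boost(xs, ys)
```

Swapping the residual line for the negative gradient of another loss (for example, the sign of the residual for least absolute deviation) would give the corresponding variant, although a faithful version would also adjust how each stage's fitted values are chosen.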
Pages: 1189 - 1232
Page count: 44
Related papers
22 in total
  • [1] [Anonymous], 1983, CLASSIFICATION REGRE
  • [2] [Anonymous], 1996, Journal of Computational and Graphical Statistics, DOI 10.2307/1390777
  • [3] Prediction games and arcing algorithms
    Breiman, L
    [J]. NEURAL COMPUTATION, 1999, 11 (07) : 1493 - 1517
  • [4] Breiman L., 1997, PASTING BITES TOGETH
  • [5] COPAS JB, 1983, J R STAT SOC B, V45, P311
  • [6] Donoho D. L., 1993, Different, P173, DOI 10.1090/psapm/047
  • [7] Drucker H., 1997, ICML 97, P107
  • [8] Duffy N, 1999, LECT NOTES ARTIF INT, V1572, P18
  • [9] Freund Y., 1996, Machine Learning. Proceedings of the Thirteenth International Conference (ICML '96), P148
  • [10] Additive logistic regression: A statistical view of boosting - Rejoinder
    Friedman, J
    Hastie, T
    Tibshirani, R
    [J]. ANNALS OF STATISTICS, 2000, 28 (02) : 400 - 407