Extremely randomized trees

Cited by: 5123
Authors
Geurts, P [1 ]
Ernst, D [1 ]
Wehenkel, L [1 ]
Affiliations
[1] Univ Liege, Dept Elect Engn & Comp Sci, B-4000 Liege, Belgium
Keywords
supervised learning; decision and regression trees; ensemble methods; cut-point randomization; bias/variance tradeoff; kernel-based models;
DOI
10.1007/s10994-006-6226-1
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of strongly randomizing both attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the default choice of this parameter, and we also provide insight on how to adjust it in particular situations. Besides accuracy, the main strength of the resulting algorithm is computational efficiency. A bias/variance analysis of the Extra-Trees algorithm is also provided as well as a geometrical and a kernel characterization of the models induced.
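The split rule summarized in the abstract can be sketched in plain Python. This is a minimal illustration, not the paper's reference implementation: the function name `extra_trees_split`, the use of Gini impurity as the score, and the toy data are assumptions made for the example.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def extra_trees_split(X, y, k, rng=random):
    """Sketch of the randomized node-splitting idea from the abstract:
    among K randomly selected attributes, draw ONE cut-point uniformly at
    random per attribute (instead of optimizing it), then keep the
    candidate with the largest impurity decrease. With K = 1 the split is
    fully random, matching the 'totally randomized trees' extreme case."""
    n = len(y)
    attrs = rng.sample(range(len(X[0])), k)
    best = None
    for a in attrs:
        vals = [row[a] for row in X]
        lo, hi = min(vals), max(vals)
        if lo == hi:  # constant attribute in this node: no valid cut-point
            continue
        cut = rng.uniform(lo, hi)
        left = [yi for row, yi in zip(X, y) if row[a] < cut]
        right = [yi for row, yi in zip(X, y) if row[a] >= cut]
        gain = (gini(y)
                - (len(left) / n) * gini(left)
                - (len(right) / n) * gini(right))
        if best is None or gain > best[0]:
            best = (gain, a, cut)
    return best  # (impurity decrease, attribute index, cut-point) or None

# Tiny usage example on a toy two-class dataset
random.seed(0)
X = [[0.1, 5.0], [0.2, 4.8], [0.9, 5.1], [1.0, 4.9]]
y = [0, 0, 1, 1]
print(extra_trees_split(X, y, k=2))
```

Because cut-points are drawn rather than searched, each candidate split costs a single pass over the node, which is the source of the computational efficiency the abstract highlights.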
Pages: 3-42
Page count: 40