Random forests and adaptive nearest neighbors

被引:304
作者
Lin, Yi [1 ]
Jeon, Yongho [1 ]
机构
[1] Univ Wisconsin, Dept Stat, Madison, WI 53706 USA
基金
美国国家科学基金会;
关键词
adaptive estimation; boosting; classification trees; randomized trees; regression trees;
D O I
10.1198/016214505000001230
中图分类号
O21 [概率论与数理统计]; C8 [统计学];
学科分类号
020208 ; 070103 ; 0714 ;
摘要
In this article we study random forests through their connection with a new framework of adaptive nearest-neighbor methods. We introduce a concept of potential nearest neighbors (k-PNNs) and show that random forests can be viewed as adaptively weighted k-PNN methods. Various aspects of random forests can be studied from this perspective. We study the effect of terminal node sizes on the prediction accuracy of random forests. We further show that random forests with adaptive splitting schemes assign weights to k-PNNs in a desirable way: for the estimation at a given target point, these random forests assign voting weights to the k-PNNs of the target point according to the local importance of different input variables. We propose a new simple splitting scheme that achieves desirable adaptivity in a straightforward fashion. This simple scheme can be combined with existing algorithms. The resulting algorithm is computationally faster and gives comparable results. Other possible aspects of random forests, such as using linear combinations in splitting, are also discussed. Simulations and real datasets are used to illustrate the results.
引用
收藏
页码:578 / 590
页数:13
相关论文
共 30 条
[1]   Shape quantization and recognition with randomized trees [J].
Amit, Y ;
Geman, D .
NEURAL COMPUTATION, 1997, 9 (07) :1545-1588
[2]  
[Anonymous], 1991, Nearest neighbor pattern classification techniques
[3]   Bagging predictors [J].
Breiman, L .
MACHINE LEARNING, 1996, 24 (02) :123-140
[4]  
BREIMAN L, 1998, 518 U CAL BERK DEP S
[5]  
BREIMAN L, 2000, 577 U CAL BERK DEP S
[6]  
BREIMAN L, 1999, RANDOM FORESTS TECHN
[7]  
Bühlmann P, 2002, ANN STAT, V30, P927
[8]  
Cutler A, 1999, 59999 UT STAT U DEP
[9]  
CUTLER A, 2000, 500100 UT STAT U DEP
[10]  
DEVROYE L, 1996, PROBABILITY THEORY P