RainForest—A Framework for Fast Decision Tree Construction of Large Datasets

Cited by: 7
Authors
Johannes Gehrke
Raghu Ramakrishnan
Venkatesh Ganti
Affiliations
[1] University of Wisconsin-, Department of Computer Sciences
Source
Data Mining and Knowledge Discovery | 2000 / Volume 4
Keywords
data mining; decision trees; classification; scalability
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Classification of large datasets is an important data mining problem. Many classification algorithms have been proposed in the literature, but studies have shown that so far no algorithm uniformly outperforms all others in terms of quality. In this paper, we present RainForest, a unifying framework for classification tree construction that separates the scalability aspects of tree-construction algorithms from the central features that determine the quality of the tree. The generic algorithm is easy to instantiate with specific split selection methods from the literature (including C4.5, CART, CHAID, FACT, ID3 and extensions, SLIQ, SPRINT, and QUEST).
Pages: 127-162
Page count: 35