A Comparison of Decision Tree Pruning Methods

Cited by: 40
Author
Wei Hongning (魏红宁)
Affiliation
[1] President's Office, Southwest Jiaotong University, Chengdu, Sichuan
Keywords
data mining; decision tree; post-pruning; PEP; MEP; REP; CCP
DOI
Not available
Chinese Library Classification
TP311 [Program design, software engineering]
Discipline classification codes
081202; 0835
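Abstract
To help practitioners choose the right pruning method for decision trees, this paper uses theoretical analysis and worked examples to compare in detail the four main pruning methods in current use: reduced error pruning (REP), pessimistic error pruning (PEP), minimum error pruning (MEP), and cost-complexity pruning (CCP). The comparison covers computational complexity, pruning strategy, error estimation, and theoretical foundation. Compared with PEP, MEP produces trees that are both less accurate and larger. REP is one of the simplest pruning methods, but it requires a separate pruning set. At the same accuracy, CCP produces smaller trees than REP. When training data are plentiful, REP is a reasonable choice; when training data are scarce and high accuracy after pruning is required, PEP is the better option.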
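The abstract's remark that REP is simple yet depends on a separate pruning set can be illustrated with a minimal Python sketch. The `Node` class, the helper names, and the toy data below are assumptions made for illustration and do not come from the paper. The sketch also assumes the common convention that a subtree is replaced by a leaf whenever the leaf makes no more errors on the pruning set than the subtree does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Majority class of the training examples that reached this node.
    label: str
    # Internal nodes test x[feature] <= threshold; leaves have feature None.
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.feature is None


def predict(node: Node, x) -> str:
    # Walk down the tree until a leaf is reached.
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def errors(node: Node, data) -> int:
    # Number of misclassified (x, y) pairs in data.
    return sum(predict(node, x) != y for x, y in data)


def reduced_error_prune(node: Node, pruning_set) -> Node:
    """Bottom-up REP using only the pruning-set examples that reach each node."""
    if node.is_leaf():
        return node
    left_data = [(x, y) for x, y in pruning_set if x[node.feature] <= node.threshold]
    right_data = [(x, y) for x, y in pruning_set if x[node.feature] > node.threshold]
    node.left = reduced_error_prune(node.left, left_data)
    node.right = reduced_error_prune(node.right, right_data)
    subtree_errors = errors(node, pruning_set)
    leaf_errors = sum(y != node.label for _, y in pruning_set)
    # Assumed tie rule: prune when the leaf is at least as good as the subtree.
    # A node reached by no pruning examples is therefore pruned as well.
    if leaf_errors <= subtree_errors:
        return Node(label=node.label)
    return node


# Toy tree whose right subtree was (hypothetically) fitted to noisy training data.
tree = Node(
    label="a", feature=0, threshold=0.5,
    left=Node(label="a"),
    right=Node(label="b", feature=1, threshold=0.5,
               left=Node(label="b"), right=Node(label="a")),
)

# Independent pruning set, never used to grow the tree.
pruning_set = [([0, 0], "a"), ([0, 1], "a"), ([1, 0], "b"), ([1, 1], "b"), ([1, 1], "b")]

pruned = reduced_error_prune(tree, pruning_set)
```

In this toy case the right subtree misclassifies two pruning examples while a single leaf labelled "b" misclassifies none, so the subtree is collapsed. The root is kept because replacing it with a leaf would add errors. Each node is visited once and sees only the pruning examples that reach it, which is why the method is cheap but useless without held-out data.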
Pages: 44-48
Page count: 5
Related papers
7 in total
[1] The effects of training set sizes on decision tree. Oates T, Jensen D. Proc of the 14th Int'l Conf on Machine Learning. 1997
[2] Simplifying decision trees: a survey. Breslow L A, Aha D W. The Knowledge Engineering Review. 1997
[3] Choosing the best pruned decision tree: a matter of bias. Malerba D, Semeraro G, Esposito F. Proc 5th Italian Workshop on Machine Learning. 1994
[4] On estimating probabilities in tree pruning. Cestnik B, Bratko I. Proc of European Working Sessions on Learning. 1991
[5] An analysis of reduced error pruning. Elomaa T, Kaariainen M. Journal of Artificial Intelligence Research. 2001
[6] Simplifying decision trees. Quinlan J R. International Journal of Man Machine Studies. 1987
[7] Analysis of a complexity based pruning scheme for classification trees. Nobel A. IEEE Transactions on Information Theory. 2002