Building multi-way decision trees with numerical attributes

Cited by: 29
Authors
Berzal, F [1]
Cubero, JC [1]
Marín, N [1]
Sánchez, D [1]
Affiliation
[1] Univ Granada, Dept Comp Sci & Artificial Intelligence, E-18071 Granada, Spain
Keywords
supervised learning; classification; decision trees; numerical attributes; hierarchical clustering
DOI
10.1016/j.ins.2003.09.018
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Decision trees are probably the most popular and commonly used classification model. They are built recursively, following a top-down approach (from general concepts to particular examples), by repeated splits of the training dataset. When this dataset contains numerical attributes, binary splits are usually performed by choosing the threshold value that minimizes the impurity measure used as the splitting criterion (e.g. the C4.5 gain ratio criterion or the CART Gini index). In this paper we propose the use of multi-way splits for continuous attributes in order to reduce tree complexity without decreasing classification accuracy. This can be done by intertwining a hierarchical clustering algorithm with the usual greedy decision tree learning. © 2003 Elsevier Inc. All rights reserved.
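The conventional binary split mentioned in the abstract can be illustrated with a minimal Python sketch. It covers only the baseline threshold search on one numerical attribute, not the multi-way, clustering-based method the paper proposes. The function names `gini` and `best_binary_split` are hypothetical, and the use of midpoints between consecutive distinct values as candidate thresholds is an assumption made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'.

    Candidate thresholds are midpoints between consecutive distinct sorted
    values (an illustrative assumption). Returns (None, inf) when the
    attribute has fewer than two distinct values.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate equal values
        threshold = (lo + hi) / 2.0
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Impurity of the split: size-weighted average of the two branches.
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best
```

A call such as `best_binary_split([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], ["a", "a", "a", "b", "b", "b"])` selects the threshold that separates the two classes. This version recomputes class counts at every candidate for readability; a practical implementation would update them incrementally while scanning the sorted values.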
Pages: 73-90
Page count: 18