On the quest for easy-to-understand splitting rules

Cited by: 17

Authors:

Berzal, F [1]

Cubero, JC

Cuenca, F

Martín-Bautista, MJ

Affiliations:

[1] Univ Granada, ETS Ingn Informat, Dept Comp Sci & Artificial Intelligence, E-18071 Granada, Spain

[2] Xfera, Madrid, Spain

Source:

DATA & KNOWLEDGE ENGINEERING | 2003 / Vol. 44 / Issue 01

Keywords:

supervised learning; classification; decision trees; splitting rules

DOI:

10.1016/S0169-023X(02)00062-9

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Decision trees are probably the most popular and commonly used classification model. They are built recursively following a top-down approach (from general concepts to particular examples) by repeated splits of the training dataset. The chosen splitting criterion may affect the accuracy of the classifier, but not significantly. In fact, none of the proposed splitting criteria in the literature has proved to be universally better than the rest. Although they all yield similar results, their complexity varies significantly, and they are not always suitable for multi-way decision trees. Here we propose two new splitting rules which obtain similar results to other well-known criteria when used to build multi-way decision trees, while their simplicity makes them ideal for non-expert users. (C) 2002 Elsevier Science B.V. All rights reserved.
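The top-down, recursive construction described in the abstract can be illustrated with a minimal sketch. It assumes categorical attributes and multi-way splits (one branch per attribute value), and it uses Gini impurity purely as a stand-in criterion: the two splitting rules proposed in the paper are not specified in this record and are not reproduced here. All names (`gini`, `partition`, `build_tree`, `classify`) and the toy dataset are hypothetical.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Gini impurity of a list of class labels (stand-in criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(rows, attr):
    """Multi-way split: one branch per distinct value of a categorical attribute."""
    branches = defaultdict(list)
    for features, label in rows:
        branches[features[attr]].append((features, label))
    return branches


def weighted_impurity(rows, attr, impurity=gini):
    """Impurity of each branch, weighted by the fraction of rows it receives."""
    branches = partition(rows, attr).values()
    return sum(len(b) / len(rows) * impurity([lbl for _, lbl in b]) for b in branches)


def build_tree(rows, attrs, impurity=gini):
    """Recursively split the training rows, top-down, until a node is pure
    or no attributes remain. Leaves are class labels; inner nodes are dicts."""
    labels = [lbl for _, lbl in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    # The splitting criterion is pluggable: pick the attribute minimising it.
    best = min(attrs, key=lambda a: weighted_impurity(rows, a, impurity))
    remaining = [a for a in attrs if a != best]
    children = {
        value: build_tree(branch, remaining, impurity)
        for value, branch in partition(rows, best).items()
    }
    return {"attr": best, "children": children, "default": majority}


def classify(tree, features):
    """Follow branches until a leaf; unseen values fall back to the node majority."""
    while isinstance(tree, dict):
        tree = tree["children"].get(features[tree["attr"]], tree["default"])
    return tree


# Hypothetical toy dataset: (features, class label) pairs.
rows = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]
tree = build_tree(rows, ["outlook", "windy"])
```

Because the criterion is passed in as the `impurity` argument, a different splitting rule can be swapped in without touching the recursive construction, which is the setting in which the abstract compares criteria.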


Pages: 31-48

Page count: 18
