ON THE HANDLING OF CONTINUOUS-VALUED ATTRIBUTES IN DECISION TREE GENERATION

Cited by: 261
Authors
FAYYAD, UM [1]
IRANI, KB [1]
Affiliation
[1] UNIV MICHIGAN,DEPT ELECT ENGN & COMP SCI,ARTIFICIAL INTELLIGENCE LAB,ANN ARBOR,MI 48109
Keywords
INDUCTION; EMPIRICAL CONCEPT LEARNING; DECISION TREES; INFORMATION ENTROPY MINIMIZATION; DISCRETIZATION; CLASSIFICATION;
DOI
10.1007/BF00994007
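Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 [Pattern Recognition and Intelligent Systems]; 0812 [Computer Science and Technology]; 0835 [Software Engineering]; 1405 [Intelligent Science and Technology];
Abstract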
We present a result applicable to classification learning algorithms that generate decision trees or rules using the information entropy minimization heuristic for discretizing continuous-valued attributes. The result serves to give a better understanding of the entropy measure, to point out that the behavior of the information entropy heuristic possesses desirable properties that justify its usage in a formal sense, and to improve the efficiency of evaluating continuous-valued attributes for cut value selection. Along with the formal proof, we present empirical results that demonstrate the theoretically expected reduction in evaluation effort for training data sets from real-world domains.
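The abstract does not spell out the procedure, so the following is only a minimal sketch of entropy-based cut value selection for one continuous attribute. It assumes the efficiency gain comes from evaluating only boundary cuts (midpoints between adjacent distinct values whose examples do not all share one class), which is how this result is commonly summarized. All function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Class information entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def partition_entropy(pairs, cut):
    """Weighted average entropy of the two subsets induced by `cut`."""
    left = [y for x, y in pairs if x <= cut]
    right = [y for x, y in pairs if x > cut]
    n = len(pairs)
    return len(left) / n * entropy(left) + len(right) / n * entropy(right)


def candidate_cuts(pairs, boundary_only=False):
    """Midpoints between adjacent distinct attribute values.

    With boundary_only=True, keep a midpoint only if the examples at the
    two neighbouring values span more than one class (assumed reading of
    the boundary-point restriction).
    """
    classes_at = defaultdict(set)
    for x, y in pairs:
        classes_at[x].add(y)
    values = sorted(classes_at)
    cuts = []
    for lo, hi in zip(values, values[1:]):
        if boundary_only and len(classes_at[lo] | classes_at[hi]) < 2:
            continue
        cuts.append((lo + hi) / 2)
    return cuts


def best_cut(pairs, boundary_only=False):
    """Cut value with minimum partition entropy among the candidates."""
    cuts = candidate_cuts(pairs, boundary_only)
    return min(cuts, key=lambda t: partition_entropy(pairs, t))


# Toy data: (attribute value, class label) pairs, invented for illustration.
data = [(1.0, "a"), (2.0, "a"), (3.0, "a"), (4.0, "b"),
        (5.0, "b"), (6.0, "a"), (7.0, "a")]
print(len(candidate_cuts(data)), len(candidate_cuts(data, boundary_only=True)))
print(best_cut(data), best_cut(data, boundary_only=True))
```

On this toy data the exhaustive search scores six candidate cuts and the boundary-only search scores two, and both select the same cut value. The saving in evaluation effort grows with the length of same-class runs in the sorted attribute values.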
Pages: 87-102
Page count: 16