CODING DECISION TREES

被引:86
作者
WALLACE, CS [1 ]
PATRICK, JD [1 ]
机构
[1] DEAKIN UNIV,GEELONG,VIC 3217,AUSTRALIA
关键词
DECISION TREES; SUPERVISED LEARNING; MINIMUM MESSAGE LENGTH; MINIMUM DESCRIPTION LENGTH; INFORMATION THEORY;
D O I
10.1023/A:1022646101185
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Quinlan and Rivest have suggested a decision-tree inference method using the Minimum Description Length idea. We show that there is an error in their derivation of message lengths, which fortunately has no effect on the final inference. We further suggest two improvements to their coding techniques, one removing an inefficiency in the description of non-binary trees, and one improving the coding of leaves. We argue that these improvements are superior to similarly motivated proposals in the original paper. Empirical tests confirm the good results reported by Quintan and Rivest, and show our coding proposals to lead to useful improvements in the performance of the method.
引用
收藏
页码:7 / 22
页数:16
相关论文
共 10 条
[1]   MINIMUM COMPLEXITY DENSITY-ESTIMATION [J].
BARRON, AR ;
COVER, TM .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1991, 37 (04) :1034-1054
[2]  
GEORGEFF MP, 1984, 6TH P EUR C ART INT
[3]  
HAMMING R, 1980, CODING INFORMATION T
[4]  
Quinlan J. R., 1986, Machine Learning, V1, P81, DOI 10.1023/A:1022643204877
[5]  
QUINLAN JR, 1989, INFORM COMPUT, V80, P227, DOI 10.1016/0890-5401(89)90010-2
[6]   UNIVERSAL MODELING AND CODING [J].
RISSANEN, J .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1981, 27 (01) :12-23
[7]   A UNIVERSAL PRIOR FOR INTEGERS AND ESTIMATION BY MINIMUM DESCRIPTION LENGTH [J].
RISSANEN, J .
ANNALS OF STATISTICS, 1983, 11 (02) :416-431
[8]  
Shannon Claude E., 1949, MATH THEORY COMMUNIC
[9]   AN INFORMATION MEASURE FOR CLASSIFICATION [J].
WALLACE, CS ;
BOULTON, DM .
COMPUTER JOURNAL, 1968, 11 (02) :185-+
[10]  
WALLACE CS, 1987, J ROY STAT SOC B MET, V49, P240