Building classification trees using the total uncertainty criterion

Cited by: 118
Authors:
Abellán, J [1]
Moral, S [1]
Affiliation:
[1] Univ Granada, ETSI Informat, Dept Ciencias Computac & Inteligencia Artificial, E-18071 Granada, Spain
DOI:
10.1002/int.10143
CLC classification:
TP18 [Artificial intelligence theory];
Discipline codes:
081104; 0812; 0835; 1405;
Abstract:
We present an application of the measure of total uncertainty on convex sets of probability distributions, also called credal sets, to the construction of classification trees. In these classification trees, the probabilities of the classes in each leaf are estimated using the imprecise Dirichlet model, so smaller samples give rise to wider probability intervals. Branching a classification tree can decrease the entropy associated with the classes, but at the same time, as the sample is divided among the branches, the nonspecificity increases. We use a total uncertainty measure (entropy + nonspecificity) as the branching criterion; the stopping rule is to not increase the total uncertainty. The good behavior of this procedure on standard classification problems is shown. It is important to remark that the procedure does not suffer from overfitting, yielding similar results on the training and test samples. (C) 2003 Wiley Periodicals, Inc.
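The branching quantities described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it computes the imprecise Dirichlet model (IDM) probability intervals for class counts, a maximum-entropy distribution inside the IDM credal set (found by "water-filling" the extra mass onto the smallest frequencies), and an additive total uncertainty. The Hartley-style nonspecificity term s/(N+s)·log2(K) and all function names are illustrative assumptions; the paper's exact measure may differ.

```python
import math

def idm_intervals(counts, s=1.0):
    """IDM probability intervals [n_i/(N+s), (n_i+s)/(N+s)]
    for class counts n_i, sample size N, hyperparameter s."""
    N = sum(counts)
    return [(n / (N + s), (n + s) / (N + s)) for n in counts]

def max_entropy_distribution(counts, s=1.0):
    """A maximum-entropy element of the IDM credal set, obtained by
    pouring the free mass s/(N+s) onto the lowest relative frequencies
    until they level off.  Returns sorted probabilities (class identity
    is irrelevant for entropy)."""
    N, K = sum(counts), len(counts)
    p = sorted(n / (N + s) for n in counts)
    budget = s / (N + s)
    i = 0
    while budget > 1e-12 and i + 1 < K:
        # cost of lifting the i+1 lowest entries up to the next level
        cost = (i + 1) * (p[i + 1] - p[i])
        if cost <= budget:
            for j in range(i + 1):
                p[j] = p[i + 1]
            budget -= cost
            i += 1
        else:
            break
    # spread whatever budget remains evenly over the current minima
    raise_by = budget / (i + 1)
    for j in range(i + 1):
        p[j] += raise_by
    return p

def total_uncertainty(counts, s=1.0):
    """Additive total uncertainty: entropy of a max-entropy element
    plus an assumed Hartley-style nonspecificity s/(N+s)*log2(K)."""
    N, K = sum(counts), len(counts)
    p = max_entropy_distribution(counts, s)
    H = -sum(pi * math.log2(pi) for pi in p if pi > 0)
    NS = (s / (N + s)) * math.log2(K)
    return H + NS
```

For counts [3, 1] with s = 1, the intervals are [0.6, 0.8] and [0.2, 0.4]; the free mass 0.2 is poured onto the smaller frequency, giving the distribution (0.4, 0.6). A split would be accepted under the stopping rule only if the weighted total uncertainty of the children does not exceed that of the parent node.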
Pages: 1215-1225
Page count: 11