A formalism for relevance and its application in feature subset selection

Cited by: 125
Authors
Bell, DA [1]
Wang, H [1]
Affiliation
[1] Univ Ulster, Fac Informat, Sch Informat & Software Engn, Newtownabbey BT37 0QB, Northern Ireland
Keywords
machine learning; knowledge discovery; data mining; relevance; feature subset selection
DOI
10.1023/A:1007612503587
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The notion of relevance is used in many technical fields. In machine learning and data mining, for example, relevance is frequently used as a measure in feature subset selection (FSS). In previous studies the interpretation of relevance has varied, and its connection to FSS has been loose. In this paper a rigorous mathematical formalism is proposed for relevance that is quantitative and normalized. To apply the formalism to FSS, a characterization of FSS is proposed: preservation of learning information and minimization of joint entropy. Based on this characterization, a tight connection between relevance and FSS is established: FSS amounts to maximizing both the relevance of the features to the decision attribute and the relevance of the decision attribute to the features. This connection is then used to design an algorithm for FSS that runs in time linear in the number of instances and quadratic in the number of features. The algorithm is evaluated on 23 public datasets, improving prediction accuracy on 16 datasets and losing accuracy on only 1. This provides evidence that both the formalism and its connection to FSS are sound.
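The abstract only outlines the approach, so the following is a minimal sketch of how an entropy-based greedy feature subset selection in this spirit could look, assuming a forward search that maximizes the mutual information between the selected features and the decision attribute while penalizing growth in joint entropy. The names (`select_features`, `entropy`, `joint`), the trade-off weight `gamma`, and the stopping rule are illustrative assumptions, not details taken from the paper.

```python
# Sketch of greedy, entropy-based feature subset selection on discrete data.
# Assumptions (not from the paper): forward selection, score = I(subset; y)
# minus gamma * H(subset), stop when the score no longer improves.
from collections import Counter
from math import log2

def entropy(rows):
    """Shannon entropy (bits) of a sequence of hashable values or tuples."""
    counts = Counter(rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def joint(X, cols):
    """Project each instance onto the given feature columns as a tuple."""
    return [tuple(x[i] for i in cols) for x in X]

def select_features(X, y, gamma=0.0):
    """Greedy forward selection maximizing I(selected; y) - gamma * H(selected).

    X: list of instances (lists of discrete feature values)
    y: list of decision-attribute (class) values
    gamma: assumed trade-off weight for the joint-entropy penalty
    """
    n_features = len(X[0])
    selected, remaining = [], list(range(n_features))
    h_y = entropy(y)
    best_score = float("-inf")
    while remaining:
        # Score each remaining feature when added to the current subset.
        scored = []
        for f in remaining:
            xs = joint(X, selected + [f])
            xy = [x + (c,) for x, c in zip(xs, y)]
            mi = entropy(xs) + h_y - entropy(xy)   # I(subset; y)
            scored.append((mi - gamma * entropy(xs), f))
        score, f = max(scored)
        if score <= best_score:   # no improvement: stop
            break
        best_score = score
        selected.append(f)
        remaining.remove(f)
    return selected

# Toy usage: feature 0 determines the class, feature 1 is noise,
# so the sketch should return [0].
X = [[0, 1], [0, 0], [1, 1], [1, 0]]
y = [0, 0, 1, 1]
print(select_features(X, y))
```

Each pass scores every remaining feature against the current subset, so the search performs a number of entropy evaluations quadratic in the number of features, each linear in the number of instances, consistent with the complexity stated in the abstract.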
Pages: 175-195
Page count: 21