USING MUTUAL INFORMATION FOR SELECTING FEATURES IN SUPERVISED NEURAL-NET LEARNING

Cited by: 1772
Author
BATTITI, R
Affiliation
[1] Dipartimento di Matematica, Università di Trento, 38050, Povo (Trento)
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1994 / Vol. 5 / No. 04
Keywords
FEATURE EXTRACTION; NEURAL NETWORK PRUNING; DIMENSIONALITY REDUCTION; MUTUAL INFORMATION; SUPERVISED LEARNING; ADAPTIVE CLASSIFIERS;
DOI
10.1109/72.298224
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates the application of the mutual information criterion to evaluate a set of candidate features and to select an informative subset to be used as input data for a neural network classifier. Because the mutual information measures arbitrary dependencies between random variables, it is suitable for assessing the "information content" of features in complex classification tasks, where methods based on linear relations (like the correlation) are prone to mistakes. The fact that the mutual information is independent of the coordinates chosen permits a robust estimation. Nonetheless, the use of the mutual information for tasks characterized by high input dimensionality requires suitable approximations because of the prohibitive demands on computation and samples. An algorithm is proposed that is based on a "greedy" selection of the features and that takes both the mutual information with respect to the output class and with respect to the already-selected features into account. Finally, the results of a series of experiments are discussed.
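The greedy selection described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper's text. It assumes discrete-valued features, a simple plug-in (count-based) estimate of mutual information, and a hypothetical penalty weight `beta` that scales the mutual information with the already-selected features. The function names `mutual_information` and `greedy_mi_selection` are illustrative only.

```python
import numpy as np


def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in nats for two discrete 1-D arrays."""
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    # Joint histogram of the two variables, normalized to probabilities.
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0  # skip empty cells, whose contribution is zero
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def greedy_mi_selection(X, y, k, beta=0.5):
    """Greedily pick k columns of X (assumed discrete).

    Each candidate is scored by its mutual information with the class y
    minus beta times its summed mutual information with the features
    already selected. beta is an assumed tuning parameter of this sketch.
    """
    remaining = list(range(X.shape[1]))
    relevance = {j: mutual_information(X[:, j], y) for j in remaining}
    redundancy = {j: 0.0 for j in remaining}
    selected = []
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda j: relevance[j] - beta * redundancy[j])
        selected.append(best)
        remaining.remove(best)
        # Update the redundancy of every remaining candidate.
        for j in remaining:
            redundancy[j] += mutual_information(X[:, j], X[:, best])
    return selected
```

With `beta = 0` the sketch ranks features by class relevance alone, so exact duplicates of an informative feature can all be chosen. A positive `beta` penalizes candidates that repeat information already captured. Continuous features would first need discretization or a different estimator, which this sketch does not cover.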
Pages: 537-550
Page count: 14