Evolution and generalization of a single neurone - II. Complexity of statistical classifiers and sample size considerations

Cited by: 26
Authors
Raudys, S [1]
Affiliation
[1] Inst Math & Informat, LT-2600 Vilnius, Lithuania
Keywords
single-layer perceptron; statistical classification; generalization error; initialization; overtraining; dimensionality; complexity; sample size; scissors effect
DOI
10.1016/S0893-6080(97)00136-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unlike many other investigations on this topic, the present one does not consider the nonlinear SLP as a single special type of classification rule. In SLP training we can obtain seven statistical classifiers of differing complexity: (1) the Euclidean distance classifier; (2) the standard Fisher linear discriminant function (DF); (3) the Fisher linear DF with pseudo-inversion of the covariance matrix; (4) regularized linear discriminant analysis; (5) the generalized Fisher DF; (6) the minimum empirical error classifier; and (7) the maximum margin classifier. A survey of earlier and new results, referring to relationships between the complexity of six classifiers, generalization error, and the number of learning examples, is presented. These relationships depend on the complexities of both the classifier and the data. This knowledge indicates how to control the SLP classifier complexity purposefully by determining optimal values of the targets, the learning-step and its change in the training process, the number of iterations, and the addition or subtraction of a regularization term. A correct initialization of the weights and a simplification of the data structure can help to reduce the generalization error. © 1998 Elsevier Science Ltd. All rights reserved.
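For orientation, the first four classifiers named in the abstract can be written as linear discriminants that differ only in how the pooled sample covariance matrix enters the weight vector. The sketch below uses the standard textbook forms, not formulas taken from the paper, and it does not show how SLP training arrives at them. The function names, the dictionary keys, and the default value of the regularization constant `lam` are assumptions made for illustration.

```python
import numpy as np

def linear_discriminants(X1, X2, lam=0.1):
    """Weight vectors of four sample-based linear classifiers (textbook forms).

    X1, X2 : arrays of shape (n_i, p), training vectors of class 1 and class 2.
    lam    : regularization constant (hypothetical default, not from the paper).
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    diff = m1 - m2
    n1, n2 = len(X1), len(X2)
    # Pooled sample covariance matrix of the two classes
    S = ((X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)) / (n1 + n2 - 2)
    p = S.shape[0]
    return {
        # (1) Euclidean distance classifier: covariance is ignored
        "euclidean": diff,
        # (2) standard Fisher linear DF: requires a nonsingular S
        "fisher": np.linalg.solve(S, diff),
        # (3) Fisher linear DF with pseudo-inversion of S
        "pseudo_fisher": np.linalg.pinv(S) @ diff,
        # (4) regularized linear discriminant analysis: S + lam * I
        "regularized": np.linalg.solve(S + lam * np.eye(p), diff),
    }

def classify(w, m1, m2, x):
    """Assign x to class 1 if w^T (x - (m1 + m2) / 2) > 0, otherwise to class 2."""
    return 1 if w @ (x - 0.5 * (m1 + m2)) > 0 else 2
```

In this form the regularized weight vector approaches the Fisher solution as `lam` tends to zero and becomes proportional to the Euclidean distance classifier as `lam` grows large, which is one simple way to see the classifiers as ordered by complexity.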
Pages: 297-313
Page count: 17