Evolution and generalization of a single neurone - II. Complexity of statistical classifiers and sample size considerations

Cited by: 26
Authors
Raudys, S [1]
Affiliations
[1] Inst Math & Informat, LT-2600 Vilnius, Lithuania
Keywords
single-layer perceptron; statistical classification; generalization error; initialization; overtraining; dimensionality; complexity; sample size; scissors effect
DOI
10.1016/S0893-6080(97)00136-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unlike many other investigations on this topic, the present one does not consider the nonlinear SLP as a single special type of classification rule. In SLP training we can obtain seven statistical classifiers of differing complexity: (1) the Euclidean distance classifier; (2) the standard Fisher linear discriminant function (DF); (3) the Fisher linear DF with pseudo-inversion of the covariance matrix; (4) regularized linear discriminant analysis; (5) the generalized Fisher DF; (6) the minimum empirical error classifier; and (7) the maximum margin classifier. A survey of earlier and new results, referring to relationships between the complexity of six classifiers, generalization error, and the number of learning examples, is presented. These relationships depend on the complexities of both the classifier and the data. This knowledge indicates how to control the SLP classifier complexity purposefully by determining optimal values of the targets, the learning step and its change in the training process, the number of iterations, and the addition or subtraction of a regularization term. A correct initialization of weights and a simplifying data structure can help to reduce the generalization error. © 1998 Elsevier Science Ltd. All rights reserved.
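As a rough illustration of the first four classifiers named in the abstract, the sketch below builds them directly from sample means and a pooled covariance matrix, using textbook plug-in formulas rather than SLP training. It is not taken from the paper: the function names, the `lam` regularization parameter, and the `pseudo_inverse` flag are assumptions introduced only for this example.

```python
import numpy as np


def euclidean_distance_classifier(X1, X2):
    """Return (w, w0) of g(x) = w @ x + w0; g(x) > 0 assigns x to class 1.

    Uses only the two sample means (hypothetical helper, textbook form).
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    w = m1 - m2
    w0 = -0.5 * w @ (m1 + m2)
    return w, w0


def fisher_linear_df(X1, X2, lam=0.0, pseudo_inverse=False):
    """Return (w, w0) of a Fisher-type linear discriminant function.

    lam > 0 adds lam * I to the pooled covariance (a simple ridge-style
    regularization, assumed here as one form of regularized discriminant
    analysis). pseudo_inverse=True uses the Moore-Penrose pseudo-inverse,
    which stays defined when the covariance estimate is singular.
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    n1, n2 = len(X1), len(X2)
    # Pooled sample covariance matrix of the two classes.
    S = ((X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)) / (n1 + n2 - 2)
    S = S + lam * np.eye(S.shape[0])
    S_inv = np.linalg.pinv(S) if pseudo_inverse else np.linalg.inv(S)
    w = S_inv @ (m1 - m2)
    w0 = -0.5 * w @ (m1 + m2)
    return w, w0
```

With a very large `lam` the regularized weight vector points in almost the same direction as the Euclidean distance classifier, while `lam = 0` gives the standard Fisher DF, so in this sketch `lam` acts as a single knob that moves between a simpler and a more complex rule.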
Pages: 297-313
Page count: 17
Related papers
56 in total
[11] DEEV AD, 1974, ENG CYBERN, V12, P153
[12] DEEV AD, 1970, REPORTS ACAD SCI USS, V195, P756
[13] ESTES SE, 1965, THESIS STANFORD U ST
[14] GYORGYI G, 1990, NEURAL NETWORKS SPIN, P31
[15] HANSEN, LK. STOCHASTIC LINEAR LEARNING - EXACT TEST AND TRAINING ERROR AVERAGES [J]. NEURAL NETWORKS, 1993, 6 (03): 393-396
[16] Haussler D., 1994, Proceedings of the Seventh Annual ACM Conference on Computational Learning Theory, COLT 94, P76, DOI 10.1145/180139.181018
[17] HUGHES, GF. ON MEAN ACCURACY OF STATISTICAL PATTERN RECOGNIZERS [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 1968, 14 (01): 55-+
[18] Jain AK., 1982, Handbook of Statistics, DOI 10.1016/S0169-7161(82)02042-2
[19] JOHN, S. ERRORS IN DISCRIMINATION [J]. ANNALS OF MATHEMATICAL STATISTICS, 1961, 32: 1125-+
[20] KANAL, L; CHANDRASEKARAN, B. DIMENSIONALITY AND SAMPLE SIZE IN STATISTICAL PATTERN CLASSIFICATION [J]. PATTERN RECOGNITION, 1971, 3 (03): 225-+