Cost functions to estimate a posteriori probabilities in multiclass problems

Cited by: 53
Authors
Cid-Sueiro, J
Arribas, JI
Urbán-Muñoz, S
Figueiras-Vidal, AR
Affiliations
[1] Univ Valladolid, ETSIT, Dept Teor Senal & Comunicac & Ing Telemat, E-47011 Valladolid, Spain
[2] Univ Carlos III Madrid, Dept Tecnol Comunicac, EPS, Leganes Madrid 28911, Spain
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1999 / Vol. 10 / No. 3
Keywords
neural networks; pattern classification; probability estimation
DOI
10.1109/72.761724
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The problem of designing cost functions to estimate a posteriori probabilities in multiclass problems is addressed in this paper. We establish necessary and sufficient conditions that these costs must satisfy in one-class one-output networks whose outputs are consistent with probability laws. We focus our attention on a particular subset of the corresponding cost functions: those which verify two usually interesting properties, symmetry and separability (well-known cost functions, such as the quadratic cost or the cross entropy, are particular cases in this subset). Finally, we present a universal stochastic gradient learning rule for single-layer networks, in the sense of minimizing a general version of these cost functions for a wide family of nonlinear activation functions.
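As a rough illustration of the idea summarized in the abstract, the sketch below trains a single-layer softmax network with the cross entropy cost by stochastic gradient descent and checks that its outputs approach the true a posteriori probabilities. This is not the paper's universal learning rule, only the familiar softmax plus cross entropy special case. The toy posteriors, the one-hot inputs, the decaying step size, and all function names are assumptions made for the example.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax: outputs are positive and sum to one."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(W, x):
    """Single-layer network: one linear unit per class followed by softmax."""
    return softmax([sum(w * xj for w, xj in zip(row, x)) for row in W])


def sgd_step(W, x, label, lr):
    """One stochastic gradient step on the cross entropy cost.

    For softmax outputs y and a one-hot target d, the gradient of the
    cross entropy with respect to weight W[i][j] is (y[i] - d[i]) * x[j].
    """
    y = forward(W, x)
    for i, row in enumerate(W):
        err = (1.0 if i == label else 0.0) - y[i]
        for j in range(len(row)):
            row[j] += lr * err * x[j]


def train(posteriors, steps=20000, lr=0.05, seed=0):
    """Train on samples drawn from known (hypothetical) class posteriors."""
    rng = random.Random(seed)
    n_states = len(posteriors)
    n_classes = len(posteriors[0])
    W = [[0.0] * n_states for _ in range(n_classes)]
    for t in range(steps):
        s = rng.randrange(n_states)  # pick an input state uniformly
        x = [1.0 if j == s else 0.0 for j in range(n_states)]  # one-hot input
        label = rng.choices(range(n_classes), weights=posteriors[s])[0]
        sgd_step(W, x, label, lr / (1.0 + t / 2000.0))  # decaying step size
    return W


# Hypothetical true posteriors P(class | input state) for a 3-class toy problem.
TRUE_POSTERIORS = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]

W = train(TRUE_POSTERIORS)
estimates = [forward(W, [1.0, 0.0]), forward(W, [0.0, 1.0])]
for state, est in enumerate(estimates):
    print(state, [round(p, 3) for p in est])
```

Because the softmax outputs are positive and sum to one, they are consistent with probability laws by construction. Swapping the cross entropy for the quadratic cost changes the update rule but, under the same assumptions, targets the same posteriors.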
Pages: 645-656
Number of pages: 12