New results on error correcting output codes of kernel machines

Cited by: 103
Authors
Passerini, A [1]
Pontil, M
Frasconi, P
Affiliations
[1] Univ Florence, Dept Comp Sci & Syst, Florence, Italy
[2] UCL, Dept Comp Sci, London WC1E, England
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2004 / Vol. 15 / No. 01
Keywords
error correcting output codes (ECOC); machine learning; statistical learning theory; support vector machines;
DOI
10.1109/TNN.2003.820841
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study the problem of multiclass classification within the framework of error correcting output codes (ECOC) using margin-based binary classifiers. Specifically, we address two important open problems in this context: decoding and model selection. The decoding problem concerns how to map the outputs of the classifiers into class codewords. In this paper we introduce a new decoding function that combines the margins through an estimate of their class conditional probabilities. Concerning model selection, we present new theoretical results bounding the leave-one-out (LOO) error of ECOC of kernel machines, which can be used to tune kernel hyperparameters. We report experiments using support vector machines as the base binary classifiers, showing the advantage of the proposed decoding function over other functions of the margin commonly used in practice. Moreover, our empirical evaluations on model selection indicate that the bound leads to good estimates of kernel parameters.
Pages: 45 / 54
Number of pages: 10