Benchmarking least squares support vector machine classifiers

Cited by: 604
Authors
van Gestel, T [1 ]
Suykens, JAK
Baesens, B
Viaene, S
Vanthienen, J
Dedene, G
de Moor, B
Vandewalle, J
Affiliations
[1] Katholieke Univ Leuven, Dept Elect Engn, ESAT SISTA, Louvain, Belgium
[2] Katholieke Univ Leuven, Leuven Inst Res Informat Syst, Louvain, Belgium
Keywords
least squares support vector machines; multiclass support vector machines; sparse approximation;
DOI
10.1023/B:MACH.0000008082.80494.e0
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Numbers
081104; 0812; 0835; 1405;
Abstract
In Support Vector Machines (SVMs), the solution of the classification problem is characterized by a (convex) quadratic programming (QP) problem. In a modified version of SVMs, called Least Squares SVM classifiers (LS-SVMs), a least squares cost function is proposed so as to obtain a linear set of equations in the dual space. While the SVM classifier has a large margin interpretation, the LS-SVM formulation is related in this paper to a ridge regression approach for classification with binary targets and to Fisher's linear discriminant analysis in the feature space. Multiclass categorization problems are represented by a set of binary classifiers using different output coding schemes. While regularization is used to control the effective number of parameters of the LS-SVM classifier, the sparseness property of SVMs is lost due to the choice of the 2-norm. Sparseness can be imposed in a second stage by gradually pruning the support value spectrum and optimizing the hyperparameters during the sparse approximation procedure. In this paper, twenty public domain benchmark datasets are used to evaluate the test set performance of LS-SVM classifiers with linear, polynomial and radial basis function (RBF) kernels. Both the SVM and LS-SVM classifier with RBF kernel in combination with standard cross-validation procedures for hyperparameter selection achieve comparable test set performances. These SVM and LS-SVM performances are consistently very good when compared to a variety of methods described in the literature, including decision tree based algorithms, statistical algorithms and instance based learning methods. We show on ten UCI datasets that the LS-SVM sparse approximation procedure can be successfully applied.
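The abstract's central computational point is that replacing the SVM's hinge-loss QP with a least squares cost turns training into a single linear system in the dual variables. The following is a minimal NumPy sketch of that idea, not the authors' code: the kernel width `sigma`, regularization constant `gamma`, and function names are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, sigma=1.0):
    # Pairwise squared Euclidean distances, mapped through a Gaussian RBF.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    # LS-SVM dual: instead of a QP, solve the linear KKT system
    #   [ 0    y^T          ] [ b     ]   [ 0 ]
    #   [ y    Omega + I/gam] [ alpha ] = [ 1 ]
    # where Omega_kl = y_k * y_l * K(x_k, x_l).
    n = len(y)
    Omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, support values alpha

def lssvm_predict(X_train, y, alpha, b, X_test, sigma=1.0):
    # Decision function: sign( sum_k alpha_k y_k K(x, x_k) + b ).
    K = rbf_kernel(X_test, X_train, sigma)
    return np.sign(K @ (alpha * y) + b)
```

Because every training point gets a nonzero support value `alpha_k`, sparseness is lost, which is why the abstract's second-stage pruning of the support value spectrum is needed to recover a sparse model.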
Pages: 5-32
Page count: 28