A heuristic training for support vector regression

Citations: 76
Authors
Wang, WJ [1]
Xu, ZB [2]
Affiliations
[1] Shanxi Univ, Dept Comp Sci, Taiyuan 030006, Peoples R China
[2] Xian Jiaotong Univ, Fac Sci, Inst Informat & Syst Sci, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
heuristic sparse control; reducing data; regression; similarity measurement; support vector machine;
DOI
10.1016/j.neucom.2003.11.012
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A heuristic method for accelerating support vector machine (SVM) training, based on a measure of similarity among samples, is presented in this paper. Training an SVM requires optimizing a quadratic function under linear constraints. The original formulation of the SVM objective is efficient during the optimization phase, but the resulting discriminant function often contains redundant terms. The economy of an SVM's discriminant function depends on a sparse subset of the training data, namely the support vectors selected by the optimization. The motivation for a sparsity-controlled SVM is therefore practical: it reduces the computational expense of SVM testing and improves the interpretability of the model. Besides the existing approaches, an intuitive way to achieve this is to control support-vector sparsity by reducing the training data without sacrificing generalization performance. The most attractive feature of this idea is that it makes SVM training fast, especially for large training sets, because the size of the optimization problem can be reduced greatly. In this paper, a heuristic rule is used to reduce the training data for support vector regression (SVR). First, all training data are divided into several groups; then, within each group, some training vectors are discarded according to the similarity measure. This reduction is carried out in the original data space before SVM training, so its extra computational cost is small. Even including the preprocessing cost, the total time is still less than that needed to train the SVM on the complete training set. As a result, the number of vectors used for SVR training becomes small and the training time decreases greatly without compromising the generalization capability of the SVM. Simulation results show the effectiveness of the presented method.
(C) 2003 Published by Elsevier B.V.
Pages: 259 - 275
Page count: 17
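The group-then-discard scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact procedure: the grouping rule (splitting along the first feature), the Euclidean distance threshold used as the similarity measure, and the use of scikit-learn's `SVR` for the subsequent training are all illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVR

def reduce_by_similarity(X, y, n_groups=10, threshold=0.02):
    """Divide the data into groups, then within each group greedily keep a
    sample only if its distance to every already-kept sample in that group
    exceeds `threshold`; near-duplicates are discarded."""
    order = np.argsort(X[:, 0])               # simple grouping by the first feature
    groups = np.array_split(order, n_groups)  # several roughly equal groups
    keep = []
    for g in groups:
        kept = []
        for i in g:
            if all(np.linalg.norm(X[i] - X[j]) > threshold for j in kept):
                kept.append(i)
        keep.extend(kept)
    keep = np.array(keep)
    return X[keep], y[keep]

# Toy 1-D regression problem: noisy sinc function
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sinc(X).ravel() + 0.05 * rng.standard_normal(400)

# Reduce the training set first, then train SVR on the smaller set
X_red, y_red = reduce_by_similarity(X, y, n_groups=20, threshold=0.02)
model = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(X_red, y_red)
print(f"{len(X_red)} of {len(X)} samples kept for SVR training")
```

Because the reduction runs in the original data space with simple distance computations, its cost is small relative to solving the quadratic program on the full set, which is the trade-off the paper exploits.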