Support vector machines for classification in nonstandard situations

Cited by: 205
Authors
Lin, Y [1]
Lee, Y [1]
Wahba, G [1]
Affiliations
[1] Univ Wisconsin, Dept Stat, Madison, WI 53706 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA);
Keywords
support vector machine; classification; Bayes rule; GCKL; GACV;
DOI
10.1023/A:1012406528296
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The majority of classification algorithms are developed for the standard situation, in which it is assumed that the examples in the training set come from the same distribution as the target population and that the costs of misclassification are the same for all classes. However, these assumptions are often violated in real-world settings. For some classification methods, this can often be handled simply by changing the threshold; for others, additional effort is required. In this paper, we explain why the standard support vector machine is not suitable for the nonstandard situation, and introduce a simple procedure for adapting the support vector machine methodology to the nonstandard situation. Theoretical justification for the procedure is provided. A simulation study illustrates that the modified support vector machine significantly improves upon the standard support vector machine in the nonstandard situation. The computational load of the proposed procedure is the same as that of the standard support vector machine. The procedure reduces to the standard support vector machine in the standard situation.
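The abstract does not spell out the procedure, so the following is only a minimal sketch of one common way to handle the nonstandard situation: reweighting the hinge loss per class. It assumes that each class weight is the cost of misclassifying that class multiplied by the ratio of its population proportion to its training-sample proportion. The function names and parameters (`class_weights`, `weighted_hinge_loss`, `bayes_threshold`, `cost_fn`, `cost_fp`, `pop_pos`, `sample_pos`) are hypothetical and are not taken from the paper.

```python
def class_weights(cost_fn, cost_fp, pop_pos, sample_pos):
    """Per-class weights for a weighted hinge loss (illustrative assumption).

    cost_fn:    cost of a false negative (a +1 example predicted as -1)
    cost_fp:    cost of a false positive (a -1 example predicted as +1)
    pop_pos:    proportion of the +1 class in the target population
    sample_pos: proportion of the +1 class in the training sample
    """
    pop_neg = 1.0 - pop_pos
    sample_neg = 1.0 - sample_pos
    # Importance weighting: cost times (population share / sample share).
    w_pos = cost_fn * pop_pos / sample_pos
    w_neg = cost_fp * pop_neg / sample_neg
    return w_pos, w_neg


def weighted_hinge_loss(scores, labels, w_pos, w_neg):
    """Average class-weighted hinge loss; labels are +1 or -1."""
    total = 0.0
    for f, y in zip(scores, labels):
        weight = w_pos if y == 1 else w_neg
        total += weight * max(0.0, 1.0 - y * f)
    return total / len(scores)


def bayes_threshold(w_pos, w_neg):
    """Threshold on the training-sample posterior P(y = +1 | x).

    Predicting +1 minimizes the weighted risk when the posterior
    exceeds w_neg / (w_pos + w_neg).
    """
    return w_neg / (w_pos + w_neg)


if __name__ == "__main__":
    # Standard situation: equal costs, sample matches population.
    print(class_weights(1.0, 1.0, 0.5, 0.5))
    # Nonstandard: rare but costly positive class, balanced training sample.
    w_pos, w_neg = class_weights(5.0, 1.0, 0.1, 0.5)
    print(bayes_threshold(w_pos, w_neg))
```

Under these assumptions, equal costs and matching proportions give equal weights and a threshold of 0.5, which is consistent with the abstract's statement that the procedure reduces to the standard support vector machine in the standard situation.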
Pages: 191-202
Number of pages: 12