Research on Imbalanced Dataset Classification Based on Support Vector Machines

Cited by: 14
Authors
杨智明
彭宇
彭喜元
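Affiliation
[1] Department of Automatic Test and Control, Harbin Institute of Technology
Keywords
support vector machine; imbalanced dataset; fuzzy sample set pruning; guided undersampling
DOI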
10.19650/j.cnki.cjsi.2009.05.037
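Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Studies have shown that when the sample set is unevenly distributed, the classification accuracy of support vector machines on minority-class samples drops sharply. To address this problem, this paper proposes a support vector machine classification algorithm based on fuzzy sample set pruning and guided undersampling, and discusses in depth the parameters newly introduced by the algorithm. Algorithm analysis and simulation results show that the proposed method effectively improves overall classification accuracy without increasing computational complexity.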
Pages: 1094-1099
Page count: 6