Cost-based feature selection for Support Vector Machines: An application in credit scoring

Cited by: 118
Authors
Maldonado, Sebastian [1]
Perez, Juan [1]
Bravo, Cristian [2]
Affiliations
[1] Univ Los Andes, Fac Ingn & Ciencias Aplicadas, Monsenor Alvaro del Portillo 12455, Santiago, Chile
[2] Univ Southampton, Dept Decis Analyt & Risk, Southampton Business Sch, Univ Rd, Southampton SO17 1BJ, Hants, England
Keywords
Analytics; Feature selection; Support Vector Machines; Mixed-integer programming; Credit scoring
DOI
10.1016/j.ejor.2017.02.037
Chinese Library Classification
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
In this work we propose two formulations based on Support Vector Machines for simultaneous classification and feature selection that explicitly incorporate attribute acquisition costs. This is a challenging task for two main reasons: the estimation of the acquisition costs is not straightforward and may depend on multivariate factors, and the inter-dependence between variables must be taken into account for the modelling process since companies usually acquire groups of related variables rather than acquiring them individually. Mixed-integer linear programming models are proposed for constructing classifiers that constrain acquisition costs while classifying adequately. Experimental results using credit scoring datasets demonstrate the effectiveness of our methods in terms of predictive performance at a low cost compared to well-known feature selection approaches. (C) 2017 Elsevier B.V. All rights reserved.
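The grouped-cost idea in the abstract can be illustrated with a small sketch. This is not the authors' mixed-integer formulation; it only shows, under assumed data, how the cost of a feature subset behaves when variables are bought in groups, so that a group's cost is paid once no matter how many of its variables are used. The feature names, group names, costs, and both helper functions are hypothetical.

```python
from itertools import combinations


def acquisition_cost(selected, feature_group, group_cost):
    """Cost of a feature subset when each group is paid for at most once."""
    # Collect the distinct groups touched by the selected features.
    groups = {feature_group[f] for f in selected}
    return sum(group_cost[g] for g in groups)


def subsets_within_budget(feature_group, group_cost, budget):
    """Enumerate every non-empty feature subset whose cost fits the budget.

    Brute force, so only suitable for a handful of features; a MILP solver
    would replace this enumeration in a realistic setting.
    """
    features = sorted(feature_group)
    feasible = []
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            if acquisition_cost(subset, feature_group, group_cost) <= budget:
                feasible.append(subset)
    return feasible


# Hypothetical credit scoring variables and the groups they are bought in.
feature_group = {
    "age": "application",
    "income": "application",
    "bureau_score": "bureau",
}
group_cost = {"application": 1.0, "bureau": 5.0}

affordable = subsets_within_budget(feature_group, group_cost, budget=2.0)
```

With these assumed costs, adding "income" to a model that already uses "age" is free, while adding "bureau_score" requires paying for the whole bureau group. A cost-constrained classifier would then be trained only on subsets that pass this budget check.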
Pages: 656-665
Number of pages: 10