A study of the behaviour of linguistic fuzzy rule based classification systems in the framework of imbalanced data-sets

Cited by: 218
Authors
Fernandez, Alberto [1]
Garcia, Salvador [1]
Jose del Jesus, Maria [2]
Herrera, Francisco [1]
Affiliations
[1] Univ Granada, Dept Comp Sci & AI, E-18071 Granada, Spain
[2] Univ Jaen, Dept Comp Sci, Jaen, Spain
Funding
UK Medical Research Council
Keywords
fuzzy rule based classification systems; imbalanced data-sets; imbalance class problem; instance selection; over-sampling; fuzzy reasoning method; rule weights; conjunction operators;
DOI
10.1016/j.fss.2007.12.023
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In classification problems, we often encounter classes with very different percentages of patterns: some classes with a high percentage of patterns and others with a low percentage. These problems are known as "classification problems with imbalanced data-sets". In this paper we study the behaviour of fuzzy rule based classification systems in the framework of imbalanced data-sets, focusing on the synergy with the preprocessing mechanisms of instances and the configuration of fuzzy rule based classification systems. We analyse the necessity of applying a preprocessing step to deal with the problem of imbalanced data-sets. Regarding the components of the fuzzy rule based classification system, we are interested in the granularity of the fuzzy partitions, the use of distinct conjunction operators, the application of some approaches to compute the rule weights and the use of different fuzzy reasoning methods. (C) 2007 Elsevier B.V. All rights reserved.
Pages: 2378-2398
Number of pages: 21