New fuzzy c-means clustering model based on the data weighted approach

Cited by: 27
Authors
Tang, Chenglong [1]
Wang, Shigang [1]
Xu, Wei [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Mech & Dynam Engn, Shanghai 200240, Peoples R China
Keywords
Fuzzy clustering; Data weighted approach; Exponent impact factor; Influence exponent; Outliers mining; ALGORITHM; OPTIMIZATION; VALIDITY
DOI
10.1016/j.datak.2010.05.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new data weighted fuzzy c-means clustering approach. Unlike most existing fuzzy clustering approaches, the data weighted approach considers the internal connectivity of all data points. The new model introduces a vector of exponent impact factors and an influence exponent, which together influence the clustering process. Data weighted clustering simultaneously produces three categories of parameters: fuzzy membership degrees, exponent impact factors, and cluster prototypes. A new fuzzy algorithm, DWG-K, is developed by combining the data weighted approach with the G-K algorithm. Two groups of numerical experiments were carried out. Group 1 compares the clustering performance of DWG-K against G-K; the results show that DWG-K obtains better clustering quality at the same level of computational efficiency as G-K. Group 2 tests the ability of DWG-K to mine outliers against the well-known LOF; the results show that DWG-K has a considerable advantage over LOF in computational efficiency, and that the outliers it mines are global. The paper concludes that the data weighted clustering approach has unique advantages for mining outliers in large-scale data sets, for obtaining better clustering results, and especially when both tasks are performed simultaneously. © 2010 Elsevier B.V. All rights reserved.
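For orientation, the baseline that the abstract builds on can be sketched in a few lines. The code below is a minimal illustration of standard fuzzy c-means with an optional fixed weight per data point. It is not the paper's DWG-K model: the exponent impact factors and influence exponent are not specified in this record, so the `weights` argument, the function name `fuzzy_c_means`, and all defaults are assumptions made purely for illustration.

```python
import random


def fuzzy_c_means(points, c, m=2.0, weights=None, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; `weights` is a hypothetical fixed per-point weight.

    Minimises sum_i w_i * sum_k u_ik^m * ||x_i - v_k||^2 by alternating updates.
    Returns (memberships, centers).
    """
    n, dim = len(points), len(points[0])
    w = list(weights) if weights is not None else [1.0] * n
    rng = random.Random(seed)

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([r / total for r in row])

    centers = []
    for _ in range(iters):
        # Prototype update: mean of the points with coefficients w_i * u_ik^m.
        centers = []
        for k in range(c):
            coef = [w[i] * u[i][k] ** m for i in range(n)]
            total = sum(coef)
            centers.append(
                [sum(coef[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: u_ik proportional to d_ik^(-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - centers[k][d]) ** 2 for d in range(dim))
                  for k in range(c)]
            if min(d2) == 0.0:
                # The point sits exactly on one or more prototypes.
                zeros = [k for k in range(c) if d2[k] == 0.0]
                new = [1.0 / len(zeros) if k in zeros else 0.0 for k in range(c)]
            else:
                inv = [dist ** (-1.0 / (m - 1.0)) for dist in d2]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(new[k] - u[i][k]) for k in range(c)))
            u[i] = new

        # Stop once no membership changed by more than the tolerance.
        if shift < tol:
            break

    return u, centers
```

Calling `fuzzy_c_means(points, 2)` on a list of coordinate tuples returns one membership row per point and one prototype per cluster. With fixed weights the membership update is unchanged and only the prototype update is affected; a model that learns the weights during clustering, as the abstract describes, would need an additional update step that is not shown here.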
Pages: 881-900
Number of pages: 20