Incremental kernel fuzzy c-means with optimizing cluster center initialization and delivery

Cited by: 8
Authors
Jiao, Runhai [1]
Liu, Shaolong [1]
Wen, Wu [1]
Lin, Biying [1]
Affiliations
[1] North China Elect Power Univ, Sch Control & Comp Engn, Beijing, Peoples R China
Keywords
Big data; Incremental clustering; Initial cluster center; Multiple passing points; ALGORITHM
DOI
10.1108/K-08-2015-0209
CLC number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Purpose - The sheer volume of big data makes traditional clustering algorithms, which are usually designed to process an entire data set at once, impractical. This paper focuses on incremental clustering, which divides the data into a series of chunks so that only a small amount of data needs to be clustered at a time. Few studies of incremental clustering address two problems: optimizing the cluster center initialization for each data chunk, and selecting multiple passing points for each cluster. Design/methodology/approach - Optimizing the initial cluster centers improves the clustering quality of each data chunk and, in turn, the quality of the final clustering result. Selecting multiple passing points passes more accurate information down to later chunks, which further improves the final result. A method that addresses both problems is proposed and applied in an algorithm based on the streaming kernel fuzzy c-means (stKFCM) algorithm. Findings - Experimental results show that the proposed algorithm achieves higher accuracy and better performance than the stKFCM algorithm. Originality/value - This paper improves the performance of incremental clustering by optimizing cluster center initialization and selecting multiple passing points. It analyzes the performance of the proposed scheme and demonstrates its effectiveness.
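The chunk-by-chunk idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm and not stKFCM itself. Everything in it is an assumption made for illustration: a Gaussian-kernel fuzzy c-means with prototypes kept in input space, a farthest-point (maximin) rule standing in for "optimized" initialization on the first chunk, reuse of the previous centers on later chunks, and the `p` highest-membership points of each cluster carried forward as weighted "passing points". All function and parameter names are hypothetical.

```python
import numpy as np

def maximin_init(X, c):
    """Assumed stand-in for optimized initialization: farthest-point seeding."""
    centers = [X[0]]
    d2 = ((X - centers[0]) ** 2).sum(axis=1)
    for _ in range(1, c):
        nxt = X[d2.argmax()]                 # point farthest from chosen centers
        centers.append(nxt)
        d2 = np.minimum(d2, ((X - nxt) ** 2).sum(axis=1))
    return np.array(centers)

def memberships(X, V, m, sigma):
    """Fuzzy memberships U (c x n) and kernel values K (c x n)."""
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian kernel
    dist = np.maximum(1.0 - K, 1e-12)        # kernel-induced distance, clamped
    inv = dist ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True), K

def weighted_kfcm(X, w, V, m=2.0, sigma=1.0, iters=50):
    """Weighted kernel fuzzy c-means with input-space prototypes."""
    for _ in range(iters):
        U, K = memberships(X, V, m, sigma)
        coef = w[None, :] * (U ** m) * K
        V = (coef @ X) / coef.sum(axis=1, keepdims=True)
    U, _ = memberships(X, V, m, sigma)
    return U, V

def incremental_kfcm(chunks, c, p=3, m=2.0, sigma=1.0):
    """Cluster chunk by chunk, carrying p weighted passing points per cluster."""
    carry_X = carry_w = V = None
    for chunk in chunks:
        w = np.ones(len(chunk))
        if carry_X is None:
            X, V = chunk, maximin_init(chunk, c)
        else:
            X = np.vstack([carry_X, chunk])
            w = np.concatenate([carry_w, w])  # V is reused from the last chunk
        U, V = weighted_kfcm(X, w, V, m, sigma)
        top = np.argsort(-U, axis=1)[:, :p]   # p highest-membership points
        mass = (w[None, :] * U).sum(axis=1)   # weighted size of each cluster
        carry_X = X[top.ravel()]
        carry_w = np.repeat(mass / p, p)      # split cluster mass evenly
    return V
```

Called on a list of NumPy arrays (one array of shape `(n_i, d)` per chunk), `incremental_kfcm` returns a `(c, d)` array of final centers. Each chunk must contain at least `p` points, and `sigma` should be on the order of the cluster spacing; otherwise the kernel values underflow and the centers stop moving.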
Pages: 1273-1291
Page count: 19