A dynamic clustering algorithm for building overlapping clusters

Cited by: 5
Authors
Perez-Suarez, Airel [1, 2]
Fco Martinez-Trinidad, Jose [1]
Carrasco-Ochoa, Jesus A. [1]
Medina-Pagola, Jose E. [2]
Affiliations
[1] Natl Inst Astrophys, Dept Comp Sci, Puebla 72840, Mexico
[2] Adv Technol Applicat Ctr, Havana, Cuba
Keywords
Data mining; overlapping clustering; graph-based algorithms; topic detection
DOI
10.3233/IDA-2012-0520
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is a data mining technique that has been widely used in many practical applications. In some of these applications, such as medical diagnosis, categorization of digital libraries, and topic detection, objects can belong to more than one cluster. However, most clustering algorithms generate disjoint clusters. Moreover, processing additions, deletions, and modifications of objects in an existing clustering, without rebuilding it from scratch, is an issue that has received little study. In this paper, we introduce DCS, a clustering algorithm that includes a new graph-cover strategy for building a set of clusters that may overlap, and a strategy for dynamically updating the clustering that handles multiple additions and/or deletions of objects. An experimental evaluation conducted over different collections demonstrates the good performance of the proposed algorithm.
Pages: 211-232
Number of pages: 22