Ensemble clustering with voting active clusters

Cited by: 54
Authors
Tumer, Kagan [1]
Agogino, Adrian K. [2]
Affiliations
[1] Oregon State Univ, Corvallis, OR 97330 USA
[2] NASA, Ames Res Ctr, UCSC, Moffett Field, CA 94035 USA
Keywords
cluster ensembles; consensus clustering; distributed clustering; adaptive clustering
DOI
10.1016/j.patrec.2008.06.011
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is an integral part of pattern recognition problems and is connected to both the data reduction and the data understanding steps. Combining multiple clusterings into an ensemble clustering is critical in many real-world applications, particularly for domains with large data sets, high-dimensional feature sets and proprietary data. This paper presents voting active clusters (VACs), a method for combining multiple "base" clusterings into a single unified "ensemble" clustering that is robust against missing data and does not require all the data to be collected in one central location. In this approach, separate processing centers produce many base clusterings based on some portion of the data. The clusterings of such separate processing centers are then pooled to produce a unified ensemble clustering through a voting mechanism. The major contribution of this work is in providing an adaptive voting method by which the clusterings (e.g., spatially distributed processing centers) update their votes in order to maximize an overall quality measure. Our results show that this method achieves comparable or better performance than traditional cluster ensemble methods in noise-free conditions, and remains effective in noisy scenarios where many traditional methods are inapplicable. (c) 2008 Elsevier B.V. All rights reserved.
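To make the pooling step in the abstract concrete, the following is a minimal sketch of combining base clusterings by a plain per-point plurality vote. It is not the VAC algorithm itself: the adaptive step in which votes are updated to maximize a quality measure is omitted, and the names `MISSING`, `align` and `plurality_vote`, the brute-force label alignment, and the assumption that every clustering uses labels `0..k-1` are illustrative choices that do not come from the paper.

```python
from collections import Counter
from itertools import permutations

MISSING = -1  # hypothetical marker: this processing center never saw the point


def align(reference, labels, k):
    """Relabel `labels` with the permutation of ids 0..k-1 that best matches `reference`.

    Brute force over all k! permutations, so only suitable for small k.
    """
    best_perm, best_score = None, -1
    for perm in permutations(range(k)):
        score = sum(
            1
            for r, l in zip(reference, labels)
            if r != MISSING and l != MISSING and perm[l] == r
        )
        if score > best_score:
            best_perm, best_score = perm, score
    return [MISSING if l == MISSING else best_perm[l] for l in labels]


def plurality_vote(base_clusterings, k):
    """Pool base clusterings by per-point plurality vote; missing labels abstain."""
    reference = base_clusterings[0]
    aligned = [reference] + [align(reference, b, k) for b in base_clusterings[1:]]
    ensemble = []
    for votes in zip(*aligned):
        counts = Counter(v for v in votes if v != MISSING)
        ensemble.append(counts.most_common(1)[0][0] if counts else MISSING)
    return ensemble


# Three centers, each with partial data and its own arbitrary label names.
base = [
    [0, 0, 1, 1, MISSING, 2],
    [1, 1, 0, 0, 0, 2],
    [2, 2, 0, 0, 0, MISSING],
]
print(plurality_vote(base, k=3))
```

The alignment step is needed because cluster ids are arbitrary in each base clustering; voting on raw labels would compare unrelated names. A point that no center labeled stays marked as missing in the result.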
Pages: 1947-1953
Number of pages: 7