Landscape of clustering algorithms

Cited by: 70
Authors
Jain, AK [1]
Topchy, A [1]
Law, MHC [1]
Buhmann, JM [1]
Affiliation
[1] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48824 USA
Source
PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 1 | 2004
DOI
10.1109/ICPR.2004.1334073
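Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligent science and technology];
Abstract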
Numerous clustering algorithms, their taxonomies and evaluation studies are available in the literature. Despite the diversity of different clustering algorithms, solutions delivered by these algorithms exhibit many commonalities. An analysis of the similarity and properties of clustering objective functions is necessary from the operational/user perspective. We revisit conventional categorization of clustering algorithms and attempt to relate them according to the partitions they produce. We empirically study the similarity of clustering solutions obtained by many traditional as well as relatively recent clustering algorithms on a number of real-world data sets. Sammon's mapping and a complete-link clustering of the inter-clustering dissimilarity values are performed to detect a meaningful grouping of the objective functions. We find that only a small number of clustering algorithms are sufficient to represent a large spectrum of clustering criteria. For example, interesting groups of clustering algorithms are centered around the graph partitioning, linkage-based and Gaussian mixture model based algorithms.
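The comparison step described in the abstract (measure how dissimilar the partitions from different algorithms are, then group the algorithms by complete-link clustering) can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the abstract: the dissimilarity measure is variation of information, which is one possible choice; the four partitions and the names `alg_a` to `alg_d` are made-up toy data; the helper functions are hypothetical; and the Sammon's mapping visualization is omitted.

```python
from collections import Counter
from math import log


def variation_of_information(a, b):
    """VI(A, B) = H(A) + H(B) - 2 I(A; B) for two labelings of the same objects."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    h_a = -sum(c / n * log(c / n) for c in pa.values())
    h_b = -sum(c / n * log(c / n) for c in pb.values())
    mi = sum(c / n * log(c * n / (pa[x] * pb[y])) for (x, y), c in pab.items())
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, h_a + h_b - 2 * mi)


def complete_link(dist, k):
    """Naive complete-link agglomeration of items 0..m-1 into k groups."""
    groups = [[i] for i in range(len(dist))]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                # Complete link: group distance is the largest pairwise distance.
                d = max(dist[p][q] for p in groups[i] for q in groups[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        groups[i] += groups.pop(j)
    return groups


# Hypothetical partitions of six objects from four unnamed algorithms.
partitions = {
    "alg_a": [0, 0, 0, 1, 1, 1],
    "alg_b": [1, 1, 1, 0, 0, 0],  # same partition as alg_a, labels permuted
    "alg_c": [0, 1, 0, 1, 0, 1],
    "alg_d": [0, 1, 0, 1, 0, 0],
}
names = list(partitions)

# Inter-clustering dissimilarity matrix between the algorithms' partitions.
dist = [
    [variation_of_information(partitions[p], partitions[q]) for q in names]
    for p in names
]

# Group the algorithms themselves by how similar their partitions are.
groups = [[names[i] for i in g] for g in complete_link(dist, 2)]
print(groups)
```

In this toy setup, algorithms whose partitions differ only by a relabeling of clusters have zero dissimilarity, so they end up in the same group; the measure compares partitions, not label names.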
Pages: 260-263
Number of pages: 4