Performance guarantees for hierarchical clustering

被引:115
作者
Dasgupta, S [1 ]
Long, PM
机构
[1] Univ Calif San Diego, Dept Comp Sci & Engn, San Diego, CA 92103 USA
[2] Genome Inst Singapore, Singapore, Singapore
关键词
hierarchical clustering; complete linkage; k-center;
D O I
10.1016/j.jcss.2004.10.006
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
We show that for any data set in any metric space, it is possible to construct a hierarchical clustering with the guarantee that for every k, the induced k-clustering has cost at most eight times that of the optimal k-clustering. Here the cost of a clustering is taken to be the maximum radius of its clusters. Our algorithm is similar in simplicity and efficiency to popular agglomerative heuristics for hierarchical clustering, and we show that these heuristics have unbounded approximation factors. (c) 2004 Elsevier Inc. All rights reserved.
引用
收藏
页码:555 / 569
页数:15
相关论文
共 16 条