Smaller coresets for k-median and k-means clustering

被引:107
作者
Har-Peled, Sariel [1 ]
Kushal, Akash [1 ]
机构
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
关键词
Voronoi Diagram; Discrete Comput Geom; Cluster Problem; Cumulative Error; Weighted Point;
D O I
10.1007/s00454-006-1271-x
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
In this paper we show that there exists a (k, epsilon)-coreset for k-median and k-means clustering of n points in R-d, which is of size independent of n. In particular, we construct a (k, epsilon)-coreset of size O(k(2)/epsilon(d)) for k-median clustering, and of size O(k(3)/epsilon(d+1)) for k-means clustering.
引用
收藏
页码:3 / 19
页数:17
相关论文
共 16 条
[1]   Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems [J].
Arora, S .
JOURNAL OF THE ACM, 1998, 45 (05) :753-782
[2]  
Arora S., 1998, Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, P106, DOI 10.1145/276698.276718
[3]  
Badoiu M, 2002, P 34 ANN ACM S THEOR, P250, DOI DOI 10.1145/509907.509947
[4]  
Charikar M., 1999, Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, P1, DOI 10.1145/301250.301257
[5]  
de la Vega W.F., 2003, P 35 ANN ACM S THEOR, P50, DOI [10.1145/780542.780550, DOI 10.1145/780542.780550]
[6]   FIXED-PARAMETER TRACTABILITY AND COMPLETENESS .1. BASIC RESULTS [J].
DOWNEY, RG ;
FELLOWS, MR .
SIAM JOURNAL ON COMPUTING, 1995, 24 (04) :873-921
[7]  
EFFROS M, 2003, TR04085 EL C COMP CO
[8]   Clustering data streams [J].
Guha, S ;
Mishra, N ;
Motwani, R ;
O'Callaghan, L .
41ST ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS, 2000, :359-366
[9]  
Har-Peled Sariel, 2004, Proc. 36th Annu. ACM Sympos. Theory Comput, P291
[10]  
Inaba M., 1994, Proceedings of the Tenth Annual Symposium on Computational Geometry, P332, DOI 10.1145/177424.178042