Streaming-data algorithms for high-quality clustering

Cited by: 216

Authors:

O'Callaghan, L [1]

Mishra, N [1]

Meyerson, A [1]

Guha, S [1]

Motwani, R [1]

Affiliations:

[1] Stanford Univ, Stanford, CA 94305 USA

Source:

18TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS | 2002

Keywords:

DOI:

10.1109/ICDE.2002.994785

Chinese Library Classification (CLC):

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Streaming data analysis has recently attracted attention in numerous applications including telephone records, web documents and clickstreams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.

Pages: 685-694

Page count: 10
