Clustering on demand for multiple data streams

Cited by: 14
Authors
Dai, BR [1]
Huang, JW [1]
Yeh, MY [1]
Chen, MS [1]
Affiliations
[1] Natl Taiwan Univ, Dept Elect Engn, Taipei 10764, Taiwan
Source
FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2004
DOI
10.1109/ICDM.2004.10060
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In a data stream environment, the patterns produced by mining techniques usually differ over time because the data evolve. To handle various types of multiple data streams and to support flexible mining requirements, this paper devises a Clustering on Demand (COD) framework that dynamically clusters multiple data streams. While providing a general framework for clustering multiple data streams, COD has two major features: a single data scan for online statistics collection, and compact multi-resolution approximations. These are designed to address, respectively, the time and space constraints of a data stream environment. Furthermore, the multi-resolution approximations of the data streams make it possible to support flexible clustering demands.
Pages: 367-370
Page count: 4