An Online Algorithm for Segmenting Time Series

Cited by: 653
Authors
Keogh, E [1]
Chu, S [1]
Hart, D [1]
Pazzani, M [1]
Affiliations
[1] Univ Calif Irvine, Dept Informat & Comp Sci, Irvine, CA 92697 USA
Source
2001 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2001
Keywords
DOI
10.1109/ICDM.2001.989531
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, there has been an explosion of interest in mining time series databases. As with most computer science problems, representation of the data is the key to efficient and effective solutions. One of the most commonly used representations is piecewise linear approximation. This representation has been used by various researchers to support clustering, classification, indexing and association rule mining of time series data. A variety of algorithms have been proposed to obtain this representation, with several algorithms having been independently rediscovered several times. In this paper, we undertake the first extensive review and empirical comparison of all proposed techniques. We show that all these algorithms have fatal flaws from a data mining perspective. We introduce a novel algorithm that we empirically show to be superior to all others in the literature.
Pages: 289-296
Page count: 8
Related papers
31 in total
[1]  
Agrawal R., 1995, VLDB '95. Proceedings of the 21st International Conference on Very Large Data Bases, P490
[2]  
AGRAWAL R, 1995, P 21 INT C VER LARG
[3]  
Agrawal R., 1993, EFFICIENT SIMILARITY
[4]  
[Anonymous], 1996, P 12 IEEE INT C DAT
[5]  
Chan K., 1999, P 15 IEEE INT C DAT
[6]  
Das G., 1998, Proceedings Fourth International Conference on Knowledge Discovery and Data Mining, P16
[7]  
Douglas D. H., 1973, CARTOGRAPHICA, V10, P112, DOI 10.3138/FM57-6770-U75U-7727
[8]  
GE X, 2001, IN PRESS IEEE T SEMI
[9]  
Hart P.E., 1973, Pattern recognition and scene analysis
[10]  
HECKBERT PS, 1997, P 24 INT C COMP GRAP