Time series clustering with ARMA mixtures

Cited by: 158
Authors
Xiong, YM [1]
Yeung, DY [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
Keywords
ARMA model; EM algorithm; mixture model; model-based clustering; time series analysis
DOI
10.1016/j.patcog.2003.12.018
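Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 [Pattern Recognition and Intelligent Systems]; 0812 [Computer Science and Technology]; 0835 [Software Engineering]; 1405 [Intelligence Science and Technology]
Abstract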
Clustering problems are central to many knowledge discovery and data mining tasks. However, most existing clustering methods can only work with fixed-dimensional representations of data patterns. In this paper, we study the clustering of data patterns that are represented as sequences or time series possibly of different lengths. We propose a model-based approach to this problem using mixtures of autoregressive moving average (ARMA) models. We derive an expectation-maximization (EM) algorithm for learning the mixing coefficients as well as the parameters of the component models. To address the model selection problem, we use the Bayesian information criterion (BIC) to determine the number of clusters in the data. Experiments are conducted on a number of simulated and real datasets. Results from the experiments show that our method compares favorably with other methods proposed previously by others for similar time series clustering tasks. (C) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
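The EM-plus-BIC procedure summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes pure AR(p) components rather than full ARMA models, uses the conditional Gaussian likelihood (the first p samples of each series are treated as fixed), and takes the number of series as the sample size in the BIC penalty. The names `ar_design` and `fit_ar_mixture` are hypothetical.

```python
import numpy as np

def ar_design(x, p):
    """Lagged regressors (n-p, p) and targets (n-p,) for an AR(p) fit."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    X = np.column_stack([x[p - j - 1 : n - j - 1] for j in range(p)])
    return X, x[p:]

def fit_ar_mixture(series, k, p=1, n_iter=50, seed=0):
    """EM for a k-component mixture of AR(p) models (assumed simplification).

    `series` is a list of 1-D arrays, possibly of different lengths.
    Returns (responsibilities, AR coefficients, noise variances,
    mixing weights, BIC), where lower BIC is better.
    """
    rng = np.random.default_rng(seed)
    designs = [ar_design(s, p) for s in series]
    n = len(designs)
    resp = rng.dirichlet(np.ones(k), size=n)  # random soft assignments
    for _ in range(n_iter):
        # M-step: mixing weights and weighted least squares per component.
        weights = np.clip(resp.mean(axis=0), 1e-12, None)
        phis, sig2 = [], []
        for c in range(k):
            r = resp[:, c]
            A = sum(ri * X.T @ X for ri, (X, y) in zip(r, designs))
            b = sum(ri * X.T @ y for ri, (X, y) in zip(r, designs))
            phi = np.linalg.solve(A + 1e-8 * np.eye(p), b)
            sse = sum(ri * np.sum((y - X @ phi) ** 2)
                      for ri, (X, y) in zip(r, designs))
            cnt = sum(ri * len(y) for ri, (X, y) in zip(r, designs))
            phis.append(phi)
            sig2.append(sse / max(cnt, 1e-12) + 1e-12)
        # E-step: per-series log joint of (component, series), in log space.
        ll = np.empty((n, k))
        for i, (X, y) in enumerate(designs):
            for c in range(k):
                res = y - X @ phis[c]
                ll[i, c] = (np.log(weights[c])
                            - 0.5 * len(y) * np.log(2 * np.pi * sig2[c])
                            - 0.5 * (res @ res) / sig2[c])
        m = ll.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(ll - m).sum(axis=1, keepdims=True))
        resp = np.exp(ll - log_norm)
    # BIC: k*(p+1) component parameters plus k-1 free mixing weights.
    n_params = k * (p + 1) + (k - 1)
    bic = -2.0 * log_norm.sum() + n_params * np.log(n)
    return resp, np.array(phis), np.array(sig2), weights, bic
```

Under these assumptions, the number of clusters would be chosen by calling `fit_ar_mixture` for several values of `k` and keeping the fit with the lowest BIC; hard cluster labels follow from `resp.argmax(axis=1)`.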
Pages: 1675-1689
Number of pages: 15
Related papers
44 in total
[1] [Anonymous], 1985, Computational Statistics Quarterly, DOI 10.1155/2010/874592
[2] Banfield, JD; Raftery, AE. Model-based Gaussian and non-Gaussian clustering [J]. Biometrics, 1993, 49(3): 803-821
[3] Box G, 1976, TIME SERIES ANAL FOR
[4] Cadez I., 2000, Proceedings. KDD-2000. Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, P280, DOI 10.1145/347090.347151
[5] Cadez I. V., 2000, Proceedings. KDD-2000. Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, P140, DOI 10.1145/347090.347119
[6] Cadez I.V., 2001, MSRTR0018
[7] Celeux, G; Govaert, G. A classification EM algorithm for clustering and 2 stochastic versions [J]. Computational Statistics & Data Analysis, 1992, 14(3): 315-332
[8] Dempster, AP; Laird, NM; Rubin, DB. Maximum likelihood from incomplete data via EM algorithm [J]. Journal of the Royal Statistical Society Series B-Methodological, 1977, 39(1): 1-38
[9] DeSarbo, WS; Cron, WL. A maximum-likelihood methodology for clusterwise linear-regression [J]. Journal of Classification, 1988, 5(2): 249-282
[10] Gaffney S., 2003, P 9 INT WORKSH ART I