Time-series clustering - A decade review

Cited by: 1121
Authors
Aghabozorgi, Saeed [1]
Shirkhorshidi, Ali Seyed [1]
Teh Ying Wah [1]
Affiliations
[1] Univ Malaya, Dept Informat Syst, Fac Comp Sci & Informat Technol, Kuala Lumpur 50603, Malaysia
Keywords
Clustering; Time-series; Distance measure; Evaluation measure; Representations; GENE-EXPRESSION DATA; SIMILARITY SEARCH; DIMENSIONALITY REDUCTION; AVERAGING METHOD; REPRESENTATION; ALGORITHMS; MODEL; CLASSIFICATION; COMPRESSION; RECOGNITION;
DOI
10.1016/j.is.2015.04.007
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering is a way to group large volumes of data when no prior knowledge about the classes is available. With the emergence of concepts such as cloud computing and big data, and their wide application in recent years, research on unsupervised solutions such as clustering algorithms has increased, with the aim of extracting knowledge from this avalanche of data. Clustering of time-series data has been used in diverse scientific areas to discover patterns that enable data analysts to extract valuable information from complex and massive datasets. For very large datasets, supervised classification is almost impossible, whereas clustering can address the problem with unsupervised approaches. This work focuses on time-series data, one of the popular data types in clustering problems, with uses ranging from gene expression data in biology to stock market analysis in finance. The review sets out four main components of time-series clustering and aims to present an updated investigation of trends in the efficiency, quality and complexity of time-series clustering approaches during the last decade, and to highlight new paths for future work. (C) 2015 Elsevier Ltd. All rights reserved.
Pages: 16-38
Page count: 23