Biomedical time series clustering based on non-negative sparse coding and probabilistic topic model

Cited by: 24
Authors
Wang, Jin [1 ,3 ]
Liu, Ping [2 ]
She, Mary F. H. [1 ,3 ]
Nahavandi, Saeid [1 ]
Kouzani, Abbas [4 ]
Affiliations
[1] Deakin Univ, Ctr Intelligent Syst Res, Waurn Ponds, Vic 3217, Australia
[2] Univ S Carolina, Dept Comp Sci, Columbia, SC 29205 USA
[3] Deakin Univ, Inst Frontier Mat, Waurn Ponds, Vic 3217, Australia
[4] Deakin Univ, Sch Engn, Waurn Ponds, Vic 3217, Australia
Keywords
Unsupervised learning; Bag-of-Words; Probabilistic topic model; Sparse coding; ECG; REPRESENTATION; CLASSIFICATION; RECOGNITION;
DOI
10.1016/j.cmpb.2013.05.022
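Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
080201 [Mechanical manufacturing and automation];
Abstract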
Biomedical time series clustering, which groups a set of unlabelled temporal signals according to their underlying similarity, is very useful for biomedical records management and analysis, such as biosignal archiving and diagnosis. In this paper, a new framework for clustering long-term biomedical time series such as electrocardiography (ECG) and electroencephalography (EEG) signals is proposed. Specifically, local segments extracted from the time series are represented as a combination of a small number of basis elements in a trained dictionary by non-negative sparse coding. A Bag-of-Words (BoW) representation is then constructed by summing all the sparse coefficients of the local segments in a time series. Based on the BoW representation, a probabilistic topic model that was originally developed for text document analysis is extended to discover the underlying similarity of a collection of time series. The underlying similarity of biomedical time series is well captured owing to the statistical nature of the probabilistic topic model. Experiments on three datasets constructed from publicly available EEG and ECG signals demonstrate that the proposed approach achieves better accuracy than existing state-of-the-art methods and is insensitive to model parameters such as the length of local segments and the dictionary size. © 2013 Elsevier Ireland Ltd. All rights reserved.
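The first two stages described in the abstract (non-negative sparse coding of local segments, then summing the coefficients into a BoW vector) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sliding-window parameters `length` and `step`, the penalty `l1`, the step size `lr`, and the iteration count `iters` are all hypothetical, the dictionary is assumed to be given rather than trained, and a plain projected-gradient solver stands in for whatever solver the paper uses. The topic-model stage is not shown.

```python
def extract_segments(series, length, step):
    """Slide a window of `length` samples over the series with stride `step`."""
    return [series[i:i + length]
            for i in range(0, len(series) - length + 1, step)]


def nn_sparse_code(segment, dictionary, l1=0.1, lr=0.05, iters=500):
    """Approximately minimise
        0.5 * ||x - sum_j a_j * d_j||^2 + l1 * sum_j a_j   subject to a_j >= 0
    by projected gradient descent (an assumed, illustrative solver).
    `dictionary` is a list of atoms, each the same length as `segment`."""
    k = len(dictionary)
    a = [0.0] * k
    for _ in range(iters):
        # Reconstruction and residual for the current coefficients.
        recon = [sum(a[j] * dictionary[j][t] for j in range(k))
                 for t in range(len(segment))]
        resid = [r - x for r, x in zip(recon, segment)]
        # Gradient step on every coefficient, then projection onto a >= 0.
        for j in range(k):
            grad = sum(d * e for d, e in zip(dictionary[j], resid)) + l1
            a[j] = max(0.0, a[j] - lr * grad)
    return a


def bag_of_words(series, dictionary, length, step):
    """Sum the sparse codes of all local segments into one BoW vector."""
    hist = [0.0] * len(dictionary)
    for seg in extract_segments(series, length, step):
        code = nn_sparse_code(seg, dictionary)
        hist = [h + c for h, c in zip(hist, code)]
    return hist


# Toy usage: two unit atoms of length 2 and a short synthetic series.
toy_dictionary = [[1.0, 0.0], [0.0, 1.0]]
toy_series = [1.0, 0.0, 1.0, 0.0]
toy_bow = bag_of_words(toy_series, toy_dictionary, length=2, step=2)
```

In the framework the abstract describes, one such BoW vector per time series would then be passed to the probabilistic topic model, much as word counts per document are in text analysis.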
Pages: 629-641
Number of pages: 13