Characterizing activity sequences using profile Hidden Markov Models

Cited by: 45
Authors
Liu, Feng [1]
Janssens, Davy [1]
Cui, JianXun [2]
Wets, Geert [1]
Cools, Mario [3]
Affiliations
[1] Hasselt Univ, Transportat Res Inst IMOB, B-3590 Diepenbeek, Belgium
[2] Harbin Inst Technol, Dept Transport Engn, Harbin 1500, Peoples R China
[3] Univ Liege, LEMA, B-4000 Liege, Belgium
Keywords
Profile Hidden Markov Models (pHMMs); Sequence Alignment Methods (SAM); Multiple sequence alignments; Activity sequences; Activity-travel diaries; Mobile phone data; OPTIMAL MATCHING ANALYSIS; ACTIVITY PATTERNS; TIME-USE; ALIGNMENT; SPACE; RECOGNITION; SYSTEM; CLASSIFICATION; IDENTIFICATION; SERVICES;
DOI
10.1016/j.eswa.2015.02.057
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the literature, activity sequences generated from activity-travel diaries have been analyzed and classified into clusters, based on the composition and ordering of their activities, using Sequence Alignment Methods (SAM). However, with these methods only the frequent activities in each cluster are extracted and qualitatively described; the infrequent activities and their related travel episodes are disregarded. Thus, to quantify the occurrence probabilities of all the daily activities as well as their sequential orders, we develop a novel process to build multiple alignments of the sequences and subsequently derive profile Hidden Markov Models (pHMMs). This process consists of 4 major steps. First, activity sequences are clustered based on a pre-defined scheme. The frequent activities along with their sequential orders are then identified in each cluster, and they are subsequently used as a template to guide the construction of a multiple alignment of the cluster of sequences. Finally, a pHMM is employed to convert the multiple alignment into a position-specific scoring system, representing the probability of each frequent activity at each important position of the alignment as well as the probabilities of both insertion and deletion of infrequent activities. By applying the derived pHMMs to a set of activity-travel diaries collected in Belgium as well as a group of mobile phone call location data recorded in Switzerland, the potential and effectiveness of the models in capturing the sequential features of each cluster and distinguishing them from those of other clusters are demonstrated. The proposed method can also be utilized to improve activity-based transportation model validation and travel survey designs. Furthermore, it offers a wide application in characterizing a group of any related sequences, particularly sequences varying in length and with a high frequency of short sequences that are typically present in human behavior. (C) 2015 Elsevier Ltd. All rights reserved.
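The last step described in the abstract, turning a multiple alignment into position-specific probabilities, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy alignment, the activity codes (H, W, S, L), the 50% occupancy rule for choosing match columns, and the add-one pseudocount are all assumptions made here for illustration. A full pHMM would also estimate transition probabilities between match, insert, and delete states, which this sketch omits.

```python
from collections import Counter

GAP = "-"

def match_columns(alignment, min_occupancy=0.5):
    """Return indices of columns where at least min_occupancy of rows hold an activity."""
    n_rows = len(alignment)
    n_cols = len(alignment[0])
    cols = []
    for j in range(n_cols):
        filled = sum(1 for row in alignment if row[j] != GAP)
        if filled / n_rows >= min_occupancy:
            cols.append(j)
    return cols

def emission_probabilities(alignment, columns, alphabet, pseudocount=1.0):
    """For each match column, return smoothed probabilities of each activity symbol."""
    profile = []
    for j in columns:
        counts = Counter(row[j] for row in alignment if row[j] != GAP)
        total = sum(counts.values()) + pseudocount * len(alphabet)
        profile.append({a: (counts[a] + pseudocount) / total for a in alphabet})
    return profile

def deletion_rates(alignment, columns):
    """Fraction of sequences that skip (have a gap in) each match column."""
    n_rows = len(alignment)
    return [sum(1 for row in alignment if row[j] == GAP) / n_rows for j in columns]

# Hypothetical aligned activity sequences for one cluster:
# H = home, W = work, S = shopping, L = leisure, "-" = gap.
alignment = [
    "HW-SH",
    "HW-SH",
    "HWL-H",
    "H--SH",
]
alphabet = "HWSL"

cols = match_columns(alignment)          # sparsely filled columns are treated as insertions
profile = emission_probabilities(alignment, cols, alphabet)
dels = deletion_rates(alignment, cols)

for j, probs, d in zip(cols, profile, dels):
    best = max(probs, key=probs.get)
    print(f"column {j}: most likely {best} (p={probs[best]:.3f}), deletion rate {d:.2f}")
```

In this toy example the column holding the single leisure activity falls below the occupancy threshold, so it would be modeled as an insertion of an infrequent activity rather than as a position of the profile.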
Pages: 5705-5722
Page count: 18