Action categorization by structural probabilistic latent semantic analysis

Cited by: 39
Authors
Zhang, Jianguo [1 ]
Gong, Shaogang [2 ]
Affiliations
[1] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Belfast BT7 1NN, Antrim, North Ireland
[2] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London E1 4NS, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Action categorization; pLSA; Structural pLSA; Local shape context; OBJECT CATEGORIES; RECOGNITION; SHAPE;
DOI
10.1016/j.cviu.2010.04.006
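Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
140502 [Artificial intelligence];
Abstract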
Temporal dependency is a very important cue for modeling human actions. However, approaches using latent topic models, e.g., probabilistic latent semantic analysis (pLSA), employ the bag-of-words assumption, so word dependencies are usually ignored. In this work, we propose a new approach, structural pLSA (SpLSA), that explicitly models word order by introducing latent variables. More specifically, we develop an action categorization approach that learns action representations as the distribution of latent topics in an unsupervised way, where each action frame is characterized by a codebook representation of local shape context. The effectiveness of this approach is evaluated using both the WEIZMANN dataset and the MIT dataset. Results show that the proposed approach outperforms the standard pLSA. Additionally, our approach is compared with six existing models, namely GMM, logistic regression, HMM, SVM, CRF, and HCRF, given the same feature representation. These comparative results show that our approach achieves higher categorization accuracy than the first five of these models and is comparable to the state-of-the-art hidden conditional random field (HCRF) based model using the same feature set. (C) 2010 Elsevier Inc. All rights reserved.
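For orientation, the standard pLSA baseline mentioned in the abstract can be sketched as a short EM loop. This is a minimal sketch of ordinary bag-of-words pLSA only, not the SpLSA model proposed in the paper, whose latent-variable structure is not described in this record. It assumes each "document" is an action sequence and each "word" is a codebook index assigned to a frame; the function name `plsa`, its parameters, and the random initialization are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Standard pLSA fitted by EM (hypothetical helper, not the paper's SpLSA).

    counts: (n_docs, n_words) array of word counts per document; every
            document is assumed to contain at least one word.
    Returns P(z|d) with shape (n_docs, n_topics) and
            P(w|z) with shape (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random initialization, normalized so each row is a distribution.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior P(z|d,w) proportional to P(z|d) * P(w|z),
        # stored with shape (n_docs, n_topics, n_words).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + eps

        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z
```

Under these assumptions, each row of the returned P(z|d) matrix is the topic distribution of one sequence, which is the kind of representation the abstract describes. Because the model sees only counts, reordering the frames of a sequence leaves the result unchanged, which is the limitation that motivates modeling word order.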
Pages: 857-864
Page count: 8
Related papers
42 in total
[1]
[Anonymous], 2005, 2 JOINT IEEE INT WOR
[2]
[Anonymous], 2007, UCBCSD041366 CALTECH
[3]
[Anonymous], 2006, Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Volume 2, Washington, DC, USA
[4]
[Anonymous], IEEE C COMP VIS PAT, DOI DOI 10.1109/CVPR.2006.68
[5]
Shape matching and object recognition using shape contexts [J].
Belongie, S ;
Malik, J ;
Puzicha, J .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2002, 24 (04) :509-522
[6]
Blank M, 2005, IEEE I CONF COMP VIS, P1395
[7]
The recognition of human movement using temporal templates [J].
Bobick, AF ;
Davis, JW .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2001, 23 (03) :257-267
[8]
Detecting irregularities in images and in video [J].
Boiman, Oren ;
Irani, Michal .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2007, 74 (01) :17-31
[9]
Bosch A., 2008, INT J COMPUTER UNPUB
[10]
3-d articulated pose tracking for untethered diectic reference [J].
Demirdjian, D ;
Darrell, T .
FOURTH IEEE INTERNATIONAL CONFERENCE ON MULTIMODAL INTERFACES, PROCEEDINGS, 2002, :267-272