Unsupervised approximate-semantic vocabulary learning for human action and video classification

Cited by: 9
Authors
Zhao, Qiong [1]
Ip, Horace H. S. [2]
Affiliations
[1] USTC CityU Joint Adv Res Ctr, Suzhou, Peoples R China
[2] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
Keywords
Contextual spectral embedding; Pearson product moment correlation; Visual vocabulary
DOI
10.1016/j.patrec.2013.03.037
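Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
140502 [Artificial intelligence]
Abstract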
The paper presents a novel unsupervised contextual spectral embedding (CSE) framework for human action and video classification. Similar to textual words, the visual word (a mid-level semantic) representation of an image or video contains synonymous words, which make the representation ambiguous. To narrow the semantic gap between visual words (the mid-level semantic representation) and high-level semantics, we propose a high-level representation called the approximate-semantic descriptor. The experimental results show that the proposed approach to visual word disambiguation can improve subsequent classification performance. In the paper, approximate-semantic descriptor learning is formulated as a spectral clustering problem, such that semantically associated visual words are placed close together in a low-dimensional semantic space and then clustered into one approximate-semantic descriptor. Specifically, the high-level representation of human action videos is learnt by capturing the inter-video context of mid-level semantics via a non-parametric correlation measure. Experiments on four standard datasets demonstrate that our approach achieves significantly improved results with respect to the state of the art, particularly in unconstrained environments. © 2013 Elsevier B.V. All rights reserved.
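The pipeline outlined in the abstract (correlate visual words across videos, embed them spectrally, cluster them into descriptors) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract and keywords only name the Pearson product moment correlation, a spectral embedding, and a clustering step. Everything else here is an assumption made for illustration: the function names, clipping negative correlations to zero, the symmetric normalized Laplacian, row normalization of the embedding, and a small k-means with farthest-point initialization.

```python
import numpy as np


def _kmeans(points, k, n_iter=50):
    """Tiny k-means with deterministic farthest-point initialization (assumed, not from the paper)."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        dist = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = centers.copy()
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster is empty
                new_centers[j] = points[labels == j].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def approximate_semantic_groups(histograms, n_groups):
    """Group visual words whose counts co-vary across videos.

    histograms: array of shape (n_videos, n_words) with bag-of-visual-words counts.
    Returns one integer group label per visual word.
    """
    X = np.asarray(histograms, dtype=float)
    # Pearson correlation between visual words (columns), measured across videos.
    C = np.nan_to_num(np.corrcoef(X, rowvar=False))
    # Assumption: only positive correlation counts as affinity.
    W = np.clip(C, 0.0, None)
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    d_inv_sqrt[d > 0] = 1.0 / np.sqrt(d[d > 0])
    # Symmetric normalized graph Laplacian: I - D^{-1/2} W D^{-1/2}.
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; the smallest ones span the embedding.
    _, eigvecs = np.linalg.eigh(L_sym)
    E = eigvecs[:, :n_groups]
    norms = np.maximum(np.linalg.norm(E, axis=1, keepdims=True), 1e-12)
    return _kmeans(E / norms, n_groups)


def pool_histograms(histograms, labels, n_groups):
    """Sum the counts of all visual words in a group to get one descriptor bin per group."""
    X = np.asarray(histograms, dtype=float)
    onehot = np.eye(n_groups)[labels]  # shape (n_words, n_groups)
    return X @ onehot
```

Calling `approximate_semantic_groups` on a videos-by-words count matrix returns a group label per visual word, and `pool_histograms` then merges the word counts of each group into a shorter histogram that could be passed to any classifier. Pooling by summation is only one plausible way to build the higher-level descriptor and is not confirmed by the abstract.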
Pages: 1870-1878
Page count: 9