Semantic-based surveillance video retrieval

Cited by: 161

Authors:

Hu, Weiming [1]

Xie, Dan

Fu, Zhouyu

Zeng, Wenrong

Maybank, Steve

Affiliations:

[1] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, Beijing 100080, Peoples R China

[2] Univ London Birkbeck Coll, Sch Comp Sci & Informat Syst, London WC1E 7HX, England

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2007 / Vol. 16 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

activity models; semantic-based; video retrieval; visual surveillance;

DOI:

10.1109/TIP.2006.891352

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Visual surveillance produces large amounts of video data. Effective indexing and retrieval from surveillance video databases are very important. Although there are many ways to represent the content of video clips in current video retrieval algorithms, there still exists a semantic gap between users and retrieval systems. Visual surveillance systems supply a platform for investigating semantic-based video retrieval. In this paper, a semantic-based video retrieval framework for visual surveillance is proposed. A cluster-based tracking algorithm is developed to acquire motion trajectories. The trajectories are then clustered hierarchically using the spatial and temporal information, to learn activity models. A hierarchical structure of semantic indexing and retrieval of object activities, where each individual activity automatically inherits all the semantic descriptions of the activity model to which it belongs, is proposed for accessing video clips and individual objects at the semantic level. The proposed retrieval framework supports various queries including queries by keywords, multiple object queries, and queries by sketch. For multiple object queries, succession and simultaneity restrictions, together with depth and breadth first orders, are considered. For sketch-based queries, a method for matching trajectories drawn by users to spatial trajectories is proposed. The effectiveness and efficiency of our framework are tested in a crowded traffic scene.
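The abstract does not say how sketched trajectories are matched to stored ones, so the following is only a minimal illustrative sketch of one common approach, not the authors' method. It assumes trajectories are 2-D polylines, resamples each to a fixed number of points by arc length, and ranks stored trajectories by mean point-to-point distance. The names `resample`, `sketch_distance`, and `rank_by_sketch`, and the default of 16 samples, are hypothetical.

```python
import math


def resample(points, n=16):
    """Resample a polyline to n points evenly spaced along its arc length."""
    if len(points) < 2 or n < 2:
        raise ValueError("need at least two points and n >= 2")
    # Cumulative arc length at each vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        return [points[0]] * n
    out = []
    seg = 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (target - cum[seg]) / span
        x0, y0 = points[seg]
        x1, y1 = points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def sketch_distance(sketch, trajectory, n=16):
    """Mean Euclidean distance between corresponding resampled points.

    Assumption: both polylines are drawn in the same direction, so the
    measure is direction-sensitive (a reversed path scores differently).
    """
    a = resample(sketch, n)
    b = resample(trajectory, n)
    return sum(math.hypot(ax - bx, ay - by)
               for (ax, ay), (bx, by) in zip(a, b)) / n


def rank_by_sketch(sketch, stored, n=16):
    """Return trajectory ids sorted from best to worst match."""
    return sorted(stored, key=lambda tid: sketch_distance(sketch, stored[tid], n))


# Hypothetical stored trajectories, keyed by an illustrative id.
stored = {
    "left_to_right": [(0, 0), (10, 0)],
    "bottom_to_top": [(0, 0), (0, 10)],
}
user_sketch = [(0, 1), (5, 1), (10, 1)]
ranking = rank_by_sketch(user_sketch, stored)
```

Resampling by arc length makes the comparison independent of how many points the user happened to draw; a real system would likely also need to account for timing and image scale, which this sketch ignores.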

Pages: 1168-1181

Page count: 14
