Semantic-based surveillance video retrieval

Cited by: 161
Authors
Hu, Weiming [1]
Xie, Dan
Fu, Zhouyu
Zeng, Wenrong
Maybank, Steve
Affiliations
[1] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, Beijing 100080, Peoples R China
[2] Univ London Birkbeck Coll, Sch Comp Sci & Informat Syst, London WC1E 7HX, England
Funding
National Natural Science Foundation of China;
Keywords
activity models; semantic-based; video retrieval; visual surveillance;
DOI
10.1109/TIP.2006.891352
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual surveillance produces large amounts of video data. Effective indexing and retrieval from surveillance video databases are very important. Although there are many ways to represent the content of video clips in current video retrieval algorithms, there still exists a semantic gap between users and retrieval systems. Visual surveillance systems supply a platform for investigating semantic-based video retrieval. In this paper, a semantic-based video retrieval framework for visual surveillance is proposed. A cluster-based tracking algorithm is developed to acquire motion trajectories. The trajectories are then clustered hierarchically using the spatial and temporal information, to learn activity models. A hierarchical structure of semantic indexing and retrieval of object activities, where each individual activity automatically inherits all the semantic descriptions of the activity model to which it belongs, is proposed for accessing video clips and individual objects at the semantic level. The proposed retrieval framework supports various queries including queries by keywords, multiple object queries, and queries by sketch. For multiple object queries, succession and simultaneity restrictions, together with depth and breadth first orders, are considered. For sketch-based queries, a method for matching trajectories drawn by users to spatial trajectories is proposed. The effectiveness and efficiency of our framework are tested in a crowded traffic scene.
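As a concrete illustration of the sketch-based querying mentioned in the abstract, a user-drawn trajectory has to be compared against the spatial trajectories stored in the database. This record does not specify the paper's matching procedure; the Python sketch below is a minimal, assumption-laden illustration in which trajectories are sequences of (x, y) image coordinates, both trajectories are resampled to a fixed length by arc length, and stored trajectories are ranked by mean point-wise Euclidean distance. The function names (resample, trajectory_distance, rank_by_sketch) are hypothetical and not taken from the paper.

import math

def resample(traj, n=32):
    """Resample a trajectory (>= 2 points) to n points by linear interpolation along arc length."""
    # Cumulative arc length at each original point.
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(traj, traj[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1] or 1.0
    out, j = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment containing the target arc length.
        while j < len(traj) - 2 and dists[j + 1] < target:
            j += 1
        seg = (dists[j + 1] - dists[j]) or 1.0
        t = (target - dists[j]) / seg
        (x0, y0), (x1, y1) = traj[j], traj[j + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def trajectory_distance(a, b, n=32):
    """Mean point-wise Euclidean distance between two resampled trajectories."""
    ra, rb = resample(a, n), resample(b, n)
    return sum(math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in zip(ra, rb)) / n

def rank_by_sketch(sketch, stored_trajectories, top_k=5):
    """Rank stored object trajectories (dict of id -> trajectory) by similarity to a user sketch."""
    scored = sorted(stored_trajectories.items(),
                    key=lambda kv: trajectory_distance(sketch, kv[1]))
    return scored[:top_k]

Resampling by arc length keeps the comparison independent of how many points the user's sketch contains or how fast it was drawn; the paper's actual method may additionally exploit the learned activity models and temporal information rather than raw point-wise distance.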
Pages: 1168-1181
Number of pages: 14