Integrating bottom-up and top-down visual stimulus for saliency detection in news video

Cited by: 11

Authors:

Wu, Bo [1,2]

Xu, Linfeng [1]

Affiliations:

[1] Univ Elect Sci & Technol China, Sch Elect Engn, Chengdu 610073, Peoples R China

[2] Henan Normal Univ, Coll Phys & Informat Engn, Xinxiang 453007, Peoples R China

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2014 / Vol. 73 / No. 3

Keywords:

Visual saliency; Bottom-up attention; Top-down attention; News video; FACE SEGMENTATION; ATTENTION; MODEL;

DOI:

10.1007/s11042-013-1530-9

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper presents a new attention model for detecting visual saliency in news video. In the proposed model, bottom-up (low-level) features and top-down (high-level) factors are used to compute bottom-up saliency and top-down saliency, respectively. The two saliency maps are then fused after a normalization operation. In the bottom-up attention model, we use the quaternion discrete cosine transform at multiple scales and in multiple color spaces to detect static saliency. Meanwhile, multi-scale local motion and global motion conspicuity maps are computed and integrated into a motion saliency map. To effectively suppress background motion noise, a simple histogram of average optical flow is adopted to calculate motion contrast. The bottom-up saliency map is then obtained by combining the static and motion saliency maps. In the top-down attention model, we utilize high-level stimuli in news video, such as face, person, car, speaker, and flash, to generate the top-down saliency map. The proposed method has been extensively tested using three popular evaluation metrics on two widely used eye-tracking datasets. Experimental results demonstrate the effectiveness of our method for saliency detection in news videos compared with several state-of-the-art methods.
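
The abstract states that the bottom-up and top-down saliency maps are fused after a normalization operation, but it does not say which normalization or fusion rule is used. The following is only a minimal sketch of that step under stated assumptions: min-max normalization to [0, 1] and a weighted average with a hypothetical `weight` parameter. The function names `normalize_map` and `fuse_maps` are illustrative and do not come from the paper.

```python
def normalize_map(saliency):
    """Min-max rescale a 2-D saliency map (list of rows) to [0, 1].

    Assumption: the paper's normalization operator is not specified in
    the abstract; min-max scaling is used here purely for illustration.
    A constant map is mapped to all zeros.
    """
    flat = [value for row in saliency for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in saliency]
    return [[(value - lo) / (hi - lo) for value in row] for row in saliency]


def fuse_maps(bottom_up, top_down, weight=0.5):
    """Fuse two saliency maps of equal shape by a weighted average.

    Assumption: the weighted average and the `weight` parameter are
    hypothetical; the abstract only says the maps are fused after
    normalization.
    """
    bu = normalize_map(bottom_up)
    td = normalize_map(top_down)
    return [
        [weight * b + (1.0 - weight) * t for b, t in zip(row_bu, row_td)]
        for row_bu, row_td in zip(bu, td)
    ]


# Toy 2x2 maps on different scales; normalization makes them comparable.
bottom_up = [[0.0, 2.0], [4.0, 8.0]]
top_down = [[10.0, 10.0], [10.0, 30.0]]
fused = fuse_maps(bottom_up, top_down)
```

Normalizing first keeps either map from dominating the result simply because its raw values span a larger range, which is presumably why the abstract mentions normalization before fusion.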

Citation

Pages: 1053-1075

Page count: 23

