Integrating bottom-up and top-down visual stimulus for saliency detection in news video

Cited by: 11
Authors
Wu, Bo [1, 2]
Xu, Linfeng [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Elect Engn, Chengdu 610073, Peoples R China
[2] Henan Normal Univ, Coll Phys & Informat Engn, Xinxiang 453007, Peoples R China
Keywords
Visual saliency; Bottom-up attention; Top-down attention; News video; Face segmentation; Attention; Model
DOI
10.1007/s11042-013-1530-9
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a new attention model for detecting visual saliency in news video. The model computes bottom-up saliency from low-level features and top-down saliency from high-level factors, then fuses the two saliency maps after normalization. In the bottom-up attention model, the quaternion discrete cosine transform is applied at multiple scales and in multiple color spaces to detect static saliency. Meanwhile, multi-scale local-motion and global-motion conspicuity maps are computed and integrated into a motion saliency map. To suppress background motion noise effectively, a simple histogram of average optical flow is used to calculate motion contrast. The bottom-up saliency map is then obtained by combining the static and motion saliency maps. In the top-down attention model, high-level stimuli in news video, such as faces, persons, cars, speakers, and flashes, are used to generate the top-down saliency map. The method was tested extensively with three popular evaluation metrics on two widely used eye-tracking datasets. Experimental results demonstrate its effectiveness for saliency detection in news video compared with several state-of-the-art methods.
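The fusion step described in the abstract (normalize the bottom-up and top-down maps, then combine them) could be sketched as follows. This is only an illustration: the abstract does not say which normalization or fusion rule the authors use, so the min-max normalization, the convex combination, and the names `normalize_map`, `fuse_saliency`, and `weight` are all assumptions made here.

```python
from typing import List

Map = List[List[float]]  # a saliency map as rows of pixel values


def normalize_map(saliency: Map) -> Map:
    """Rescale a map to [0, 1] with min-max normalization (an assumed choice)."""
    values = [v for row in saliency for v in row]
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant map carries no contrast, so return all zeros.
        return [[0.0 for _ in row] for row in saliency]
    return [[(v - lo) / span for v in row] for row in saliency]


def fuse_saliency(bottom_up: Map, top_down: Map, weight: float = 0.5) -> Map:
    """Fuse two maps as weight * bottom_up + (1 - weight) * top_down.

    The convex combination is a hypothetical fusion rule, not the paper's.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    bu = normalize_map(bottom_up)
    td = normalize_map(top_down)
    if len(bu) != len(td) or any(len(a) != len(b) for a, b in zip(bu, td)):
        raise ValueError("maps must have the same shape")
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_bu, row_td)]
        for row_bu, row_td in zip(bu, td)
    ]
```

Normalizing before fusing keeps either map from dominating merely because its raw values span a larger range; the `weight` parameter then makes the balance between the two cues explicit.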
Pages: 1053-1075
Number of pages: 23