A scalable wavelet-based video distortion metric and applications

Cited by: 64
Authors
Masry, M [1]
Hemami, SS
Sermadevi, Y
Affiliations
[1] Cornell Univ, Sch Mech & Aerosp Engn, Ithaca, NY 14853 USA
[2] Cornell Univ, Engn Theory Ctr, Ithaca, NY 14853 USA
[3] Microsoft Corp, Digital Media Div, Redmond, WA 98052 USA
Keywords
human visual system (HVS); quality monitoring; reduced reference; scalable metric; time series; video coding; video quality assessment
DOI
10.1109/TCSVT.2005.861946
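Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract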
Video distortion metrics based on models of the human visual system have traditionally used comparisons between the distorted signal and a reference signal to calculate distortions objectively. In video coding applications, this is not prohibitive. In quality monitoring applications, however, access to the reference signal is often limited. This paper presents a computationally efficient video distortion metric that can operate in full- or reduced-reference mode as required. The metric is based on a model of the human visual system implemented using the wavelet transform and separable filters. The visual model is parameterized using a set of video frames and the associated quality scores. The visual model's hierarchical structure, as well as the limited impact of fine scale distortions on quality judgments of severely impaired video, are exploited to build a framework for scaling the bitrate required to represent the reference signal. Two applications of the metric are also presented. In the first, the metric is used as the distortion measure in a rate-distortion optimized rate control algorithm for MPEG-2 video compression. The resulting compressed video sequences demonstrate significant improvements in visual quality over compressed sequences with allocations determined by the TM5 rate control algorithm operating with MPEG-2 at the same rate. In the second, the metric is used to estimate time series of objective quality scores for distorted video sequences using reference bitrates as low as 10 kb/s. The resulting quality scores more accurately model subjective quality recordings than do those estimated using the mean squared error as a distortion metric, while requiring a fraction of the bitrate used to represent the reference signal. The reduced-reference metric's performance is comparable to that of the full-reference metrics tested in the first Video Quality Experts Group evaluation.
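The idea of scaling the reference through a hierarchical wavelet representation can be illustrated with a minimal sketch. This is not the paper's metric: it assumes a plain orthonormal Haar transform, applies no human-visual-system weighting, and all function names (`haar2d`, `decompose`, `subband_distortion`) and the `keep_levels` parameter are hypothetical. It only shows how dropping fine-scale subbands reduces the amount of reference data needed for a comparison.

```python
import numpy as np

def haar2d(frame):
    """One level of an orthonormal 2-D Haar transform.

    Returns the approximation band and a tuple of three detail bands.
    Frame height and width are assumed to be even.
    """
    s = np.sqrt(2.0)
    lo = (frame[0::2, :] + frame[1::2, :]) / s   # average of row pairs
    hi = (frame[0::2, :] - frame[1::2, :]) / s   # difference of row pairs
    approx = (lo[:, 0::2] + lo[:, 1::2]) / s
    d1 = (lo[:, 0::2] - lo[:, 1::2]) / s
    d2 = (hi[:, 0::2] + hi[:, 1::2]) / s
    d3 = (hi[:, 0::2] - hi[:, 1::2]) / s
    return approx, (d1, d2, d3)

def decompose(frame, levels):
    """Multi-level decomposition; detail bands are returned coarsest first."""
    approx = np.asarray(frame, dtype=np.float64)
    details = []
    for _ in range(levels):
        approx, d = haar2d(approx)
        details.append(d)
    return approx, details[::-1]

def subband_distortion(ref, dist, levels=3, keep_levels=None):
    """Squared coefficient error per pixel (assumed, unweighted measure).

    keep_levels is the number of coarsest detail levels compared; finer
    levels are ignored, which stands in for a reduced-reference mode where
    only part of the reference representation is available.
    """
    if keep_levels is None:
        keep_levels = levels
    ref_a, ref_d = decompose(ref, levels)
    dis_a, dis_d = decompose(dist, levels)
    total = np.sum((ref_a - dis_a) ** 2)
    for ref_bands, dis_bands in zip(ref_d[:keep_levels], dis_d[:keep_levels]):
        for r, d in zip(ref_bands, dis_bands):
            total += np.sum((r - d) ** 2)
    return total / np.asarray(ref).size
```

Because the Haar transform used here is orthonormal, comparing all levels reproduces the mean squared error exactly, while a smaller `keep_levels` gives a value no larger than that, computed from fewer reference coefficients. A perceptual metric like the one described in the abstract would additionally weight and combine the subbands according to a fitted visual model.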
Pages: 260-273
Page count: 14