Enhanced Itakura measure incorporating masking properties of human auditory system

Cited by: 20
Authors
Chen, G [1]
Koh, SN [1]
Soon, IY [1]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Commun Res Lab, Singapore 639798, Singapore
Keywords
speech distortion measure; masking properties; Itakura measure
DOI
10.1016/S0165-1684(03)00061-6
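Chinese Library Classification (CLC) number
TM [Electrical technology]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract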
This paper proposes an enhanced Itakura (E-Itakura) speech distortion measure that incorporates masking properties of the human auditory system into the original Itakura measure. Noise components that are masked by the speech signal, and are therefore inaudible, are excluded from the calculation of the E-Itakura measure, while the intrinsic advantage of the Itakura measure is retained. The proposed measure is compared with the original Itakura distortion, frequency-weighted Itakura spectral distortion, cepstral distance, and Bark spectral distortion measures. The results show that incorporating the enhancement raises the correlation between the Itakura measure and speech quality from 0.73 to 0.89, and that the E-Itakura measure gives a more consistent indication of the subjective quality of speech. © 2003 Elsevier Science B.V. All rights reserved.
Pages: 1445-1456
Page count: 12
Related papers
14 in total
[1]   MINIMUM PREDICTION RESIDUAL PRINCIPLE APPLIED TO SPEECH RECOGNITION [J].
ITAKURA, F .
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1975, AS23 (01) :67-72
[2]  
Itakura F, 1968, P 6 INT C AC TOK JAP, pc17
[4]   OBJECTIVE QUALITY EVALUATION FOR LOW-BIT-RATE SPEECH CODING SYSTEMS [J].
KITAWAKI, N ;
NAGABUCHI, H ;
ITOH, K .
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 1988, 6 (02) :242-248
[5]   Application of auditory masking in improved multiband excitation model [J].
Koh, SN ;
Chua, GH .
APPLIED ACOUSTICS, 2002, 63 (06) :693-698
[6]   A MODIFIED FREQUENCY-WEIGHTED ITAKURA SPECTRAL DISTORTION MEASURE [J].
LI, J ;
KRISHNAMURTHY, AK .
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1989, 37 (10) :1614-1617
[7]   COMPARATIVE-STUDY OF SEVERAL DISTORTION MEASURES FOR SPEECH RECOGNITION [J].
NOCERINO, N ;
SOONG, FK ;
RABINER, LR ;
KLATT, DH .
SPEECH COMMUNICATION, 1985, 4 (04) :317-331
[8]  
Quackenbush S., 1988, Objective Measures of Speech Quality
[9]   MODELS OF HEARING [J].
SCHROEDER, MR .
PROCEEDINGS OF THE IEEE, 1975, 63 (09) :1332-1350
[10]   OPTIMIZING DIGITAL SPEECH CODERS BY EXPLOITING MASKING PROPERTIES OF THE HUMAN EAR [J].
SCHROEDER, MR ;
ATAL, BS ;
HALL, JL .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1979, 66 (06) :1647-1652