Perceptual speech coding and enhancement using frame-synchronized fast wavelet packet transform algorithms

Cited by: 62
Authors
Carnero, B [1]
Drygajlo, A
Affiliations
[1] STMicroelect NV, Geneva, Switzerland
[2] Swiss Fed Inst Technol, Signal Proc Lab, CH-1015 Lausanne, Switzerland
Keywords
DOI
10.1109/78.765133
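Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract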
This paper presents new wideband speech coding and integrated speech coding-enhancement systems based on frame-synchronized fast wavelet packet transform algorithms. It also formulates temporal and spectral psychoacoustic models of masking adapted to wavelet packet analysis. The algorithm of the proposed FFT-like overlapped block orthogonal wavelet packet transform permits us to efficiently approximate the auditory critical band decomposition in the time and frequency domains. This allows us to make use of the temporal and spectral masking properties of the human auditory system to decrease the average bit rate of the encoder while perceptually hiding the quantization error. The same wavelet packet representation is used to merge speech enhancement and coding in the context of auditory modeling. The advantage of the method presented in this paper over previous approaches is that perceptual enhancement and coding, which are usually implemented as a cascade of two separate systems, are combined. This leads to a decreased computational load. Experiments show that the proposed wideband coding procedure by itself can achieve transparent coding of speech signals sampled at 16 kHz at an average bit rate of 39.4 kbit/s. The combined speech coding-enhancement procedure achieves higher bit rate values that depend on the residual noise characteristics at the output of the enhancement process.
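As a rough illustration of the non-uniform subband split that the abstract describes, the sketch below decomposes one frame with an orthonormal wavelet packet tree that divides the lower half of the spectrum more finely than the upper half. Everything in it is an assumption made for illustration: the Haar filter stands in for whatever filters the paper uses, the 32-sample frame and the tree in `PLAN` are hypothetical, and the overlapped, frame-synchronized processing is not modeled. It also works out the bits per sample implied by the reported 39.4 kbit/s at 16 kHz.

```python
import math

def haar_split(x):
    """One orthonormal Haar analysis step: (lowpass, highpass) halves of x."""
    s = 1.0 / math.sqrt(2.0)
    low = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def wp_decompose(x, plan):
    """Wavelet packet decomposition driven by a tree plan.

    plan is None for a leaf, or a (low_plan, high_plan) pair for a split.
    Leaves are returned in tree order, which is not strictly frequency
    order inside branches that descend from a highpass output.
    """
    if plan is None:
        return [x]
    low, high = haar_split(x)
    return wp_decompose(low, plan[0]) + wp_decompose(high, plan[1])

def energy(v):
    return sum(c * c for c in v)

# Hypothetical tree (not the paper's): the low half is split two more
# levels, the high half only once, giving narrower bands at low frequencies.
PLAN = (((None, None), None), (None, None))

# Hypothetical 32-sample test frame.
frame = [math.sin(0.3 * n) + 0.5 * math.sin(2.1 * n) for n in range(32)]

bands = wp_decompose(frame, PLAN)
band_sizes = [len(b) for b in bands]
total_energy = sum(energy(b) for b in bands)

# Average bits per sample implied by the figures quoted in the abstract.
bits_per_sample = 39.4e3 / 16e3

print(band_sizes, total_energy, energy(frame), bits_per_sample)
```

Because each split is orthonormal, the subband energies sum to the frame energy and the number of coefficients equals the frame length, which is the property a perceptual coder relies on when it allocates bits per band.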
Pages: 1622-1635
Number of pages: 14