Perceptual speech coding and enhancement using frame-synchronized fast wavelet packet transform algorithms

Cited by: 62
Authors
Carnero, B [1]
Drygajlo, A
Affiliations
[1] STMicroelect NV, Geneva, Switzerland
[2] Swiss Fed Inst Technol, Signal Proc Lab, CH-1015 Lausanne, Switzerland
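Keywords
DOI
10.1109/78.765133
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Subject classification code
0808; 0809;
Abstract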
This paper presents new wideband speech coding and integrated speech coding-enhancement systems based on frame-synchronized fast wavelet packet transform algorithms. It also formulates temporal and spectral psychoacoustic models of masking adapted to wavelet packet analysis. The algorithm of the proposed FFT-like overlapped block orthogonal wavelet packet transform permits us to efficiently approximate the auditory critical band decomposition in the time and frequency domains. This allows us to make use of the temporal and spectral masking properties of the human auditory system to decrease the average bit rate of the encoder while perceptually hiding the quantization error. The same wavelet packet representation is used to merge speech enhancement and coding in the context of auditory modeling. The advantage of the method presented in this paper over previous approaches is that perceptual enhancement and coding, which are usually implemented as a cascade of two separate systems, are combined. This leads to a decreased computational load. Experiments show that the proposed wideband coding procedure by itself can achieve transparent coding of speech signals sampled at 16 kHz at an average bit rate of 39.4 kbit/s. The combined speech coding-enhancement procedure achieves higher bit rate values that depend on the residual noise characteristics at the output of the enhancement process.
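As an illustration of how a wavelet packet tree can approximate a critical band decomposition, the following is a minimal sketch and not the paper's algorithm. It assumes a Haar filter pair as a stand-in for the authors' filters, a rule-of-thumb target bandwidth (about 100 Hz below 500 Hz, about 20% of the centre frequency above), and hypothetical names `haar_split`, `target_width_hz`, and `decompose`. Because Haar filters have poor frequency selectivity, the band edges are nominal.

```python
import math

def haar_split(x):
    """One orthonormal Haar analysis step: (low-pass, high-pass) half-length outputs."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high

def target_width_hz(center_hz):
    # Assumed rule of thumb for critical bandwidth, not taken from the paper.
    return max(100.0, 0.2 * center_hz)

def decompose(x, fs, level=0, index=0, max_level=6):
    """Return a list of (low_edge_hz, high_edge_hz, coefficients) in frequency order.

    A node is split only while its nominal bandwidth exceeds the target
    bandwidth at its centre frequency, which gives narrow bands at low
    frequencies and wide bands at high frequencies.
    """
    width = fs / 2.0 / (2 ** level)
    low_edge = index * width
    center = low_edge + width / 2.0
    if level == max_level or len(x) < 2 or width <= target_width_hz(center):
        return [(low_edge, low_edge + width, x)]
    low, high = haar_split(x)
    if index % 2 == 1:
        # Downsampling a high-pass band mirrors its spectrum, so the two
        # children swap roles to keep the output in frequency order.
        low, high = high, low
    return (decompose(low, fs, level + 1, 2 * index, max_level)
            + decompose(high, fs, level + 1, 2 * index + 1, max_level))

if __name__ == "__main__":
    frame = [math.sin(0.1 * n) for n in range(512)]  # one frame at 16 kHz
    for lo, hi, coeffs in decompose(frame, 16000):
        print(f"{lo:7.1f} - {hi:7.1f} Hz: {len(coeffs)} coefficients")
```

Since each split is orthonormal, the total coefficient count and the signal energy of a frame are preserved across the tree, which is the property a perceptual bit allocation stage would rely on.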
Pages: 1622-1635
Page count: 14