Perceptual speech coding and enhancement using frame-synchronized fast wavelet packet transform algorithms

Cited by: 62

Authors:

Carnero, B [1]

Drygajlo, A

Affiliations:

[1] STMicroelect NV, Geneva, Switzerland

[2] Swiss Fed Inst Technol, Signal Proc Lab, CH-1015 Lausanne, Switzerland

Source:

IEEE TRANSACTIONS ON SIGNAL PROCESSING | 1999 / Vol. 47 / No. 6

DOI:

10.1109/78.765133

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper presents new wideband speech coding and integrated speech coding-enhancement systems based on frame-synchronized fast wavelet packet transform algorithms. It also formulates temporal and spectral psychoacoustic models of masking adapted to wavelet packet analysis. The algorithm of the proposed FFT-like overlapped block orthogonal wavelet packet transform permits us to efficiently approximate the auditory critical band decomposition in the time and frequency domains. This allows us to make use of the temporal and spectral masking properties of the human auditory system to decrease the average bit rate of the encoder while perceptually hiding the quantization error. The same wavelet packet representation is used to merge speech enhancement and coding in the context of auditory modeling. The advantage of the method presented in this paper over previous approaches is that perceptual enhancement and coding, which are usually implemented as a cascade of two separate systems, are combined. This leads to a decreased computational load. Experiments show that the proposed wideband coding procedure by itself can achieve transparent coding of speech signals sampled at 16 kHz at an average bit rate of 39.4 kbit/s. The combined speech coding-enhancement procedure achieves higher bit rate values that depend on the residual noise characteristics at the output of the enhancement process.

Pages: 1622-1635

Page count: 14
