FREQUENCY-DOMAIN CODING OF SPEECH

Cited by: 79
Authors
TRIBOLET, JM [1]
CROCHIERE, RE [1]
Affiliation
[1] BELL TEL LABS INC,DEPT ACOUST RES,MURRAY HILL,NJ 07974
Source
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING | 1979 / Vol. 27 / No. 5
DOI
10.1109/TASSP.1979.1163283
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Frequency domain techniques for speech coding have recently received considerable attention. The basic concept of these methods is to divide the speech into frequency components by a filter bank (sub-band coding), or by a suitable transform (transform coding), and then encode them using adaptive PCM. Three basic factors are involved in the design of these coders: 1) the type of the filter bank or transform, 2) the choice of bit allocation and noise shaping properties involved in bit allocation, and 3) the control of the step-size of the encoders. This paper reviews the basic aspects of the design of these three factors for sub-band and transform coders. Concepts of short-time analysis/synthesis are first discussed and used to establish a basic theoretical framework. It is then shown how practical realizations of sub-band and transform coding are interpreted within this framework. Principles of spectral estimation and models of speech production and perception are then discussed and used to illustrate how the “side information” can be most efficiently represented and utilized in the design of the coder (particularly the adaptive transform coder) to control the dynamic bit allocation and quantizer step-sizes. Recent developments and examples of the “vocoder-driven” adaptive transform coder for low bit-rate applications are then presented. Copyright © 1979 by The Institute of Electrical and Electronics Engineers, Inc.
Pages: 512-530
Page count: 19