Transform Representation of the Spectra of Acoustic Speech Segments with Applications-I: General Approach and Application to Speech Recognition

Cited by: 13

Authors:

Algazi, V. Ralph [1, 2]

Brown, Kathy L. [1]

Ready, Michael J.

Irvine, David H. [1]

Cadwell, Christie L.

Chung, Sang

Affiliations:

[1] Univ Calif Davis, Ctr Image Proc & Integrated Computing, Speech Res Lab, Davis, CA 95616 USA

[2] Univ Calif Davis, Dept Elect Engn & Comp Sci, Davis, CA 95616 USA

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1993 / Vol. 1 / No. 2

DOI:

10.1109/89.222877

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

We present in this series of two papers a new approach for modeling and capturing the time-varying structure of the spectral envelope of speech. In this approach, we use an acoustic subword decomposition and the Karhunen-Loeve transform (KLT) to extract and efficiently represent the highly correlated structure of the spectral envelope. Integration of the KLT with acoustic subword modeling is a novel approach that concisely represents both steady-state and dynamic features of the spectra in a unified framework that very effectively captures acoustic-phonetic patterns. The organization of these two papers is as follows: the first paper, Part I, presents the physiological and perceptual basis for the approach, the frame-based and acoustic-subword-based spectral representation, and applications to speaker-dependent recognition. The performance of the recognition algorithm based on this approach compares favorably to other existing techniques. Part II will present a frequency-domain coding technique by analysis/synthesis. This application of the new method produces good quality speech at low bit rates.

Pages: 180-195

Page count: 16

