A Stochastic Model of Speech Incorporating Hierarchical Nonstationarity

Cited by: 21

Authors:

Deng, Li [1]

Affiliation:

[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1993 / Vol. 1 / No. 4

Funding:

Natural Sciences and Engineering Research Council of Canada

DOI:

10.1109/89.242494

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

The concept of two-level (global and local) hierarchical nonstationarity is introduced in this paper to describe the elastic and dynamic nature of the speech signal. A doubly stochastic process model is developed to implement this concept. In this model, the global nonstationarity is embodied through an underlying Markov chain that governs the evolution of the parameters in a set of output stochastic processes. The local nonstationarity is realized by utilizing state-conditioned, time-varying first- and second-order statistics in the output data-generation process models. For potential use in automatically uncovering relationally invariant properties of the speech signal and in speech recognition, the local nonstationarity is represented in a parametric form. Preliminary experiments on fitting the models to speech data demonstrate that the proposed model performs better than several traditional types of hidden Markov models.
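
The two-level structure described in the abstract can be illustrated with a toy sampler. This is only a sketch under stated assumptions, not the paper's formulation: the abstract does not give the parametric form, so scalar Gaussian outputs whose mean and standard deviation vary linearly with the time spent in the current state are assumed here, and the function name `sample_trended_hmm` and all parameter values are hypothetical.

```python
import random


def sample_trended_hmm(init, trans, mean_coef, std_coef, n_frames, seed=0):
    """Sample (states, observations) from a toy doubly stochastic model.

    Global nonstationarity: a Markov chain (init, trans) selects the state.
    Local nonstationarity: within a state, the Gaussian output's mean and
    standard deviation change linearly with tau, the time spent in that state.
    The linear form is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    state_ids = list(range(len(init)))
    state = rng.choices(state_ids, weights=init)[0]
    tau = 0  # frames already spent in the current state
    states, obs = [], []
    for _ in range(n_frames):
        b0, b1 = mean_coef[state]        # mean(tau) = b0 + b1 * tau
        s0, s1 = std_coef[state]         # std(tau)  = s0 + s1 * tau
        mean = b0 + b1 * tau
        std = max(s0 + s1 * tau, 1e-6)   # keep the deviation positive
        states.append(state)
        obs.append(rng.gauss(mean, std))
        nxt = rng.choices(state_ids, weights=trans[state])[0]
        tau = tau + 1 if nxt == state else 0  # reset the clock on a state change
        state = nxt
    return states, obs


# Hypothetical two-state, left-to-right example.
states, obs = sample_trended_hmm(
    init=[1.0, 0.0],
    trans=[[0.9, 0.1], [0.0, 1.0]],
    mean_coef=[(0.0, 0.5), (5.0, -0.2)],
    std_coef=[(1.0, 0.0), (0.5, 0.05)],
    n_frames=20,
)
```

Setting every slope (`b1`, `s1`) to zero reduces this sketch to an ordinary HMM with stationary Gaussian outputs per state, which is the kind of traditional model the abstract compares against.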

Pages: 471 / 474

Page count: 5

References (13 in total):

[1] Baum L. E., 1972, INEQUALITIES, V3, P1
[2] BOX GEP, 1976, TIME SERIES ANAL FOR, P67
[3] DEMPSTER AP, LAIRD NM, RUBIN DB. MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01): 1 - 38
[4] DENG D, 1990, COMPUTER SPEECH LANG, V4, P345
[5] DENG L. A GENERALIZED HIDDEN MARKOV MODEL WITH STATE-CONDITIONED TREND FUNCTIONS OF TIME FOR THE SPEECH SIGNAL [J]. SIGNAL PROCESSING, 1992, 27 (01): 65 - 78
[6] Deng L., 1991, Neural Networks for Signal Processing. Proceedings of the 1991 IEEE Workshop (Cat. No.91TH0385-5), P411, DOI 10.1109/NNSP.1991.239500
[7] DENG L, 1991, P ICASSP 91, P193
[8] Gupta V. N., 1987, Proceedings: ICASSP 87. 1987 International Conference on Acoustics, Speech, and Signal Processing (Cat. No.87CH2396-0), P697
[9] ISO K, 1990, P INT C AC SPEECH SI, P441
[10] KENNY P, LENNIG M, MERMELSTEIN P. A LINEAR PREDICTIVE HMM FOR VECTOR-VALUED OBSERVATIONS WITH APPLICATIONS TO SPEECH RECOGNITION [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1990, 38 (02): 220 - 225
