A Stochastic Model of Speech Incorporating Hierarchical Nonstationarity

Cited by: 21
Authors
Deng, Li [1]
Affiliation
[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1993 / Vol. 1 / No. 04
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
DOI
10.1109/89.242494
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
The concept of two-level (global and local) hierarchical nonstationarity is introduced in this paper to describe the elastic and dynamic nature of the speech signal. A doubly stochastic process model is developed to implement this concept. In this model, the global nonstationarity is embodied through an underlying Markov chain that governs the evolution of the parameters in a set of output stochastic processes. The local nonstationarity is realized by utilizing state-conditioned, time-varying first- and second-order statistics in the output data-generation process models. For potential uses in automatically uncovering relationally invariant properties from the speech signal and in speech recognition, the local nonstationarity is represented in a parametric form. Preliminary experiments on fitting the models to speech data demonstrate superior performance of the proposed model over several traditional types of hidden Markov models.
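As a rough illustration of the two-level structure described in the abstract, the sketch below samples a scalar sequence from a toy model in which a Markov chain selects the active state (global nonstationarity) and, within each state, the mean and variance of a Gaussian output change with the time spent in that state (local nonstationarity). The polynomial mean, the log-linear variance, the function name `sample_trended_hmm`, and all parameter values are assumptions made for illustration; the record does not give the paper's actual parametric form.

```python
import numpy as np
from numpy.polynomial.polynomial import polyval


def sample_trended_hmm(trans, mean_coefs, var_coefs, n_steps, seed=0):
    """Draw one sequence from a toy two-level nonstationary model.

    trans      : (S, S) Markov transition matrix.
    mean_coefs : (S, P) polynomial coefficients of the state-conditioned
                 mean, lowest order first, in the time spent in the state
                 (assumed form).
    var_coefs  : (S, 2) intercept and slope of the state-conditioned
                 log-variance (assumed form).
    """
    rng = np.random.default_rng(seed)
    n_states = trans.shape[0]
    states = np.empty(n_steps, dtype=int)
    obs = np.empty(n_steps)
    state, dwell = 0, 0  # assumption: the chain starts in state 0
    for t in range(n_steps):
        # Local nonstationarity: first- and second-order statistics
        # vary with the dwell time inside the current state.
        mean = polyval(dwell, mean_coefs[state])
        var = np.exp(var_coefs[state, 0] + var_coefs[state, 1] * dwell)
        states[t] = state
        obs[t] = rng.normal(mean, np.sqrt(var))
        # Global nonstationarity: the Markov chain picks the next state.
        nxt = rng.choice(n_states, p=trans[state])
        dwell = dwell + 1 if nxt == state else 0
        state = nxt
    return states, obs


# Hypothetical three-state, left-to-right example.
trans = np.array([[0.9, 0.1, 0.0],
                  [0.0, 0.9, 0.1],
                  [0.0, 0.0, 1.0]])
mean_coefs = np.array([[0.0, 0.5], [3.0, -0.2], [1.0, 0.0]])
var_coefs = np.array([[-2.0, 0.0], [-2.0, 0.05], [-2.0, 0.0]])
states, obs = sample_trended_hmm(trans, mean_coefs, var_coefs, n_steps=50)
```

Setting every higher-order mean coefficient and every variance slope to zero removes the within-state time dependence, which leaves an ordinary Gaussian-output hidden Markov model with stationary states.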
Pages: 471 - 474
Number of pages: 5