A Speech Recognition Model Based on Joint Modeling of Frame Features and Segment Features

Cited by: 3
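Authors
HAN Jiang (韩疆)
YIN Baolin (尹宝林)
Affiliation
[1] Department of Computer Science and Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100083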
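Keywords
feature vector; explicit modeling; feature composition; eigenvector; speech signal; inter-frame correlation; recognition performance; speech corpus; Mandarin finals; residual sequence; feature sequence; segmentation algorithm; trend function; joint modeling; conditional density function; speech recognition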
DOI
10.15949/j.cnki.0371-0025.2000.02.016
Chinese Library Classification (CLC) number
H017 [Experimental phonetics (instrumental phonetics)]
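Abstract
A speech recognition model based on joint modeling of frame features and segment features is proposed. The model uses segment features that describe spectral-parameter trajectories to model the inter-frame correlation of the speech signal explicitly at the segment scale. It uses a segment-feature-dependent non-stationary time-series generation model to capture the correlation between segment features and frame features, and at the frame scale it models inter-frame correlation implicitly through a parameterized mean-trajectory function. The paper presents a segmentation algorithm that optimizes a joint statistical distance over frame features and segment features, together with a model parameter estimation algorithm with embedded EM iterations. Recognition experiments on speaker-independent isolated Mandarin finals and on multi-speaker Mandarin base syllables show that the model outperforms both the standard HMM and the trended HMM.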
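The idea of a parameterized mean trajectory with a residual sequence, and of choosing segment boundaries by optimizing a distance, can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes scalar frame features, a linear trend function per segment, and plain residual sum of squares as a stand-in for the paper's joint statistical distance. The function names `fit_linear_trend` and `best_two_segment_split` are hypothetical.

```python
from typing import List, Tuple


def fit_linear_trend(frames: List[float]) -> Tuple[float, float, float]:
    """Least-squares fit of frames[t] ~ a + b*t.

    Returns (a, b, rss), where rss is the energy of the residual
    sequence left after removing the linear mean trajectory.
    """
    n = len(frames)
    mean_t = (n - 1) / 2.0
    mean_y = sum(frames) / n
    s_tt = sum((t - mean_t) ** 2 for t in range(n))
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(frames))
    b = s_ty / s_tt if s_tt > 0 else 0.0  # single-frame segment: flat trend
    a = mean_y - b * mean_t
    rss = sum((y - (a + b * t)) ** 2 for t, y in enumerate(frames))
    return a, b, rss


def best_two_segment_split(frames: List[float], min_len: int = 2) -> Tuple[int, float]:
    """Exhaustively search for the boundary k that minimises the summed
    residual energy of two linear-trend segments frames[:k] and frames[k:].

    Assumption: residual energy replaces the paper's joint frame/segment
    statistical distance, which the abstract does not specify.
    """
    best = None
    for k in range(min_len, len(frames) - min_len + 1):
        cost = fit_linear_trend(frames[:k])[2] + fit_linear_trend(frames[k:])[2]
        if best is None or cost < best[1]:
            best = (k, cost)
    if best is None:
        raise ValueError("sequence too short for two segments")
    return best


if __name__ == "__main__":
    # A rising trajectory followed by a falling one.
    track = [0.0, 1.0, 2.0, 3.0, 4.0, 10.0, 8.0, 6.0, 4.0, 2.0]
    boundary, cost = best_two_segment_split(track)
    print("boundary:", boundary, "cost:", cost)
```

A full system along the lines of the abstract would use vector-valued spectral features, several segments per unit, and a probabilistic score in place of the residual energy; the sketch only shows how a trend function and its residuals can drive a segmentation decision.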
Pages: 182-190
Page count: 9