A Speech Recognition Model Based on Joint Modeling of Frame Features and Segment Features

Cited by: 3

Authors:

Han Jiang (韩疆)

Yin Baolin (尹宝林)

Affiliation:

[1] Department of Computer Science and Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100083

Source:

Acta Acustica (声学学报) | 2000, Issue 02

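Keywords:

feature vector; explicit modeling; feature composition; eigenvector; speech signal; inter-frame correlation; recognition performance; speech corpus; finals (韵母); residual sequence; feature sequence; segmentation algorithm; trend function; joint modeling; conditional density function; speech recognition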

DOI:

10.15949/j.cnki.0371-0025.2000.02.016

Chinese Library Classification (CLC) number:

H017 [Experimental phonetics (instrumental phonetics)]

Discipline classification code:

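Abstract:

A speech recognition model based on joint modeling of frame features and segment features is proposed. The model uses segment features that describe the trajectories of spectral parameters, which gives an explicit model of the inter-frame correlation of the speech signal at the segment scale. It also uses a segment-feature-dependent generative model of non-stationary time series, which models the correlation between segment features and frame features and, through a parameterized mean trajectory function, gives an implicit model of inter-frame correlation at the frame scale. The paper presents a segmentation algorithm that optimizes a joint statistical distance over frame features and segment features, together with a model parameter estimation algorithm with embedded EM iterations. Recognition experiments on speaker-independent isolated Mandarin finals and on multi-speaker Mandarin basic syllables show that the model outperforms both the standard HMM and the trend HMM.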
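The idea of a parameterized mean trajectory at the frame scale can be illustrated with a minimal sketch. It assumes a scalar feature track, a first-order (linear) trend, and a fixed Gaussian variance, in the general spirit of a trend HMM state. The abstract does not give the paper's actual trajectory function, segment features, or joint segmentation criterion, so the function names and the linear form below are hypothetical choices for illustration only.

```python
import math


def fit_linear_trend(frames):
    """Least-squares fit of mean(t) = b0 + b1 * t to a scalar feature track.

    Assumption: a linear trend; the paper's trajectory function may differ.
    """
    n = len(frames)
    if n < 2:
        raise ValueError("need at least two frames to fit a trend")
    t_mean = (n - 1) / 2
    x_mean = sum(frames) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(frames))
    b1 = sxy / sxx
    b0 = x_mean - b1 * t_mean
    return b0, b1


def residuals(frames, b0, b1):
    """Residual sequence: each frame minus the trend value at its time index."""
    return [x - (b0 + b1 * t) for t, x in enumerate(frames)]


def segment_log_likelihood(frames, b0, b1, var):
    """Gaussian log-likelihood of the frames around the trend (fixed variance)."""
    res = residuals(frames, b0, b1)
    n = len(frames)
    return -0.5 * n * math.log(2 * math.pi * var) - sum(r * r for r in res) / (2 * var)


# Hypothetical segment whose frames lie exactly on the line 1 + 2t.
segment = [1.0, 3.0, 5.0, 7.0]
b0, b1 = fit_linear_trend(segment)
res = residuals(segment, b0, b1)
loglik = segment_log_likelihood(segment, b0, b1, var=1.0)
```

Here the trend absorbs the systematic change across frames, so the residual sequence is what a stationary density would need to explain. A standard HMM state corresponds to the special case of a constant mean (`b1 = 0`).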


Pages: 182-190

Page count: 9
