Digit Speech Recognition Based on a Structural Speech Model

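Cited by: 3
Authors
Jiang Ying (姜莹)
Yu Yibiao (俞一彪)
Institution
[1] School of Electronic and Information Engineering, Soochow University
Keywords
structural speech model; digit recognition; hidden Markov model; speaker differences; Bhattacharyya distance
DOI
10.16208/j.issn1000-7024.2012.04.014
Chinese Library Classification number
TN912.34 [Speech recognition and equipment]
Discipline classification code
0711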
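Abstract
This paper proposes a new speech recognition method based on a structural speech model and applies it to speaker-independent digit speech recognition. For each digit utterance, cepstral features are computed first; a structural feature that is invariant to speaker differences, the acoustical universal structure (AUS), is then extracted and used to build a structural model. At recognition time, the AUS of the test utterance is extracted and matched against the structural model of each digit. Recognition performance with a small amount of training data was tested and compared with the conventional hidden Markov model (HMM) method. The results show that the proposed method can outperform HMM and that the structural speech model can effectively eliminate differences between speakers.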
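The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, under stated assumptions: each utterance is summarized by a short sequence of diagonal-covariance Gaussians over cepstral features, the structure is the set of pairwise Bhattacharyya distances between those Gaussians, and matching picks the digit whose stored structure is closest. The function names (`bhattacharyya`, `structure`, `structure_distance`, `affine`, `recognize`), the toy numbers, and the root-mean-square matching score are all hypothetical and are not taken from the paper. The sketch also simulates a speaker change as a per-dimension affine transform of the feature space, which leaves every pairwise Bhattacharyya distance unchanged.

```python
import math
from typing import Dict, List, Sequence, Tuple

# A diagonal-covariance Gaussian: (means, variances). Hypothetical representation.
Gaussian = Tuple[Sequence[float], Sequence[float]]


def bhattacharyya(g1: Gaussian, g2: Gaussian) -> float:
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    (m1, v1), (m2, v2) = g1, g2
    total = 0.0
    for a, va, b, vb in zip(m1, v1, m2, v2):
        v = 0.5 * (va + vb)  # averaged variance in this dimension
        total += (a - b) ** 2 / (8.0 * v) + 0.5 * math.log(v / math.sqrt(va * vb))
    return total


def structure(gaussians: Sequence[Gaussian]) -> List[float]:
    """Upper triangle of the pairwise distance matrix, flattened to a vector."""
    n = len(gaussians)
    return [bhattacharyya(gaussians[i], gaussians[j])
            for i in range(n) for j in range(i + 1, n)]


def structure_distance(s1: Sequence[float], s2: Sequence[float]) -> float:
    """Root-mean-square difference between two structure vectors (assumed score)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(s1, s2)) / len(s1))


def affine(gaussians: Sequence[Gaussian], scale: Sequence[float],
           shift: Sequence[float]) -> List[Gaussian]:
    """Apply x -> scale * x + shift per dimension to every Gaussian."""
    return [([a * m + b for m, a, b in zip(mu, scale, shift)],
             [a * a * v for v, a in zip(var, scale)])
            for mu, var in gaussians]


def recognize(test: Sequence[float], models: Dict[str, List[float]]) -> str:
    """Return the label whose stored structure is closest to the test structure."""
    return min(models, key=lambda label: structure_distance(test, models[label]))


# Toy data: two hypothetical "digits", three 2-D Gaussians each.
digit_a = [([0.0, 1.0], [1.0, 0.5]), ([2.0, -1.0], [0.8, 1.2]), ([4.0, 0.5], [1.5, 0.7])]
digit_b = [([0.0, 1.0], [1.0, 0.5]), ([0.5, 1.2], [0.9, 0.6]), ([1.0, 0.8], [1.1, 0.4])]
models = {"a": structure(digit_a), "b": structure(digit_b)}

# Simulate another speaker saying digit "a" as an affine change of feature space.
other_speaker = affine(digit_a, scale=[1.3, 0.7], shift=[0.5, -2.0])
predicted = recognize(structure(other_speaker), models)
print(predicted)
```

In this toy setting the transformed utterance keeps the same structure as the original, so it is matched to the same digit even though its absolute feature values have moved. A real system would also need to estimate the Gaussians from speech frames, which this sketch leaves out.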
Pages: 1482-1485, 1490
Page count: 5