Research on Adaptation Methods for Deep Neural Network Speech Recognition

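Cited by: 13
Authors
Deng Kan (邓侃)
Ou Zhijian (欧智坚)
Affiliation
[1] Department of Electronic Engineering, Tsinghua University
Keywords
speech recognition; acoustic model adaptation; deep neural network
DOI
Not available
Chinese Library Classification number
TN912.34 [Speech recognition and equipment]
Discipline classification code
0711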
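Abstract
To address speaker and environment adaptation of deep neural networks in speech recognition, this paper proposes an adaptation scheme based on long-term features, motivated by the inherent characteristics of the speaker and environment factors in speech signals. A joint speaker-environment compensation model is built on a Gaussian mixture model and used to estimate the speaker and environment parameters. These estimated parameters serve as long-term features and are fed into the deep neural network for training together with the short-term features. Experiments on Aurora4 show that the scheme effectively separates the speaker and environment factors and improves adaptation performance.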
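The step of feeding long-term and short-term features into the network together can be pictured as a simple concatenation. The sketch below is only an illustration under assumptions not stated in the abstract: it assumes one long-term vector per utterance, repeated for every frame, and the function name `append_long_term` and the dimensions (300 frames, 40 short-term values, 10 long-term values) are hypothetical. The estimation of the long-term vector by the joint compensation model is not shown.

```python
import numpy as np


def append_long_term(short_term: np.ndarray, long_term: np.ndarray) -> np.ndarray:
    """Attach one utterance-level vector to every frame of short-term features.

    short_term: (num_frames, short_dim) frame-level features.
    long_term:  (long_dim,) utterance-level speaker/environment estimate
                (assumed to be constant across the utterance).
    Returns an array of shape (num_frames, short_dim + long_dim).
    """
    if short_term.ndim != 2 or long_term.ndim != 1:
        raise ValueError("expected a 2-D frame matrix and a 1-D long-term vector")
    # Repeat the long-term vector once per frame without copying it.
    tiled = np.broadcast_to(long_term, (short_term.shape[0], long_term.shape[0]))
    # Concatenate along the feature axis to form the network input.
    return np.concatenate([short_term, tiled], axis=1)


# Hypothetical sizes, chosen only for illustration.
frames = np.zeros((300, 40))   # placeholder short-term features
utt_vec = np.ones(10)          # placeholder long-term feature vector
dnn_input = append_long_term(frames, utt_vec)
```

Each row of `dnn_input` then carries the frame's own short-term values followed by the same utterance-level values, which is one common way to present such auxiliary information to a network.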
Pages: 1966-1970
Page count: 5