PHONE RECOGNITION USING RESTRICTED BOLTZMANN MACHINES

Cited by: 37
Authors
Mohamed, Abdel-rahman [1]
Hinton, Geoffrey [1]
Affiliation
[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 1A1, Canada
Source
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010
Keywords
phone recognition; restricted Boltzmann machines; distributed representations; NETS
DOI
10.1109/ICASSP.2010.5495651
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
For decades, Hidden Markov Models (HMMs) have been the state-of-the-art technique for acoustic modeling despite their unrealistic independence assumptions and the very limited representational capacity of their hidden states. Conditional Restricted Boltzmann Machines (CRBMs) have recently proved to be very effective for modeling motion capture sequences and this paper investigates the application of this more powerful type of generative model to acoustic modeling. On the standard TIMIT corpus, one type of CRBM outperforms HMMs and is comparable with the best other methods, achieving a phone error rate (PER) of 26.7% on the TIMIT core test set.
Pages: 4354-4357
Page count: 4