CONNECTED-DIGIT SPEAKER-DEPENDENT SPEECH RECOGNITION USING A NEURAL NETWORK WITH TIME-DELAYED CONNECTIONS

Cited by: 42
Authors
UNNIKRISHNAN, KP [1]
HOPFIELD, JJ [1]
TANK, DW [1]
Affiliation
[1] AT&T BELL LABS, MOLEC BIOPHYS RES DEPT, MURRAY HILL, NJ 07974
DOI
10.1109/78.80888
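Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract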
An analog neural network that can be taught to recognize stimulus sequences has been used to recognize the digits in connected speech. The circuit computes in the analog domain, using linear circuits for signal filtering and nonlinear circuits for simple decisions, feature extraction, and noise suppression. An analog perceptron learning rule is used to organize the subset of circuit connections that are specific to the chosen vocabulary. Computer simulations of the learning algorithm and circuit demonstrate recognition scores above 99% on a single-speaker connected-digit database. There is no clock; the circuit is data driven, and neither endpoint detection nor segmentation of the speech signal is needed during recognition. Training in the presence of noise provides noise immunity up to the trained level. For the speech problem studied here, the circuit connections need only be accurate to about 3-bit digitization depth for optimum performance. The algorithm maps efficiently onto analog neural network hardware: single-chip microelectronic circuits based on this algorithm can probably be built with current technology.
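As a rough illustration of what "about 3-bit digitization depth" means for connection strengths, the sketch below rounds a list of weights to 2^3 = 8 evenly spaced levels. It is a generic uniform quantizer written under stated assumptions, not the procedure used in the paper: the abstract does not describe the quantization scheme, and the function name `quantize_weights`, the symmetric range `[-w_max, w_max]`, and the sample weights are all hypothetical.

```python
def quantize_weights(weights, bits=3, w_max=None):
    """Round each weight to one of 2**bits evenly spaced levels.

    Assumption (not from the paper): levels span the symmetric range
    [-w_max, w_max]; weights outside the range are clipped to it.
    """
    if not weights:
        return []
    if w_max is None:
        # Default the range to the largest magnitude present (1.0 if all zero).
        w_max = max(abs(w) for w in weights) or 1.0
    levels = 2 ** bits
    step = 2 * w_max / (levels - 1)  # spacing between adjacent levels
    quantized = []
    for w in weights:
        clipped = max(-w_max, min(w_max, w))
        index = round((clipped + w_max) / step)  # nearest level index, 0..levels-1
        quantized.append(-w_max + index * step)
    return quantized


# Hypothetical connection strengths, quantized to 3 bits (8 levels).
example = quantize_weights([-1.0, -0.4, 0.1, 0.33, 0.9, 1.0], bits=3, w_max=1.0)
```

With 8 levels over `[-1, 1]` the spacing is 2/7, so each weight moves by at most 1/7 of the full-scale magnitude.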
Pages: 698-713
Page count: 16