A Projection-Based Likelihood Measure for Speech Recognition in Noise

Cited by: 19
Authors
Carlson, Beth A. [1]
Clements, Mark A. [1]
Affiliations
[1] Georgia Inst Technol, Sch Elect Engn, Atlanta, GA 30332 USA
Source
IEEE Transactions on Speech and Audio Processing | 1994 / Vol. 2 / No. 1
DOI
10.1109/89.260341
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper investigates a projection-based likelihood measure that significantly improves automatic speech recognition performance in the presence of additive broadband noise. The measure was developed by modifying likelihood scores in continuous Gaussian density hidden Markov models (HMMs), resulting in the weighted projection measure (WPM). Experimental results using the proposed measure are reported for several performance factors: different cepstral-based parameters, normal and multistyle speech, and various noise signals, including white, jittering white, and broadband colored noise. In all cases, significant improvements in speaker-dependent, isolated word recognition were achieved using the WPM instead of the standard Gaussian likelihood measure, the weighted Euclidean distance (WED). As an example, at an SNR of 5 dB, the WPM improved recognition accuracy from 19.4% to 80.6% compared with the standard WED for the DFT mel-cepstral representation.
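The abstract does not state the formula for the WPM, so the following is only an illustrative sketch of the general idea, not the authors' definition. It assumes a diagonal-covariance Gaussian state, assumes that the main effect of additive noise is to shrink the norm of the cepstral vector, and assumes that the projection step rescales the state mean by the gain that minimizes the weighted distance. The function names and the toy vectors are hypothetical.

```python
def weighted_euclidean(x, mean, var):
    """Weighted Euclidean distance: sum of (x_i - m_i)^2 / var_i."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))


def weighted_projection(x, mean, var):
    """Assumed projection form: rescale the mean by the gain that
    minimizes the weighted distance, then measure that distance."""
    num = sum(xi * mi / vi for xi, mi, vi in zip(x, mean, var))
    den = sum(mi * mi / vi for mi, vi in zip(mean, var))
    gain = num / den if den > 0 else 1.0  # least-squares optimal scale
    scaled_mean = [gain * mi for mi in mean]
    return weighted_euclidean(x, scaled_mean, var), gain


# Toy data (hypothetical): a state mean and an observation whose
# norm has been halved, mimicking the assumed effect of noise.
mean = [2.0, -1.0, 0.5]
var = [1.0, 0.5, 0.25]
shrunk = [0.5 * m for m in mean]

wed = weighted_euclidean(shrunk, mean, var)
wpm, gain = weighted_projection(shrunk, mean, var)
print(f"WED = {wed:.2f}, projection distance = {wpm:.2f}, gain = {gain:.2f}")
```

In this toy case the plain weighted distance penalizes the shrunken vector even though its direction matches the mean exactly, while the rescaled distance does not. That is the intuition behind projection-style measures; the actual WPM in the paper may differ in detail.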
Pages: 97-102
Number of pages: 8