Integrated Models of Signal and Background with Application to Speaker Identification in Noise

Cited by: 97
Authors
Rose, R. C. [1]
Hofstetter, E. M. [2]
Reynolds, D. A. [2]
Affiliations
[1] Bell Labs, Speech Res Dept, Murray Hill, NJ 07974 USA
[2] MIT, Lexington, MA 02173 USA
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1994 / Vol. 2 / No. 2
Keywords
External noise sources - Maximum likelihood - Noise compensation - Robust parametric model estimation - Signal model - Speaker classification performance - Speaker identification;
DOI
10.1109/89.279273
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper is concerned with the problem of robust parametric model estimation and classification in noisy acoustic environments. Characterization and modeling of the external noise sources in these environments is in itself an important issue in noise compensation. The techniques described here provide a mechanism for integrating parametric models of acoustic background with the signal model so that noise compensation is tightly coupled with signal model training and classification. Prior information about the acoustic background process is provided using a maximum likelihood parameter estimation procedure that integrates an a priori model of acoustic background with the signal model. An experimental study is presented in the paper on the application of this approach to text-independent speaker identification in noisy acoustic environments. Considerable improvement in speaker classification performance was obtained for classifying unlabeled sections of conversational speech utterances from a 16-speaker population under cross-environment training and testing conditions.
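The idea of coupling a background model with the signal model at classification time can be illustrated with a deliberately simplified sketch. It is not the paper's formulation: it assumes one scalar feature per frame, a single Gaussian per speaker, a known Gaussian background, and an independent additive combination of the two, so that the noisy observation is Gaussian with summed means and variances. The speaker names, parameter values, and function names are all hypothetical.

```python
import math

def gaussian_loglik(x, mean, var):
    """Log-density of a scalar Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def combined_loglik(frames, speaker, background):
    """Log-likelihood of noisy frames under a combined signal + background model.

    Assumption (illustrative only): y = s + n with s and n independent
    Gaussians, so y ~ N(mu_s + mu_n, var_s + var_n).
    """
    mean = speaker[0] + background[0]
    var = speaker[1] + background[1]
    return sum(gaussian_loglik(y, mean, var) for y in frames)

def identify(frames, speakers, background):
    """Return the speaker whose combined model gives the highest likelihood."""
    return max(speakers, key=lambda name: combined_loglik(frames, speakers[name], background))

# Hypothetical clean-speech models as (mean, variance) pairs.
speakers = {"A": (0.0, 1.0), "B": (3.0, 1.0)}
# Hypothetical background model for the test environment.
background = (2.0, 0.5)
# Frames from speaker A observed in that background (clean mean shifted by 2.0).
frames = [1.8, 2.3, 2.1, 1.6]

with_background = identify(frames, speakers, background)
# Ignoring the background is the same as a zero-mean, zero-variance noise model.
without_background = identify(frames, speakers, (0.0, 0.0))
print(with_background, without_background)
```

In this toy setting the clean models alone assign the noisy frames to the wrong speaker, while the combined models recover the correct one, which is the kind of cross-environment mismatch the abstract refers to.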
Pages: 245-257
Page count: 13