ROBUST ESTIMATION OF SPEECH IN NOISY BACKGROUNDS BASED ON ASPECTS OF THE AUDITORY PROCESS

Cited by: 21
Authors
HANSEN, JHL
NANDKUMAR, S
Affiliation
[1] Robust Speech Processing Laboratory, Electrical Engineering, Duke University, Durham, North Carolina 27708-0291
DOI
10.1121/1.413108
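Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 [Acoustics]; 082403 [Underwater Acoustic Engineering];
Abstract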
A new approach to speech enhancement is proposed where constraints based on aspects of the auditory process augment an iterative enhancement framework. The basic enhancement framework is based on a previously developed dual-channel scenario using a two-step iterative Wiener filtering algorithm. Constraints across broad speech sections and over iterations are then experimentally developed on a novel auditory representation derived by transforming the speech magnitude spectrum. The spectral transformations are based on modeling aspects of the human auditory process which include critical band filtering, intensity-to-loudness conversion, and lateral inhibition. The auditory transformations and perceptual based constraints are shown to result in a new set of auditory constrained and enhanced linear prediction (ACE-LP) parameters. The ACE-LP based speech spectrum is then incorporated into the iterative Wiener filtering framework. The improvements due to auditory constraints are demonstrated in several areas. The proposed auditory representation is shown to result in improved spectral characterization in background noise. The auditory constrained iterative enhancement (ACE-II) algorithm is shown to result in improved quality over all sections of enhanced speech. Adaptation of auditory based constraints to changing spectral characteristics over broad classes of speech is another novel aspect of the proposed algorithm. The consistency of speech quality improvement for the ACE-II algorithm is illustrated over time and across all phonemes classified over a large set of phonetically balanced sentences from the TIMIT database. This study demonstrates the application of auditory based perceptual properties of a human listener to speech enhancement in noise, resulting in improved and consistent speech quality over all regions of speech. © 1995, Acoustical Society of America. All rights reserved.
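The three auditory transformations named in the abstract (critical band filtering, intensity-to-loudness conversion, and lateral inhibition) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only, since the abstract gives no formulas: the Zwicker-style Hz-to-Bark approximation, the choice of 18 bands, the cube-root loudness exponent, the neighbour-subtraction inhibition rule, and all function names are hypothetical and are not taken from the paper.

```python
import math

def hz_to_bark(f_hz):
    # Assumed Zwicker-style approximation; the abstract does not specify a formula.
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def critical_band_energies(mag_spectrum, sample_rate, n_bands=18):
    # Sum the power of a one-sided magnitude spectrum into Bark-spaced bands.
    # n_bands=18 is an illustrative choice, not a value from the paper.
    n_bins = len(mag_spectrum)
    nyquist = sample_rate / 2.0
    max_bark = hz_to_bark(nyquist)
    energies = [0.0] * n_bands
    for k, mag in enumerate(mag_spectrum):
        f = nyquist * k / (n_bins - 1)
        band = min(int(hz_to_bark(f) / max_bark * n_bands), n_bands - 1)
        energies[band] += mag * mag
    return energies

def intensity_to_loudness(energies, exponent=1.0 / 3.0):
    # Compressive power law; the cube-root exponent is an assumption.
    return [e ** exponent for e in energies]

def lateral_inhibition(loudness, weight=0.5):
    # Subtract a fraction of the mean of the two neighbouring bands, clipping
    # at zero. Edge bands reuse their own value for the missing neighbour.
    n = len(loudness)
    out = []
    for i, x in enumerate(loudness):
        left = loudness[i - 1] if i > 0 else x
        right = loudness[i + 1] if i < n - 1 else x
        out.append(max(x - weight * 0.5 * (left + right), 0.0))
    return out

def auditory_representation(mag_spectrum, sample_rate):
    # Chain the three stages in the order listed in the abstract.
    bands = critical_band_energies(mag_spectrum, sample_rate)
    return lateral_inhibition(intensity_to_loudness(bands))
```

In this sketch the inhibition step sharpens spectral peaks relative to their neighbours, which is one common reading of lateral inhibition. How the paper derives its ACE-LP parameters from such a representation, and how they enter the iterative Wiener filter, is not described in the abstract and is not modelled here.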
Pages: 3833-3849
Number of pages: 17
Related papers
39 items in total
[1] SPEECH ENHANCEMENT BASED CONCEPTUALLY ON AUDITORY EVIDENCE [J]. CHENG, YM; OSHAUGHNESSY, D. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1991, 39 (09): 1943-1954
[2] APPLICATION OF AN AUDITORY MODEL TO SPEECH RECOGNITION [J]. COHEN, JR. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1989, 85 (06): 2623-2629
[3] COMPARISON OF PARAMETRIC REPRESENTATIONS FOR MONOSYLLABIC WORD RECOGNITION IN CONTINUOUSLY SPOKEN SENTENCES [J]. DAVIS, SB; MERMELSTEIN, P. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1980, 28 (04): 357-366
[4] Ghitza O., 1986, Computer Speech and Language, V1, P109, DOI 10.1016/S0885-2308(86)80018-3
[5] GHITZA O, 1988, 1988 P IEEE INT C AC, P91
[6] GROCHOLEWSKI S, 1992, SIGNAL PROCESSING 4, V1, P299
[7] Hansen J. H. L., 1988, THESIS GEORGIA I TEC
[8] CONSTRAINED ITERATIVE SPEECH ENHANCEMENT WITH APPLICATION TO SPEECH RECOGNITION [J]. HANSEN, JHL; CLEMENTS, MA. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1991, 39 (04): 795-805
[9] HANSEN JHL, 1991, 1991 P IEEE INT C AC, P901
[10] HANSEN JHL, 1992, SIGNAL PROCESSING 4, V1, P515