Estimation of speech components by ACF analysis in a noisy environment

Cited by: 10
Authors
Kazama, M [1]
Tohyama, M [1]
Affiliations
[1] Kogakuin Univ, Hachioji, Tokyo 1920015, Japan
DOI
10.1006/jsvi.2000.3275
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403
Abstract
A speech signal can be decomposed into a fundamental frequency and its harmonics, and the autocorrelation function (ACF) is an effective tool for identifying the fundamental frequency and the harmonics. This paper therefore explains how ACF harmonic analysis can be applied to speech detection and reconstruction when speech communication technologies are used in noisy environments. The dominant sinusoidal components used for the ACF analysis can be picked out from the short-time Fourier spectrum records of a noisy speech signal by a peak-picking method. Because the number of components usable for speech reconstruction depends on the signal-to-noise (S/N) ratio, we developed new methods for peak picking and for harmonic sieving. The number of components picked out is adjusted frame by frame depending on the short-time S/N ratio, and harmonics are extracted from the short-time Fourier spectrum record by changing the frame length adaptively according to the fundamental frequency. Consequently, intelligible speech without "musical noise" could be reconstructed from noisy speech signals. © 2001 Academic Press.
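As a rough illustration of how an ACF can reveal a fundamental frequency, the sketch below estimates the pitch of one short frame from the lag of the largest autocorrelation value. It is not the authors' method: the function names, the 60 to 400 Hz search range, the 8 kHz sampling rate, and the clean synthetic three-harmonic test frame are all assumptions chosen for the example, and the noise handling, peak picking, and harmonic sieving described in the abstract are not reproduced.

```python
import math


def autocorrelation(frame, max_lag):
    """Return the unnormalized ACF r[k] = sum_i x[i] * x[i + k] for k = 0..max_lag."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + k] for i in range(n - k))
        for k in range(max_lag + 1)
    ]


def estimate_f0(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) from the strongest ACF peak.

    The search range f_min..f_max is an assumed pitch range, not a value
    taken from the paper.
    """
    min_lag = int(sample_rate / f_max)  # shortest period considered
    max_lag = int(sample_rate / f_min)  # longest period considered
    acf = autocorrelation(frame, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda k: acf[k])
    return sample_rate / best_lag


# Hypothetical test frame: a 200 Hz fundamental plus two weaker harmonics.
sample_rate = 8000
true_f0 = 200.0
frame = [
    sum(math.sin(2 * math.pi * h * true_f0 * n / sample_rate) / h for h in (1, 2, 3))
    for n in range(400)
]

estimated = estimate_f0(frame, sample_rate)
print(f"estimated fundamental: {estimated:.1f} Hz")
```

Because the ACF is summed over fewer samples at longer lags, the peak at one period outweighs the peaks at two and three periods, which is why the simple maximum is enough for this clean frame. A noisy frame would generally need more care.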
Pages: 41-52
Number of pages: 12