Effect of vocal effort on spectral properties of vowels

Cited by: 103
Authors
Liénard, JS
Di Benedetto, MG
Affiliations
[1] LIMSI, CNRS, F-91403 Orsay, France
[2] Univ La Sapienza, Dipartimento INFOCOM, I-00184 Rome, Italy
DOI
10.1121/1.428140
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The effects on the spectral properties of vowels of variations in vocal effort corresponding to common conversation situations were investigated. A database was recorded in which three degrees of vocal effort were suggested to the speakers by varying the distance to their interlocutor in three steps (close, 0.4 m; normal, 1.5 m; far, 6 m). The speech materials consisted of isolated French vowels, uttered by ten naive speakers in a quiet furnished room. Manual measurements were carried out of the fundamental frequency F0, the frequencies and amplitudes of the first three formants (F1, F2, F3, A1, A2, and A3), and the total amplitude. The speech materials were perceptually validated in three respects: identity of the vowel, gender of the speaker, and vocal effort. Results indicated that the speech materials were appropriate for the study. Acoustic analysis showed that F0 and F1 were highly correlated with vocal effort and varied at rates close to 5 Hz/dB for F0 and 3.5 Hz/dB for F1. F2 and F3 did not vary in a statistically significant way with vocal effort. Formant amplitudes A1, A2, and A3 increased significantly; the amplitudes in the high-frequency range increased more than those in the lower part of the spectrum, revealing a change in spectral tilt. On average, when the overall amplitude is increased by 10 dB, A1, A2, and A3 are increased by 11, 12.4, and 13 dB, respectively. Using "auditory" dimensions, such as the F1-F0 difference and a "spectral center of gravity" between adjacent formants, to represent vowel features did not reveal a better constancy of these parameters with respect to variations in vocal effort and speaker. Thus, a global view is evoked, in which all aspects of the signal should be processed simultaneously. © 1999 Acoustical Society of America. [S0001-4966(99)02707-1].
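The average rates quoted in the abstract can be turned into a small arithmetic sketch. The Python below is only an illustration, not the authors' analysis: it assumes the reported rates behave as linear slopes against the change in overall amplitude (the abstract does not state which level the "per dB" refers to), and the names `predicted_shifts` and `tilt_change_db` are hypothetical.

```python
# Average rates quoted in the abstract. Treating them as linear slopes
# against the change in overall amplitude is an assumption of this sketch.
F0_HZ_PER_DB = 5.0   # "close to 5 Hz/dB"
F1_HZ_PER_DB = 3.5   # "3.5 Hz/dB"

# A1, A2, A3 rise by 11, 12.4, and 13 dB per 10 dB of overall amplitude.
FORMANT_AMP_DB_PER_DB = {"A1": 11.0 / 10.0, "A2": 12.4 / 10.0, "A3": 13.0 / 10.0}


def predicted_shifts(delta_level_db):
    """Return the average shifts implied by the quoted rates (hypothetical helper)."""
    shifts = {
        "F0_hz": F0_HZ_PER_DB * delta_level_db,
        "F1_hz": F1_HZ_PER_DB * delta_level_db,
    }
    for name, slope in FORMANT_AMP_DB_PER_DB.items():
        shifts[name + "_db"] = slope * delta_level_db
    # A3 rising faster than A1 is the change in spectral tilt the abstract describes.
    shifts["tilt_change_db"] = shifts["A3_db"] - shifts["A1_db"]
    return shifts


shifts = predicted_shifts(10.0)
for key, value in shifts.items():
    print(f"{key}: {value:.1f}")
```

Under these assumptions, a 10 dB increase in overall amplitude corresponds to an A3 gain about 2 dB larger than the A1 gain, which is the flattening of spectral tilt noted above. F2 and F3 are omitted because the abstract reports no significant variation for them.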
Pages: 411-422
Number of pages: 12