NEGLECTED DIMENSIONS IN SPEECH SYNTHESIS

Cited by: 9

Authors:

GRANSTROM, B

NORD, L

Affiliation:

[1] Department of Speech Communication and Music Acoustics, Royal Institute of Technology, KTH, S-10044 Stockholm

Source:

SPEECH COMMUNICATION | 1992 / Vol. 11 / Issue 4-5

Keywords:

SPEECH SYNTHESIS; SPEAKING STYLE; TEXT-TO-SPEECH; INTENSITY; LOUDNESS

DOI:

10.1016/0167-6393(92)90051-8

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In traditional accounts of speech prosody, fundamental frequency (F0), duration and intensity have been described as the most important attributes. Among these, intensity has attracted the least attention. In perceptual studies both F0 and duration have had an indisputable role in signalling prosodic categories, but the role of intensity has been less clear. This has resulted in an emphasis on the former attributes in current speech synthesis schemes. In this study we explore the use of speech intensity and also other segmental correlates of prosody. Intensity has a dynamic aspect, discriminating emphasized and reduced stretches of speech. A more global aspect of intensity must be controlled when we try to model different speaking styles. Specifically, we have been trying to model the continuum from soft to loud speech.

Pages: 459-462

Page count: 4