PURR - A method for prosody evaluation and investigation

Cited by: 25
Authors
Sonntag, GP [1]
Portele, T [1]
Affiliations
[1] Univ Bonn, Inst Kommunikat Forsch & Phonet, D-53115 Bonn, Germany
Keywords
DOI
10.1006/csla.1998.0107
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Since intelligibility of synthetic speech is no longer the main criterion on which to base quality judgements, reliable methods for prosody evaluation become more important. We propose a method called PURR (Prosody Unveiling through Restricted Representation) to evaluate the prosodic component of a synthesis system without the interference of other system components. In PURR, the stimuli are reduced to their prosodic content. The method has proven to be suitable for test designs with naive listeners. It can be used for comparative studies as well as for diagnostic analyses and is, therefore, a useful tool for basic research on the perception of prosodic phenomena. In this paper we first describe how the best signal manipulation method was determined using perception tests. The appropriateness of the resulting signal is further assessed in a recognition test of syntactic structure. We then report on further validations of the proposed method: different ways of synthetic prosody modelling are evaluated, both in comparison with human prosody and amongst different synthesis systems. (C) 1998 Academic Press.
Pages: 437-451
Number of pages: 15