Statistical parametric speech synthesis

Cited by: 914
Authors
Zen, Heiga [1, 2]
Tokuda, Keiichi [1]
Black, Alan W. [3]
Affiliations
[1] Nagoya Inst Technol, Dept Comp Sci & Engn, Showa Ku, Nagoya, Aichi 4668555, Japan
[2] Toshiba Res Europe Ltd, Cambridge Res Lab, Cambridge CB4 0GZ, England
[3] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Funding
US National Science Foundation
Keywords
Speech synthesis; Unit selection; Hidden Markov models; Maximum-likelihood; Speaker adaptation; Covariance matrices; Voice conversion; Synthesis system; Markov-models; HMM; Generation; Interpolation; Contours
DOI
10.1016/j.specom.2009.04.004
Chinese Library Classification
O42 [Acoustics]
Discipline classification code
070206 [Acoustics]
Abstract
This review gives a general overview of techniques used in statistical parametric speech synthesis. One instance of these techniques, called hidden Markov model (HMM)-based speech synthesis, has recently been demonstrated to be very effective in synthesizing acceptable speech. This review also contrasts these techniques with the more conventional technique of unit-selection synthesis that has dominated speech synthesis over the last decade. The advantages and drawbacks of statistical parametric synthesis are highlighted and we identify where we expect key developments to appear in the immediate future. (C) 2009 Elsevier B.V. All rights reserved.
Pages: 1039-1064
Page count: 26