Implementation of realtime STRAIGHT speech manipulation system: Report on its first implementation

Cited by: 30
Authors
Banno, Hideki [1]
Hata, Hiroaki [2]
Morise, Masanori [2]
Takahashi, Toru [2]
Irino, Toshio [2]
Kawahara, Hideki [2]
Affiliations
[1] Meijo Univ, Fac Sci & Technol, Tempaku Ku, 1-501 Shiogamaguchi, Nagoya, Aichi 4688502, Japan
[2] Wakayama Univ, Fac Syst Engn, Wakayama 6408510, Japan
Keywords
STRAIGHT speech manipulation system; Realtime; Pitch synchronous analysis; F0 extraction; Voice conversion
DOI
10.1250/ast.28.140
Chinese Library Classification
O42 [Acoustics]
Discipline classification code
070206 [Acoustics]; 082403 [Underwater Acoustic Engineering]
Abstract
A very high quality speech analysis, modification, and synthesis system, STRAIGHT, has now been implemented in C and runs in realtime. This article first provides a brief summary of the STRAIGHT components and then introduces the underlying principles that enable realtime operation. In STRAIGHT, the built-in extended pitch synchronous analysis, which does not require analysis window alignment, plays an important role in the realtime implementation. A detailed description of the processing steps, which are based on the so-called "just-in-time" architecture, is presented. Other issues related to realtime implementation and performance measures are also discussed. The software will be available to researchers upon request.
Pages: 140-146
Page count: 7