Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds

Cited by: 1326
Authors
Kawahara, H [1]
Masuda-Katsuse, I [1]
de Cheveigné, A [1]
Affiliations
[1] ATR, Human Informat Proc Res Labs, Kyoto 61902, Japan
Keywords
speech analysis; pitch-synchronous; spline smoothing; instantaneous frequency; F0 extraction; speech synthesis; speech modification
DOI
10.1016/S0167-6393(98)00085-5
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
A set of simple new procedures has been developed to enable the real-time manipulation of speech parameters. The proposed method uses pitch-adaptive spectral analysis combined with a surface reconstruction method in the time-frequency region. The method also includes fundamental frequency (F0) extraction using an instantaneous frequency calculation based on a new concept called 'fundamentalness'. The proposed procedures preserve the details of time-frequency surfaces while almost perfectly removing fine structures due to signal periodicity. This close-to-perfect elimination of interferences and the smooth F0 trajectory allow for over 600% manipulation of such speech parameters as pitch, vocal tract length, and speaking rate, while maintaining high reproductive quality. (C) 1999 Elsevier Science B.V. All rights reserved.
Pages: 187-207
Page count: 21
Related papers
33 in total
[1] ABE T, 1995, IEICE T INF SYST, VE78D, P1188
[2] Abe T, 1996, ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, P1277, DOI 10.1109/ICSLP.1996.607843
[3] ABRANTES AJ, 1991, P EUR 91 PAR, P231
[4] Albert S. Bregman, 1990, AUDITORY SCENE ANAL, P411, DOI [DOI 10.1121/1.408434, DOI 10.7551/MITPRESS/1486.001.0001]
[5] [Anonymous], 1982, VISION COMPUTATIONAL
[6] ATAL, BS; HANAUER, SL. SPEECH ANALYSIS AND SYNTHESIS BY LINEAR PREDICTION OF SPEECH WAVE [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1971, 50 (02): 637-+
[7] BLAUERT, J; LAWS, P. GROUP DELAY DISTORTIONS IN ELECTRO-ACOUSTICAL SYSTEMS [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1978, 63 (05): 1478-1483
[8] BOASHASH, B. ESTIMATING AND INTERPRETING THE INSTANTANEOUS FREQUENCY OF A SIGNAL .1. FUNDAMENTALS [J]. PROCEEDINGS OF THE IEEE, 1992, 80 (04): 520-538
[9] BOASHASH B, 1992, P IEEE, V80, P550
[10] CASPERS B, 1987, P IEEE INT C AC SPEE, V4, P2388