Multicomponent AM-FM representations: An asymptotically exact approach

Cited by: 91
Authors
Gianfelici, Francesco [1]
Biagetti, Giorgio [1]
Crippa, Paolo [1]
Turchetti, Claudio [1]
Affiliations
[1] Univ Politecn Marche, DEIT, I-60131 Ancona, Italy
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2007 / Vol. 15 / No. 03
Keywords
AM-FM speech model; envelope estimation; Gabor signal; Hilbert transform; multicomponent modeling; sinusoidal model
DOI
10.1109/TASL.2006.889744
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents, on the basis of a rigorous mathematical formulation, a multicomponent sinusoidal model that allows an asymptotically exact reconstruction of nonstationary speech signals, regardless of their duration and without any limitation in the modeling of voiced, unvoiced, and transitional segments. The proposed approach applies the Hilbert transform to obtain an amplitude signal, from which an AM component is extracted by filtering; the residue is then iteratively processed in the same way. This technique yields a multicomponent AM-FM model in which the number of components (iterations) may be chosen arbitrarily. Additionally, the instantaneous frequencies of these components can be calculated with a given accuracy by segmentation of the phase signals. The validity of the proposed approach is demonstrated through applications to both synthetic signals and natural speech. Several comparisons show that this approach almost always outperforms current best practices, and that it does not need the complex filter optimizations required by other techniques.
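The iterative scheme outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an FFT-based analytic signal as the Hilbert-transform step, a moving-average filter as a stand-in for the paper's envelope filter, and a synthetic AM-FM test tone. The names `analytic_signal`, `smooth`, `am_fm_decompose`, and the parameters `n_components` and `width` are hypothetical.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal via the FFT: drop negative frequencies, double positive ones."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def smooth(x, width):
    """Moving-average low-pass filter (assumed stand-in for the paper's filter)."""
    kernel = np.ones(width) / width
    padded = np.pad(x, width // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")[:len(x)]


def am_fm_decompose(signal, n_components=3, width=31):
    """Return [(envelope, phase), ...] and the final residue."""
    residue = np.asarray(signal, dtype=float)
    parts = []
    for _ in range(n_components):
        z = analytic_signal(residue)
        envelope = smooth(np.abs(z), width)  # slowly varying AM term
        phase = np.unwrap(np.angle(z))       # phase carrying the FM term
        parts.append((envelope, phase))
        # Subtract this AM-FM component; the remainder is processed next.
        residue = residue - envelope * np.cos(phase)
    return parts, residue


# Synthetic test tone (assumed): 440 Hz carrier with slow AM and FM.
fs = 8000.0
t = np.arange(2048) / fs
x = (1 + 0.5 * np.cos(2 * np.pi * 5 * t)) * np.cos(
    2 * np.pi * 440 * t + 2 * np.sin(2 * np.pi * 3 * t))

parts, residue = am_fm_decompose(x)
reconstruction = sum(a * np.cos(p) for a, p in parts) + residue
# Instantaneous frequency of the first component, in Hz, from the phase slope.
inst_freq = np.diff(parts[0][1]) * fs / (2 * np.pi)
```

By construction, the components plus the final residue add back to the input. How quickly the residue shrinks depends on the filter, which is the part this sketch simplifies most. The paper's segmentation of the phase signals for instantaneous-frequency estimation is replaced here by a plain finite difference.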
Pages: 823-837
Number of pages: 15