A generalized smoothness criterion for acoustic-to-articulatory inversion

Cited by: 64
Authors
Ghosh, Prasanta Kumar [1]
Narayanan, Shrikanth [1]
Affiliations
[1] University of Southern California, Department of Electrical Engineering, Signal Analysis and Interpretation Laboratory, Los Angeles, CA 90089 USA
DOI
10.1121/1.3455847
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
The many-to-one mapping from representations in the speech articulatory space to acoustic space renders the associated acoustic-to-articulatory inverse mapping non-unique. Among various techniques, imposing smoothness constraints on the articulator trajectories is one of the common approaches to handling the non-uniqueness in the acoustic-to-articulatory inversion problem, because articulators typically move smoothly during speech production. A standard smoothness constraint is to minimize the energy of the difference of the articulatory position sequence so that the articulator trajectory is smooth and low-pass in nature. Such a fixed definition of smoothness is not always realistic or adequate for all articulators because different articulators have different degrees of smoothness. In this paper, an optimization formulation is proposed for the inversion problem, which includes a generalized smoothness criterion. Under such generalized smoothness settings, the smoothness parameter can be chosen depending on the specific articulator in a data-driven fashion. In addition, this formulation allows estimation of articulatory positions recursively over time without any loss in performance. Experiments with the MOCHA TIMIT database show that the estimated articulator trajectories obtained using such a generalized smoothness criterion have lower RMS error and higher correlation with the actual measured trajectories compared to those obtained using a fixed smoothness constraint. © 2010 Acoustical Society of America. [DOI: 10.1121/1.3455847]
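The contrast the abstract draws between a fixed and a generalized smoothness constraint can be illustrated with a minimal sketch. It is not the paper's formulation: the data term below assumes a noisy trajectory is observed directly (the paper inverts from acoustics), and the names `smooth_trajectory`, `roughness`, `h`, and `lam` are hypothetical. The penalty is the energy of the trajectory after filtering with `h`; `h = [1, -1]` gives the standard first-difference constraint, and any other high-pass filter gives a different notion of smoothness that could be chosen per articulator.

```python
import numpy as np

def smooth_trajectory(y, h, lam):
    """Return x minimizing ||x - y||^2 + lam * ||H x||^2.

    H is the 'valid' convolution matrix of the filter h (an assumed
    stand-in for a generalized smoothness filter).
    """
    y = np.asarray(y, dtype=float)
    h = np.asarray(h, dtype=float)
    n, k = len(y), len(h)
    # Each row of H applies h to one length-k window of the trajectory.
    H = np.zeros((n - k + 1, n))
    for i in range(n - k + 1):
        H[i, i:i + k] = h[::-1]
    # Closed-form solution of the regularized least-squares problem.
    A = np.eye(n) + lam * H.T @ H
    return np.linalg.solve(A, y)

def roughness(x, h):
    """Energy of the trajectory x after filtering with h."""
    return float(np.sum(np.convolve(x, h, mode="valid") ** 2))

# Illustrative synthetic trajectory (not data from the paper).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
noisy = np.sin(2 * np.pi * 2 * t) + 0.1 * rng.standard_normal(t.size)

first_diff = smooth_trajectory(noisy, [1, -1], lam=10.0)      # fixed criterion
second_diff = smooth_trajectory(noisy, [1, -2, 1], lam=10.0)  # one alternative filter
```

In this sketch both the filter `h` and the weight `lam` are free parameters, which is the sense in which smoothness could be tuned per articulator from data; how the paper selects them is not reproduced here.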
Pages: 2162-2172
Page count: 11