Silent-speech enhancement using body-conducted vocal-tract resonance signals

Cited by: 47
Authors
Hirahara, Tatsuya [1]
Otani, Makoto [1]
Shimizu, Shota [1]
Toda, Tomoki [2]
Nakamura, Keigo [2]
Nakajima, Yoshitaka [2]
Shikano, Kiyohiro [2]
Affiliations
[1] Toyama Prefectural Univ, Dept Intelligent Syst Design Engn, Toyama 9390398, Japan
[2] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara 6300192, Japan
Keywords
Non-audible murmur; Body-conducted sound; Voice conversion; Talking aids
DOI
10.1016/j.specom.2009.12.001
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification code
070206 [Acoustics]
Abstract
The physical characteristics of weak body-conducted vocal-tract resonance signals called non-audible murmur (NAM) and the acoustic characteristics of three sensors developed for detecting these signals have been investigated. NAM signals attenuate 50 dB at 1 kHz; this attenuation consists of 30-dB full-range attenuation due to air-to-body transmission loss and 10 dB/octave spectral decay due to sound propagation loss within the body. These characteristics agree with the spectral characteristics of measured NAM signals. The sensors have a sensitivity of between 41 and 58 dB [V/Pa] at 1 kHz, and the mean signal-to-noise ratio of the detected signals was 15 dB. On the basis of these investigations, three types of silent-speech enhancement systems were developed: (1) simple, direct amplification of weak vocal-tract resonance signals using a wired urethane-elastomer NAM microphone, (2) simple, direct amplification using a wireless urethane-elastomer-duplex NAM microphone, and (3) transformation of the weak vocal-tract resonance signals sensed by a soft-silicone NAM microphone into whispered speech using statistical conversion. Field testing of the systems showed that they enable voice-impaired people to communicate verbally using body-conducted vocal-tract resonance signals. Listening tests demonstrated that weak body-conducted vocal-tract resonance sounds can be transformed into intelligible whispered speech sounds. Using these systems, people with voice impairments can re-acquire speech communication with less effort. © 2009 Elsevier B.V. All rights reserved.
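The attenuation figures in the abstract can be checked with a short calculation. The sketch below is illustrative only and is not the authors' model: it assumes the 10 dB/octave decay is added on top of the 30 dB flat loss, starting from a hypothetical reference frequency `ref_hz`. The value 250 Hz is not stated in the abstract; it is chosen here only because it makes the two components sum to the quoted 50 dB at 1 kHz.

```python
import math


def nam_attenuation_db(freq_hz, flat_loss_db=30.0,
                       slope_db_per_octave=10.0, ref_hz=250.0):
    """Total attenuation in dB under a flat-loss-plus-slope sketch.

    flat_loss_db and slope_db_per_octave come from the abstract.
    ref_hz is an ASSUMPTION (not given in the abstract): the frequency
    above which the per-octave decay is taken to begin.
    """
    octaves_above_ref = math.log2(freq_hz / ref_hz)
    # Assume no propagation decay is counted below the reference frequency.
    return flat_loss_db + slope_db_per_octave * max(0.0, octaves_above_ref)


def db_to_amplitude_ratio(attenuation_db):
    """Convert an attenuation in dB to a linear amplitude ratio."""
    return 10.0 ** (-attenuation_db / 20.0)


if __name__ == "__main__":
    for f in (250, 500, 1000, 2000):
        a = nam_attenuation_db(f)
        print(f"{f} Hz: {a:.1f} dB, amplitude ratio {db_to_amplitude_ratio(a):.5f}")
```

Under these assumptions, 1 kHz lies two octaves above the reference, so the decay contributes 20 dB and the total is 50 dB, matching the figure quoted in the abstract.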
Pages: 301-313
Page count: 13