Silent-speech enhancement using body-conducted vocal-tract resonance signals

Cited by: 47

Authors:

Hirahara, Tatsuya [1]

Otani, Makoto [1]

Shimizu, Shota [1]

Toda, Tomoki [2]

Nakamura, Keigo [2]

Nakajima, Yoshitaka [2]

Shikano, Kiyohiro [2]

Affiliations:

[1] Toyama Prefectural Univ, Dept Intelligent Syst Design Engn, Toyama 9390398, Japan

[2] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara 6300192, Japan

Source:

SPEECH COMMUNICATION | 2010 / Vol. 52 / Issue 04

Keywords:

Non-audible murmur; Body-conducted sound; Voice conversion; Talking aids;

DOI:

10.1016/j.specom.2009.12.001

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 [Acoustics];

Abstract:

The physical characteristics of weak body-conducted vocal-tract resonance signals called non-audible murmur (NAM) and the acoustic characteristics of three sensors developed for detecting these signals have been investigated. NAM signals attenuate 50 dB at 1 kHz; this attenuation consists of 30-dB full-range attenuation due to air-to-body transmission loss and 10 dB/octave spectral decay due to a sound propagation loss within the body. These characteristics agree with the spectral characteristics of measured NAM signals. The sensors have a sensitivity of between 41 and 58 dB [V/Pa] at 1 kHz, and the mean signal-to-noise ratio of the detected signals was 15 dB. On the basis of these investigations, three types of silent-speech enhancement systems were developed: (1) simple, direct amplification of weak vocal-tract resonance signals using a wired urethane-elastomer NAM microphone, (2) simple, direct amplification using a wireless urethane-elastomer-duplex NAM microphone, and (3) transformation of the weak vocal-tract resonance signals sensed by a soft-silicone NAM microphone into whispered speech using statistical conversion. Field testing of the systems showed that they enable voice-impaired people to communicate verbally using body-conducted vocal-tract resonance signals. Listening tests demonstrated that weak body-conducted vocal-tract resonance sounds can be transformed into intelligible whispered speech sounds. Using these systems, people with voice impairments can re-acquire speech communication with less effort. (C) 2009 Elsevier B.V. All rights reserved.
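The attenuation figures in the abstract can be checked with a small calculation. The sketch below is illustrative only: it adds the 30 dB full-range loss to a 10 dB/octave decay. The abstract does not state the frequency at which the decay begins, so the 250 Hz reference used here is an assumption, chosen solely because it reproduces the quoted 50 dB total at 1 kHz (two octaves above 250 Hz). The function name and the clamping below the reference frequency are likewise hypothetical.

```python
import math

# Values quoted in the abstract.
AIR_TO_BODY_LOSS_DB = 30.0    # full-range air-to-body transmission loss
DECAY_DB_PER_OCTAVE = 10.0    # spectral decay from propagation within the body

# ASSUMPTION: not given in the abstract. 250 Hz is picked only so that the
# model yields the quoted 50 dB at 1 kHz.
REF_FREQ_HZ = 250.0


def nam_attenuation_db(freq_hz: float, ref_freq_hz: float = REF_FREQ_HZ) -> float:
    """Estimate total NAM attenuation in dB at freq_hz under the assumptions above."""
    if freq_hz <= 0 or ref_freq_hz <= 0:
        raise ValueError("frequencies must be positive")
    # Number of octaves above the reference; no decay is applied below it
    # (also an assumption of this sketch).
    octaves = max(0.0, math.log2(freq_hz / ref_freq_hz))
    return AIR_TO_BODY_LOSS_DB + DECAY_DB_PER_OCTAVE * octaves


if __name__ == "__main__":
    for f in (250.0, 500.0, 1000.0, 2000.0):
        print(f"{f:7.0f} Hz: {nam_attenuation_db(f):.1f} dB")
```

Under these assumptions the model gives 30 dB at 250 Hz and 50 dB at 1 kHz; a different reference frequency would shift every value by a constant.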

Pages: 301-313

Number of pages: 13

