Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization)

Cited by: 8

Authors:

Deligne, S [1]

Potamianos, G [1]

Neti, C [1]

Affiliations:

[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA

Source:

SAM2002: IEEE SENSOR ARRAY AND MULTICHANNEL SIGNAL PROCESSING WORKSHOP PROCEEDINGS | 2002

Keywords:

DOI:

10.1109/SAM.2002.1191001

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Subject classification codes:

081202; 0835

Abstract:

In this paper, we introduce a non-linear enhancement technique called Audio-Visual Codebook Dependent Cepstral Normalization (AVCDCN), and we consider its use with both audio-only and audio-visual speech recognition. AVCDCN is inspired by CDCN [1] [2], an audio-only enhancement technique that approximates the non-linear effect of noise on speech with a piecewise constant function. Our experiments show that the use of visual information in AVCDCN allows significant performance gains over CDCN.
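The abstract does not give the enhancement equations, so the following is only a minimal sketch of one way a piecewise constant, codebook-dependent compensation can be written. It assumes that each codeword carries a constant correction vector, that the clean cepstrum is estimated by subtracting the posterior-weighted corrections from the noisy cepstrum, and that the audio-visual variant differs only in computing the posteriors from a joint audio-visual observation. All function names, parameters, and toy values are hypothetical and are not taken from the paper.

```python
import numpy as np


def codeword_posteriors(obs, means, variances, priors):
    """Posterior p(k | obs) for a diagonal-covariance Gaussian codebook.

    obs: (D,) observation (audio-only, or audio and visual features
    concatenated; the choice of features is an assumption of this sketch).
    means, variances: (K, D) codebook parameters. priors: (K,).
    """
    diff = obs[None, :] - means
    log_lik = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=1
    )
    log_post = np.log(priors) + log_lik
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()


def enhance_frame(noisy_cepstrum, obs, means, variances, priors, corrections):
    """Estimate a clean cepstrum as noisy minus posterior-weighted corrections.

    corrections: (K, D_audio), one constant correction vector per codeword,
    which is what makes the compensation piecewise constant.
    """
    post = codeword_posteriors(obs, means, variances, priors)
    return noisy_cepstrum - post @ corrections


# Toy example: 2 codewords, 2 audio dimensions plus 1 visual dimension.
means = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]])
variances = np.ones((2, 3))
priors = np.array([0.5, 0.5])
corrections = np.array([[1.0, -1.0], [5.0, 5.0]])

noisy = np.array([2.0, 2.0])
obs = np.array([0.0, 0.0, 0.0])  # lies on the first codeword
clean_estimate = enhance_frame(noisy, obs, means, variances, priors, corrections)
```

In this toy setup the observation coincides with the first codeword, so the estimate is essentially the noisy frame minus that codeword's correction. Passing only audio features as `obs` gives an audio-only variant of the same sketch.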


Pages: 68-71

Page count: 4