Studies on inter-speaker variability in speech and its application in automatic speech recognition

Cited by: 7
Author
Umesh, S. [1]
Affiliation
[1] Indian Inst Technol, Dept Elect Engn, Madras 600036, Tamil Nadu, India
Source
SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES | 2011 / Vol. 36 / Issue 05
Keywords
Vowel-normalization; vocal-tract length normalization; speech-scale; frequency-warping; linear transformation of cepstra; speaker-adaptation; HIDDEN MARKOV-MODELS; CHILDRENS SPEECH; VOCAL-TRACT; ADAPTATION; NORMALIZATION; VOWEL; REPRESENTATION; TRANSFORMATION; CLASSIFICATION; MFCC;
DOI
10.1007/s12046-011-0049-x
Chinese Library Classification
T [Industrial Technology];
Subject classification code
08;
Abstract
In this paper, we give an overview of the problem of inter-speaker variability and its study in many diverse areas of speech signal processing. We first give an overview of vowel-normalization studies that minimize variations in the acoustic representation of vowel realizations by different speakers. We then describe the universal-warping approach to speaker normalization which unifies many of the vowel normalization approaches and also shows the relation between speech production, perception and auditory processing. We then address the problem of inter-speaker variability in automatic speech recognition (ASR) and describe techniques that are used to reduce these effects and thereby improve the performance of speaker-independent ASR systems.
Pages: 853-883
Number of pages: 31