Online unsupervised learning of hidden Markov models for adaptive speech recognition

Cited by: 4
Authors
Chien, JT [1]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
Source
IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING | 2001 / Vol. 148 / No. 05
DOI
10.1049/ip-vis:20010560
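Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes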
0808 ; 0809 ;
Abstract
A novel framework of an online unsupervised learning algorithm is presented to flexibly adapt the existing speaker-independent hidden Markov models (HMMs) to nonstationary environments induced by varying speakers, transmission channels, ambient noises, etc. The quasi-Bayes (QB) estimate is applied to incrementally obtain word sequence and adaptation parameters for adjusting HMMs when a block of unlabelled data is enrolled. The underlying statistics of a nonstationary environment can be successively traced according to the newest enrolment data. To improve the QB estimate, the adaptive initial hyperparameters are employed in the beginning session of online learning. These hyperparameters are estimated from a cluster of training speakers closest to the test environment. Additionally, a selection process is developed to select reliable parameters from a list of candidates for unsupervised learning. A set of reliability assessment criteria is explored for selection. In a series of speaker adaptation experiments, the effectiveness of the proposed method is confirmed and it is found that using the adaptive initial hyperparameters in online learning and the multiple assessments in parameter selection can improve the recognition performance.
Pages: 315-324
Number of pages: 10
Related papers
30 in total
[1]   NEW LOOK AT STATISTICAL-MODEL IDENTIFICATION [J].
AKAIKE, H .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1974, AC19 (06) :716-723
[2]  
Anastasakos T., 1998, P 5 INT C SPOK LANG, P599
[3]  
Chen SS, 1998, INT CONF ACOUST SPEE, P645, DOI 10.1109/ICASSP.1998.675347
[4]   A hybrid algorithm for speaker adaptation using MAP transformation and adaptation [J].
Chien, JT ;
Lee, CH ;
Wang, HC .
IEEE SIGNAL PROCESSING LETTERS, 1997, 4 (06) :167-169
[5]   Unsupervised hierarchical adaptation using reliable selection of cluster-dependent parameters [J].
Chien, JT ;
Junqua, JC .
SPEECH COMMUNICATION, 2000, 30 (04) :235-253
[6]   Adaptation of hidden Markov model for telephone speech recognition and speaker adaptation [J].
Chien, JT ;
Wang, HC .
IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 1997, 144 (03) :129-135
[7]   Online hierarchical transformation of hidden Markov models for speech recognition [J].
Chien, JT .
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1999, 7 (06) :656-667
[8]  
COX SJ, 1989, P IEEE INT C AC SPEE, P294
[9]  
DeGroot M., 1970, OPTIMAL STAT DECISIO
[10]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38