Mean and variance adaptation within the MLLR framework

Cited by: 251
Authors
Gales, MJF
Woodland, PC
Affiliation
[1] Cambridge University Engineering Department, Trumpington Street, Cambridge CB2 1PZ
DOI
10.1006/csla.1996.0013
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
One of the key issues for adaptation algorithms is to modify a large number of parameters with only a small amount of adaptation data. Speaker adaptation techniques try to obtain near speaker-dependent (SD) performance with only small amounts of speaker-specific data, and are often based on initial speaker-independent (SI) recognition systems. Some of these speaker adaptation techniques may also be applied to the task of adaptation to a new acoustic environment. In this case an SI recognition system trained in, typically, a clean acoustic environment is adapted to operate in a new, noise-corrupted, acoustic environment. This paper examines the maximum likelihood linear regression (MLLR) adaptation technique. MLLR estimates linear transformations for groups of model parameters to maximize the likelihood of the adaptation data. Previously, MLLR has been applied to the mean parameters in mixture-Gaussian HMM systems. In this paper, MLLR is extended to also update the Gaussian variances, and re-estimation formulae are derived for these variance transforms. MLLR with variance compensation is evaluated on several large vocabulary recognition tasks. The use of mean and variance MLLR adaptation was found to give an additional 2% to 7% decrease in word error rate over mean-only MLLR adaptation. (C) 1996 Academic Press Limited
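The idea in the abstract, one shared transform estimated for a group of Gaussians so as to maximize the likelihood of the adaptation data, can be illustrated with a deliberately reduced sketch. It is not the paper's re-estimation formulae. It assumes one-dimensional Gaussians, a single transform shared by all of them, and a hard frame-to-Gaussian alignment in place of HMM occupation probabilities. The mean transform is an affine map `a * mu + b`, and the variance transform is reduced to a single scale factor `h`. All function and variable names are hypothetical.

```python
# Simplified 1-D illustration of mean and variance adaptation with one
# transform shared by a group of Gaussians (assumptions: hard alignment,
# scalar features, scalar variance scale; names are hypothetical).

def estimate_mean_transform(frames, means, variances):
    """Return (a, b) maximizing the likelihood of the adapted means a*mu + b.

    frames is a list of (gaussian_index, observation) pairs.
    With fixed variances this is weighted least squares, weight 1/variance.
    """
    s_mm = s_m = s_1 = s_mo = s_o = 0.0
    for m, o in frames:
        w = 1.0 / variances[m]
        mu = means[m]
        s_mm += w * mu * mu
        s_m += w * mu
        s_1 += w
        s_mo += w * mu * o
        s_o += w * o
    det = s_mm * s_1 - s_m * s_m
    if abs(det) < 1e-12:
        raise ValueError("need at least two distinct means to estimate a and b")
    # Solve the 2x2 normal equations by Cramer's rule.
    a = (s_mo * s_1 - s_m * s_o) / det
    b = (s_mm * s_o - s_m * s_mo) / det
    return a, b


def estimate_variance_scale(frames, means, variances, a, b):
    """Return the ML scale h for adapted variances h * variance,
    given the already adapted means a*mu + b."""
    total = 0.0
    for m, o in frames:
        residual = o - (a * means[m] + b)
        total += residual * residual / variances[m]
    return total / len(frames)


# Toy speaker-independent model: three 1-D Gaussians.
means = [-1.0, 0.0, 2.0]
variances = [1.0, 0.5, 2.0]

# Synthetic adaptation data: each Gaussian's observations sit symmetrically
# around 1.5*mu + 0.3, with squared deviation equal to twice its variance.
frames = []
for m, (mu, var) in enumerate(zip(means, variances)):
    centre = 1.5 * mu + 0.3
    dev = (2.0 * var) ** 0.5
    frames.append((m, centre - dev))
    frames.append((m, centre + dev))

a, b = estimate_mean_transform(frames, means, variances)
h = estimate_variance_scale(frames, means, variances, a, b)

adapted_means = [a * mu + b for mu in means]
adapted_variances = [h * var for var in variances]
print(a, b, h)
```

Because the three Gaussians share one transform, six frames are enough to move every mean and variance, which is the point the abstract makes about adapting many parameters from little data. A full system would use feature vectors, matrix transforms, and several transform groups.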
Pages: 249-264
Number of pages: 16