Exploiting latent semantic information in statistical language modeling

Cited by: 208
Authors
Bellegarda, JR [1]
Affiliation
[1] Apple Comp Inc, Spoken Language Grp, Cupertino, CA 95014 USA
Keywords
latent semantic analysis; multispan integration; n-grams; speech recognition; statistical language modeling;
DOI
10.1109/5.880084
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline codes
0808; 0809;
Abstract
Statistical language models used in large-vocabulary speech recognition must properly encapsulate the various constraints, both local and global, present in the language. While local constraints are readily captured through n-gram modeling, global constraints, such as long-term semantic dependencies, have been more difficult to handle within a data-driven formalism. This paper focuses on the use of latent semantic analysis, a paradigm that automatically uncovers the salient semantic relationships between words and documents in a given corpus. In this approach, (discrete) words and documents are mapped onto a (continuous) semantic vector space, in which familiar clustering techniques can be applied. This leads to the specification of a powerful framework for automatic semantic classification, as well as the derivation of several language model families with various smoothing properties. Because of their large-span nature, these language models are well suited to complement conventional n-grams. An integrative formulation is proposed for harnessing this synergy, in which the latent semantic information is used to adjust the standard n-gram probability. Such hybrid language modeling compares favorably with the corresponding n-gram baseline: experiments conducted on the Wall Street Journal domain show a reduction in average word error rate of over 20%. This paper concludes with a discussion of intrinsic tradeoffs, such as the influence of training data selection on the resulting performance.
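To illustrate the word-to-vector-space mapping the abstract describes, the following sketch (illustrative only, on a hypothetical toy corpus; not the paper's implementation) builds a word-by-document count matrix and applies a truncated singular value decomposition, so that both words and documents land in a shared low-dimensional latent semantic space where proximity can be measured by cosine similarity:

```python
import numpy as np

# Toy corpus (hypothetical): two topics, finance and speech.
docs = [
    ["stock", "market", "shares", "trading"],
    ["stock", "shares", "price", "market"],
    ["market", "price", "trading", "stock"],
    ["speech", "recognition", "acoustic", "model"],
    ["speech", "acoustic", "signal", "recognition"],
]

# Word-by-document count matrix W (rows = words, columns = documents).
vocab = sorted({w for d in docs for w in d})
word_idx = {w: i for i, w in enumerate(vocab)}
W = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d:
        W[word_idx[w], j] += 1.0

# Truncated SVD: W ~= U_k S_k V_k^T maps discrete words and documents
# into a continuous k-dimensional latent semantic space.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
k = 2
word_vecs = U[:, :k] * s[:k]    # word representations
doc_vecs = Vt[:k, :].T * s[:k]  # document representations

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Words from the same topic end up close in the latent space, even when
# they never co-occur in the same document; words from different topics
# end up nearly orthogonal.
sim_related = cos(word_vecs[word_idx["stock"]], word_vecs[word_idx["price"]])
sim_unrelated = cos(word_vecs[word_idx["stock"]], word_vecs[word_idx["speech"]])
```

In the hybrid formulation the abstract mentions, a similarity score of this kind between a candidate word and the current document history is what adjusts the standard n-gram probability.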
Pages: 1279-1296
Page count: 18