Large vocabulary speech recognition with multispan statistical language models

Cited by: 44
Authors
Bellegarda, JR [1]
Affiliation
[1] Apple Comp Inc, Spoken Language Grp, Cupertino, CA 95014 USA
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2000 / Vol. 8 / No. 1
Keywords
latent semantic analysis; multispan integration; n-grams; speech recognition; statistical language modeling;
DOI
10.1109/89.817455
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Multispan language modeling refers to the integration of the various constraints, both local and global, present in the language. It was recently proposed to capture global constraints through the use of latent semantic analysis, while taking local constraints into account via the usual n-gram approach. This has led to several families of data-driven, multispan language models for large vocabulary speech recognition. Because of the inherent complementarity in the two types of constraints, the multispan performance, as measured by perplexity, has been shown to compare favorably with the corresponding n-gram performance. The objective of this work is to characterize the behavior of such multispan modeling in actual recognition. Major implementation issues are addressed, including search integration and context scope selection. Experiments are conducted on a subset of the Wall Street Journal (WSJ) speaker-independent, 20,000-word vocabulary, continuous speech task. Results show that, compared to the standard n-gram approach, the multispan framework can lead to a reduction in average word error rate of over 20%. The paper concludes with a discussion of intrinsic multispan tradeoffs, such as the influence of training data selection on the resulting performance.
Pages: 76-84
Number of pages: 9