Interpolation of n-gram and mutual-information based trigger pair language models for Mandarin speech recognition

Cited by: 3

Authors:

Zhou, GD [1]

Lua, KT [1]

Affiliation:

[1] Natl Univ Singapore, Sch Comp, Dept Comp Sci, Singapore 119260, Singapore

Source:

COMPUTER SPEECH AND LANGUAGE | 1999 / Vol. 13 / No. 02

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

While n-gram modeling is simple and dominant in speech recognition, it can only capture the short-distance context dependency within an n-word window where currently the largest practical n for natural language is three. However, many of the context dependencies in natural language occur beyond a three-word window. This paper proposes a new language modeling approach to capture the preferred relationships between words over a short or long distance through the concept of MI-Trigger pairs. Different MI-Trigger-based models are constructed in either a distance-dependent or a distance-independent way within a window from 1 to 10 words. This new MI-Trigger-based modeling is also compared and merged with word bigram modeling. It is found that the MI-Trigger-based modeling has better performance than word bigram modeling. It is also found that n-gram and MI-Trigger models have good complementarity and their proper merging can further increase the recognition rate when tested on Mandarin speech recognition. One advantage of MI-Trigger-based modeling is that the number of parameters needed for MI-Trigger modeling is much less than that of word bigram modeling. Another advantage is that the number of trigger pairs in an MI-Trigger model can be kept to a reasonable size without losing too much of its modeling power. © 1999 Academic Press.

Pages: 125-141

Page count: 17
