Inferential language models for information retrieval

Cited by: 2
Authors
Nie, Jian-Yun [1, 2]
Cao, Guihong [1, 2]
Bai, Jing [1, 2]
Affiliations
[1] DIRO, University of Montreal, Montreal, Que. H3C 3J7, succursale Centreville
Source
ACM Transactions on Asian Language Information Processing | 2006 / Vol. 5 / No. 4
Keywords
Document expansion; Inference; Inferential model; Query expansion
DOI
10.1145/1236181.1236183
Abstract
Language modeling (LM) has been widely used in IR in recent years. An important operation in LM is smoothing of the document language model. However, current smoothing techniques merely redistribute a portion of the probability mass among terms according to their frequency of occurrence in the whole document collection. No relationships between terms are considered and no inference is involved. In this article, we propose several inferential language models capable of inference using term relationships. The inference operation is carried out through semantic smoothing of either the document model or the query model, resulting in document or query expansion. The proposed models implement some of the logical inference capabilities proposed in previous studies on logical models, but with the simplifications necessary to make them tractable. They are a good compromise between inference power and efficiency. The models have been tested on several TREC collections, in both English and Chinese. The results show that integrating term relationships into the language modeling framework consistently improves retrieval effectiveness compared with traditional language models. This study shows that language modeling is a suitable framework for implementing basic inference operations in IR effectively. © 2006 ACM.
Pages: 296-322
Page count: 26