The impact on retrieval effectiveness of skewed frequency distributions

Cited by: 9
Authors
Sanderson, M [1]
van Rijsbergen, CJ
Affiliations
[1] Univ Sheffield, Dept Informat Studies, Sheffield S10 2TN, S Yorkshire, England
[2] Univ Glasgow, Dept Comp Sci, Glasgow G12 8QQ, Lanark, Scotland
Keywords
experimentation; measurement; pseudowords; word sense ambiguity; word sense disambiguation
DOI
10.1145/326440.326447
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We present an analysis of word senses that provides a fresh insight into the impact of word ambiguity on retrieval effectiveness with potential broader implications for other processes of information retrieval. Using a methodology of forming artificially ambiguous words, known as pseudowords, and through reference to other researchers' work, the analysis illustrates that the distribution of the frequency of occurrence of the senses of a word plays a strong role in ambiguity's impact on effectiveness. Further investigation shows that this analysis may also be applicable to other processes of retrieval, such as Cross Language Information Retrieval, query expansion, retrieval of OCR'ed texts, and stemming. The analysis appears to provide a means of explaining, at least in part, reasons for the processes' impact (or lack of it) on effectiveness.
Pages: 440-465
Page count: 26