Toward a Semantic Granularity Model for Domain-Specific Information Retrieval

Cited by: 136
Authors
Yan, Xin [1]
Lau, Raymond Y. K. [2]
Song, Dawei [3]
Li, Xue [1]
Ma, Jian [2]
Affiliations
[1] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia
[2] City Univ Hong Kong, Dept Informat Syst, Hong Kong, Hong Kong, Peoples R China
[3] Robert Gordon Univ, Sch Comp, Aberdeen AB9 1FR, Scotland
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Theory; Algorithms; Experimentation; Document ranking; domain-specific search; domain ontology; information retrieval; granular computing; WEB; ONTOLOGY;
DOI
10.1145/1993036.1993039
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Both similarity-based and popularity-based document ranking functions have been successfully applied to information retrieval (IR) in general. However, the dimension of semantic granularity also should be considered for effective retrieval. In this article, we propose a semantic granularity-based IR model that takes into account the three dimensions, namely similarity, popularity, and semantic granularity, to improve domain-specific search. In particular, a concept-based computational model is developed to estimate the semantic granularity of documents with reference to a domain ontology. Semantic granularity refers to the levels of semantic detail carried by an information item. The results of our benchmark experiments confirm that the proposed semantic granularity based IR model performs significantly better than the similarity-based baseline in both a bio-medical and an agricultural domain. In addition, a series of user-oriented studies reveal that the proposed document ranking functions resemble the implicit ranking functions exercised by humans. The perceived relevance of the documents delivered by the granularity-based IR system is significantly higher than that produced by a popular search engine for a number of domain-specific search tasks. To the best of our knowledge, this is the first study regarding the application of semantic granularity to enhance domain-specific IR.
Pages: 46