Information retrieval on Turkish texts

Cited by: 58
Authors
Can, Fazli [1]
Kocberber, Seyit [1]
Balcik, Erman [1]
Kaynak, Cihan [1]
Ocalan, H. Cagdas [1]
Vursavas, Onur M. [1]
Affiliations
[1] Bilkent Univ, Dept Comp Engn, Bilkent Informat Retrieval Grp, TR-06800 Ankara, Turkey
Source
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY | 2008 / Vol. 59 / No. 03
DOI
10.1002/asi.20750
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this study, we investigate information retrieval (IR) on Turkish texts using a large-scale test collection that contains 408,305 documents and 72 ad hoc queries. We examine the effects of several stemming options and query-document matching functions on retrieval performance. We show that a simple word truncation approach, a word truncation approach that uses language-dependent corpus statistics, and an elaborate lemmatizer-based stemmer provide similar retrieval effectiveness in Turkish IR. We also investigate the effects of a range of search conditions on retrieval performance; these include scalability issues, query and document length effects, and the use of a stopword list in indexing.
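To illustrate the "simple word truncation approach" named in the abstract, here is a minimal sketch, assuming that truncation means keeping a fixed-length prefix of each word. The function name `truncate_stem` and the prefix length `n = 5` are hypothetical choices made for this example; the abstract does not specify either, and the sample words are not drawn from the test collection.

```python
def truncate_stem(word: str, n: int = 5) -> str:
    """Return the first n characters of a word as its stem.

    Assumes the word is already tokenized and lowercased. Turkish-specific
    casing (dotted and dotless i) is deliberately not handled here.
    The default n = 5 is an illustrative assumption.
    """
    return word[:n]


# Inflected forms of one Turkish noun collapse to the same prefix.
words = ["kitap", "kitaplar", "kitaplardan"]
stems = [truncate_stem(w) for w in words]

# Words shorter than n are returned unchanged.
short = truncate_stem("ev")
```

Because Turkish is agglutinative and attaches suffixes to a largely stable root, a fixed prefix can group many inflected forms of one word under a single index term, which is the intuition behind treating truncation as a stemmer.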
Pages: 407-421
Page count: 15