High-Recall Information Retrieval from Linked Big Data

Cited by: 18
Authors
Cuzzocrea, Alfredo [1, 2]
Lee, Wookey [3]
Leung, Carson K. [4]
Affiliations
[1] CNR, ICAR, Arcavacata Di Rende, CS, Italy
[2] Univ Calabria, Arcavacata Di Rende, CS, Italy
[3] Inha Univ, Inchon, South Korea
[4] Univ Manitoba, Winnipeg, MB, Canada
Source
39TH ANNUAL IEEE COMPUTERS, SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC 2015), VOL 2 | 2015
Keywords
Information retrieval; recall; big data; linked data; applications; SEARCH;
DOI
10.1109/COMPSAC.2015.152
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
In the current era of big data, high volumes of valuable information are available in collections of documents, the web, social networks, and a wide variety of linked data. To search and retrieve useful information from these linked data, users often enter queries into information retrieval (IR) systems. Among the information retrieved by these systems, some is relevant to the user queries (i.e., of interest to the users), but some is not. Moreover, some relevant information may not be retrieved by the systems. The effectiveness of these IR systems is often measured by metrics such as precision and recall. Most conventional IR systems (e.g., for web searches) aim to achieve high precision (i.e., a high percentage of the retrieved information is relevant) at the price of low recall (i.e., a low percentage of the relevant information is retrieved). However, there are real-life situations (e.g., patent searches) in which high recall is desirable. In this paper, we present two high-recall IR systems. Results of our evaluation show the effectiveness of our systems in providing high-recall IR from linked big data.
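The two metrics named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper and does not represent either of the two systems it presents; the function name `precision_recall` and the document IDs are hypothetical, chosen only to show how a narrow result list can score high precision with low recall while a broad one does the reverse.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth: four documents are relevant to the query.
relevant_docs = {"d1", "d2", "d3", "d4"}

# A narrow result list: everything returned is relevant, but half is missed.
narrow = ["d1", "d2"]

# A broad result list: nothing relevant is missed, but half is noise.
broad = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]

print(precision_recall(narrow, relevant_docs))
print(precision_recall(broad, relevant_docs))
```

In a setting such as the patent search mentioned in the abstract, the broad list would be the preferable one, since missing a relevant document costs more than reviewing extra irrelevant ones.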
Pages: 712-717
Page count: 6