Evolutionary approach for semantic-based query sampling in large-scale information sources

Cited by: 55
Authors
Jung, Jason J. [1]
Affiliations
[1] Yeungnam Univ, Dept Comp Engn, Knowledge Engn Lab, Dae Dong 712749, Gyeongsan, South Korea
Keywords
Collective intelligence; Context; Query sampling; System interoperability; Evolutionary approach; Ontology evolution; Framework; Selection
DOI
10.1016/j.ins.2010.08.042
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Metadata about information sources (e.g., databases and repositories) can be collected by Query Sampling (QS). Such metadata can include topics and statistics (e.g., term frequencies) about the information sources, and it provides important evidence for determining which sources in the distributed information space should be selected for a given user query. The aim of this paper is to discover the semantic relationships between information sources so that user queries can be distributed across a large number of sources. To this end, we propose an evolutionary approach that automatically conducts QS using multiple crawlers and obtains an optimized semantic network from the sources. QS is used to collaboratively extract metadata about the target sources, and the evolutionary method is used to optimally integrate that metadata. To evaluate the performance of contextualized QS on 122 information sources, we compared the ranking lists recommended by the proposed method with user feedback (i.e., ideal ranks), and also computed the precision of the discovered subsumptions in terms of the semantic relationships between the target sources. (C) 2010 Elsevier Inc. All rights reserved.
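
The general idea of query sampling mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's evolutionary method: it is a plain single-crawler loop in the spirit of query-based sampling, and the names `query_sample`, `toy_search`, and `DOCS`, as well as the toy documents, are hypothetical and chosen only for illustration. It assumes the source exposes nothing but a search function that returns matching documents.

```python
import random
from collections import Counter


def query_sample(search, seed_term, max_queries=20, docs_per_query=4, rng=None):
    """Probe a source with one-term queries and build term statistics.

    `search` is an assumed callable: term -> list of document strings.
    Returns (term frequencies over sampled documents, set of sampled documents).
    """
    rng = rng or random.Random(0)
    term_freq = Counter()
    seen_docs = set()
    used_terms = set()
    query = seed_term
    for _ in range(max_queries):
        used_terms.add(query)
        # Keep only the top few results, as a real source would cap its answers.
        for doc in search(query)[:docs_per_query]:
            if doc not in seen_docs:
                seen_docs.add(doc)
                term_freq.update(doc.lower().split())
        # Pick the next query term from the vocabulary learned so far.
        candidates = sorted(set(term_freq) - used_terms)
        if not candidates:
            break
        query = rng.choice(candidates)
    return term_freq, seen_docs


# Hypothetical toy source standing in for a remote database.
DOCS = [
    "ontology evolution framework",
    "query sampling of text databases",
    "evolutionary selection of query terms",
    "semantic network of information sources",
]


def toy_search(term):
    """Return the documents that contain `term` as a whole word."""
    return [d for d in DOCS if term in d.split()]


term_freq, sampled = query_sample(toy_search, "query")
print(len(sampled), term_freq.most_common(3))
```

In this toy run the first document is never sampled because it shares no term with the others, which hints at why a single crawler starting from one seed can give an incomplete picture of a source.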
Pages: 30-39
Number of pages: 10