Research on Similarity Calculation Methods for Linked Data Resource Sets

Cited by: 6

Authors:

Deng Lanlan (邓兰兰) [1,2]

Li Chunwang (李春旺) [1]

Affiliations:

[1] National Science Library, Chinese Academy of Sciences

[2] Graduate University of Chinese Academy of Sciences

Source:

Keywords:

linked data; resource set; similarity; algorithm

DOI:

10.16353/j.cnki.1000-7490.2012.05.009

Chinese Library Classification (CLC) number:

G202 [Information processing technology]

Subject classification number:

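Abstract:

This article proposes a comprehensive description information model for computing the similarity of linked data resource sets. The model describes a resource set through three modules: basic description, content description, and external links. According to the characteristics of each information item, it selects string similarity, set similarity, the vector space model, and statistics- and semantics-based similarity algorithms to compute resource set similarity. To some extent, this addresses the current need to configure related resource sets manually when creating links.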
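As a rough illustration of the three-module idea in the abstract, the minimal Python sketch below scores two resource sets by combining a string similarity on a basic-description field, a vector space (cosine) similarity on a content description, and a set (Jaccard) similarity on external links. The field names (`title`, `description`, `links`), the equal-ish weights, and the use of `difflib` as the string measure are all assumptions made for illustration; the record does not state which concrete measures or weights the article uses, and the statistics- and semantics-based measures are not covered here.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def string_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; difflib's ratio is a stand-in measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard_similarity(a: set, b: set) -> float:
    """Set similarity: size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Vector space model: cosine of raw term-frequency vectors."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def resource_set_similarity(d1: dict, d2: dict, weights=(0.3, 0.4, 0.3)) -> float:
    """Weighted sum over the three modules; the weights are hypothetical."""
    w_basic, w_content, w_links = weights
    basic = string_similarity(d1["title"], d2["title"])              # basic description
    content = cosine_similarity(d1["description"], d2["description"])  # content description
    links = jaccard_similarity(set(d1["links"]), set(d2["links"]))   # external links
    return w_basic * basic + w_content * content + w_links * links
```

Each per-module score lies in [0, 1], so with weights that sum to 1 the combined score also stays in [0, 1], which makes scores comparable across pairs of resource sets.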

Citation

Pages: 112-116

Page count: 5

16 entries in total

[1] Two approaches to matching in example-based machine translation. Nirenburg S, et al. Proc. of TMI-93. 1993
[2] Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Resnik P. Journal of Artificial Intelligence Research. 1999
[3] Relevant document distribution estimation method for resource selection. Si L, Callan J. Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2003
[4] Resource description techniques in Web integration [in Chinese]. Zhang Li, Wang Yuyu. Library and Information Service (图书情报工作). 2005 (10): 25-28
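[5] A string similarity calculation model based on multi-level features [in Chinese]. Zhang Chengzhi. Journal of the China Society for Scientific and Technical Information (情报学报). 2005 (06)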
[6] Searching distributed collections with inference networks. Callan J, Lu Z, Croft W. Proceedings of the 18th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1995
[7] Searching distributed collections with inference networks. Callan J, Lu Z, Croft W. Proceedings of the 18th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1995
[8] Query-based sampling of text databases. Callan J, Connell M. ACM Transactions on Information Systems. 2001, 19 (02): 97-130
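[9] Resource selection techniques and algorithms in integrated retrieval systems [in Chinese]. Wang Yuyu, Zhang Li. Library and Information Service (图书情报工作). 2005 (10): 29-32, 66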
[10] Extraction of information in large graphs: Automatic search for synonyms. Senellart P. 2001