BOOTSTRAPPING FOR EXTRACTING RELATIONS FROM LARGE CORPORA

Cited by: 5
Authors
Li Weigang; Liu Ting; Li Sheng
Institution
Information Retrieval Laboratory, School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Keywords
Relation extraction; Bootstrapping; Patterns; Tuples;
DOI
Not available
Chinese Library Classification (CLC) number
TP311.5 [Software Engineering]
Discipline classification code
081202; 0835
Abstract
A new approach to relation extraction is described in this paper. It adopts a bootstrapping model with a novel iteration strategy, which generates more precise examples of a specific relation. Compared with previous methods, the proposed method has three main advantages: first, it needs less manual intervention; second, richer and more reasonable information is introduced to represent a relation pattern; third, it reduces the risk of circular dependency in bootstrapping. A scalable evaluation methodology and metrics are developed for the task, with comparable techniques, over the TianWang 100G corpus. The experimental results show that the method achieves 90% precision and has excellent extensibility.
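For readers unfamiliar with bootstrapping, the general loop of alternating between tuples and patterns can be sketched as follows. This is a generic, simplified illustration and not the iteration strategy proposed in the paper, whose details the abstract does not give. The function name `bootstrap`, its parameters, and the choice to represent a pattern as the literal text between two arguments are all assumptions made for this example.

```python
import re


def bootstrap(corpus, seeds, rounds=2):
    """Generic bootstrapping loop (illustrative only, not the paper's method).

    corpus: list of sentences (strings)
    seeds:  set of (arg1, arg2) tuples known to hold the target relation
    """
    tuples = set(seeds)
    patterns = set()
    for _ in range(rounds):
        # Step 1: induce patterns from known tuples. Here a pattern is simply
        # the text found between the two arguments (a hypothetical choice).
        for sentence in corpus:
            for a, b in tuples:
                m = re.search(re.escape(a) + r"(.+?)" + re.escape(b), sentence)
                if m:
                    patterns.add(m.group(1))
        # Step 2: apply every pattern to the corpus to collect candidate tuples.
        found = set()
        for sentence in corpus:
            for middle in patterns:
                regex = r"(\w+)" + re.escape(middle) + r"(\w+)"
                for m in re.finditer(regex, sentence):
                    found.add((m.group(1), m.group(2)))
        # Stop early once an iteration yields nothing new.
        if found <= tuples:
            break
        tuples |= found
    return tuples, patterns
```

A practical system would also score patterns and tuples before accepting them, since unfiltered patterns are what allow errors to reinforce each other across iterations.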
Pages: 89-96
Page count: 8