Information extraction with automatic knowledge expansion

Cited by: 7
Authors
Jung, H [1]
Yi, E [1]
Kim, D [1]
Lee, GG [1]
Affiliations
[1] Pohang Univ Sci & Technol, Dept Comp Sci & Engn, Pohang 790784, Kyungbuk, South Korea
Keywords
information extraction; question answering; user-oriented learning; lexico-semantic pattern; machine learning
DOI
10.1016/S0306-4573(03)00066-9
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
POSIE (POSTECH Information Extraction System) is an information extraction system which uses multiple learning strategies, i.e., SmL, user-oriented learning, and separate-context learning, in a question answering framework. POSIE replaces laborious annotation with automatic instance extraction by the SmL from structured Web documents, and places the user at the end of the user-oriented learning cycle. Information extraction as question answering simplifies the extraction procedures for a set of slots. We introduce the techniques verified on the question answering framework, such as domain knowledge and instance rules, into an information extraction problem. To incrementally improve extraction performance, a sequence of the user-oriented learning and the separate-context learning produces context rules and generalizes them in both the learning and extraction phases. Experiments on the "continuing education" domain initially show that the F1-measure becomes 0.477 and recall 0.748 with no user training. However, as the size of the training documents grows, the F1-measure reaches beyond 0.75 with recall 0.772. We also obtain F-measure of about 0.9 for five out of seven slots on "job offering" domain. (C) 2003 Elsevier Ltd. All rights reserved.
Pages: 217-242
Page count: 26