Kernel approaches for genic interaction extraction

Cited by: 48
Authors
Kim, Seonho [1]
Yoon, Juntae [2]
Yang, Jihoon [1]
Affiliations
[1] Sogang Univ, Dept Comp Sci, Seoul, South Korea
[2] Daumsoft Inc, Seoul, South Korea
DOI
10.1093/bioinformatics/btm544
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Automatic knowledge discovery and efficient information access, including named entity recognition and relation extraction between entities, have recently become critical issues for the biomedical literature. Relation extraction is inherently difficult because of the diversity of natural language, and the difficulty is compounded in the biomedical domain, where sentences are commonly long and complex. In addition, relation extraction often involves modeling long-range dependencies, discontiguous word patterns and semantic relations, to which pattern-based methods are not directly applicable.
Results: In this article, we shift the focus of biomedical relation extraction from the problem of pattern extraction to the problem of kernel construction. We propose four kernels (predicate, walk, dependency and hybrid) that encapsulate the information required for relation prediction, based on the sentential structures involving the two entities. To this end, we view the dependency structure of a sentence as a graph, which allows the system to isolate the essential part of a complex syntactic structure by finding the shortest path between the entities. The kernels are augmented gradually from flat feature descriptions to structural descriptions of the shortest paths. The walk kernel achieves a promising F-score of 77.5 on the Language Learning in Logic (LLL) 05 genic interaction shared task.
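The graph view described in the abstract can be illustrated with a minimal sketch: treat the dependency parse as an undirected graph and find the shortest path between the two entity tokens with breadth-first search. This is only an illustration under stated assumptions, not the authors' implementation. The function name `shortest_dependency_path`, the edge-list input format and the toy parse are all hypothetical, and the kernels themselves are not reproduced here.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the shortest token path between two entities, or None.

    `edges` is a list of (head, dependent) pairs. Direction is ignored,
    which is an assumption of this sketch rather than a detail taken
    from the abstract.
    """
    # Build an undirected adjacency map from the dependency edges.
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    if source not in graph or target not in graph:
        return None

    # Breadth-first search; `parents` doubles as the visited set.
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent links to recover the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical toy parse of "GerE stimulates transcription of cotD".
edges = [
    ("stimulates", "GerE"),
    ("stimulates", "transcription"),
    ("transcription", "of"),
    ("of", "cotD"),
]
print(shortest_dependency_path(edges, "GerE", "cotD"))
# → ['GerE', 'stimulates', 'transcription', 'of', 'cotD']
```

In a real sentence the parse contains many tokens off this path (modifiers, subordinate clauses), and restricting attention to the path is what lets a kernel ignore them.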
Pages: 118-126
Number of pages: 9