A neural joint model for entity and relation extraction from biomedical text

Cited by: 276
Authors
Li, Fei [1]
Zhang, Meishan [2]
Fu, Guohong [2]
Ji, Donghong [1]
Affiliations
[1] Wuhan Univ, Sch Comp, Bayi Rd, Wuhan, Peoples R China
[2] Heilongjiang Univ, Sch Comp Sci & Technol, Xuefu Rd, Harbin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Biomedical text; Entity recognition; Relation extraction; Neural network; Joint model; CORPUS;
DOI
10.1186/s12859-017-1609-9
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification code
070307 [Chemical Biology]
Abstract
Background: Extracting biomedical entities and their relations from text has important applications in biomedical research. Previous work primarily used feature-based pipeline models for this task. Feature-based models require substantial feature engineering. Moreover, pipeline models may suffer from error propagation and cannot exploit the interactions between subtasks. We therefore propose a neural joint model that extracts biomedical entities and their relations simultaneously, which alleviates these problems.
Results: Our model was evaluated on two tasks: extracting adverse drug events between drug and disease entities, and extracting resident relations between bacteria and location entities. Compared with the state-of-the-art systems for these tasks, our model improved the F1 scores of the first task by 5.1% in entity recognition and 8.0% in relation extraction, and that of the second task by 9.2% in relation extraction.
Conclusions: The proposed model achieves competitive performance with less feature engineering. We demonstrate that a neural network-based model is effective for biomedical entity and relation extraction. In addition, parameter sharing is an alternative method for neural models to jointly process this task. Our work can facilitate research on biomedical text mining.
Pages: 11