A Deep Learning-Based Chinese Biomedical Entity Relation Extraction System

Cited by: 18

Authors:

丁泽源 ^{[1]}

杨志豪 ^{[1]}

罗凌 ^{[1]}

王磊 ^{[2]}

张音 ^{[2]}

林鸿飞 ^{[1]}

王健 ^{[1]}

Affiliations:

[1] School of Computer Science and Technology, Dalian University of Technology

[2] Academy of Military Medical Sciences

Source:

Journal of Chinese Information Processing (中文信息学报) | 2021 / Vol. 35 / No. 05

Funding:

National Key Research and Development Program;

Keywords:

named entity recognition; relation extraction; conditional random field; bidirectional long short-term memory network;

DOI:

Not available

Chinese Library Classification (CLC) number:

R318 [Biomedical Engineering]; TP391.1 [Text Information Processing]; TP18 [Artificial Intelligence Theory];

Subject classification codes:

0831; 081104; 0812; 0835; 1405;

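Abstract:

In biomedical text mining, extracting biomedical named entities and the relations between them is of great importance. However, annotated Chinese corpora of biomedical entities and relations are currently very scarce, which poses many challenges for information extraction in the Chinese biomedical domain. This paper builds a Chinese biomedical entity relation extraction system based on deep learning. First, a Chinese biomedical entity relation corpus is constructed from publicly available annotated English biomedical corpora, combining translation techniques with manual annotation. Then, a Chinese ELMo (Embeddings from Language Models) model trained on biomedical text is added to a bidirectional long short-term memory network (Bi-directional LSTM, BiLSTM) combined with conditional random fields (CRF) to perform Chinese entity recognition. Finally, a bidirectional long short-term memory network combined with an attention mechanism is used to extract the relations between entities. Experimental results show that the system can accurately extract biomedical entities and the relations between them from Chinese text.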


Pages: 70-76

Page count: 7
