Automatic extraction of semantic relations between medical entities: A rule based approach

Cited by: 77
Authors
Ben Abacha A. [1]
Zweigenbaum P. [1]
Affiliations
[1] CNRS, LIMSI, Orsay
Keywords
Noun Phrase; Semantic Relation; Fosfomycin; Semantic Type; Treatment Relation
DOI
10.1186/2041-1480-2-S5-S4
Abstract
Background: Information extraction is a complex task that is necessary for developing high-precision information retrieval tools. In this paper, we present the platform MeTAE (Medical Texts Annotation and Exploration). MeTAE allows users (i) to extract and annotate medical entities and relationships from medical texts and (ii) to semantically explore the resulting RDF annotations.
Results: Our annotation approach relies on linguistic patterns and domain knowledge and consists of two steps: (i) recognition of medical entities and (ii) identification of the correct semantic relation between each pair of entities. The first step is achieved by an enhanced use of MetaMap, which improves the precision obtained by MetaMap by 19.59% in our evaluation. The second step relies on linguistic patterns that are built semi-automatically from a corpus selected according to semantic criteria. We evaluate our system's ability to identify medical entities of 16 types. We also evaluate the extraction of treatment relations between a treatment (e.g. medication) and a problem (e.g. disease): we obtain 75.72% precision and 60.46% recall.
Conclusions: According to our experiments, using an external sentence segmenter and noun phrase chunker may improve the precision of MetaMap-based medical entity recognition. Our pattern-based relation extraction method obtains good precision and recall relative to related work. A more precise comparison with related approaches remains difficult, however, given the differences in corpora and in the exact nature of the extracted relations. Selecting MEDLINE articles through queries related to known drug-disease pairs gave us a more focused corpus of relevant examples of treatment relations than a more general MEDLINE query would.
© 2011 Ben Abacha and Zweigenbaum; licensee BioMed Central Ltd.
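The second step described in the abstract, matching linguistic patterns between a recognized treatment entity and a recognized problem entity, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three regular expressions, the `find_entity` and `extract_treatment_relations` helpers, and the example sentence are hypothetical and are not taken from MeTAE, whose patterns are built semi-automatically from a corpus. Entity recognition (step one, done with MetaMap in the paper) is replaced here by hand-supplied spans.

```python
import re

# Hypothetical connective patterns for a "treats" relation. They are matched
# against the text lying between a treatment entity and a later problem entity.
# These are illustrative only, not the patterns used by MeTAE.
TREATS_PATTERNS = [
    re.compile(
        r"\s+(?:is|was|are|were)\s+(?:used|effective|indicated)\s+"
        r"(?:for|in|against)\s+(?:the\s+treatment\s+of\s+)?",
        re.IGNORECASE,
    ),
    re.compile(r"\s+(?:treats|cures|relieves)\s+", re.IGNORECASE),
    re.compile(r"\s+(?:for|in)\s+the\s+treatment\s+of\s+", re.IGNORECASE),
]


def find_entity(sentence, text, entity_type):
    """Stand-in for entity recognition: locate `text` and return (start, end, type)."""
    start = sentence.index(text)
    return (start, start + len(text), entity_type)


def extract_treatment_relations(sentence, entities):
    """Return (treatment, "treats", problem) triples licensed by a pattern."""
    relations = []
    for t_start, t_end, t_type in entities:
        if t_type != "treatment":
            continue
        for p_start, p_end, p_type in entities:
            # Only consider problems that follow the treatment in the sentence.
            if p_type != "problem" or p_start < t_end:
                continue
            between = sentence[t_end:p_start]
            if any(p.fullmatch(between) for p in TREATS_PATTERNS):
                relations.append(
                    (sentence[t_start:t_end], "treats", sentence[p_start:p_end])
                )
    return relations


sentence = "Fosfomycin is effective for the treatment of urinary tract infections."
entities = [
    find_entity(sentence, "Fosfomycin", "treatment"),
    find_entity(sentence, "urinary tract infections", "problem"),
]
print(extract_treatment_relations(sentence, entities))
```

Requiring the whole span between the two entities to match a pattern favors precision over recall, which is the usual trade-off of rule-based extraction; looser patterns would recover more relations at the cost of more false positives.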
Related papers
21 in total
[1] Engelbrecht R., Expert systems for medicine functions and developments, Zentralbl Gynakol, 119, 9, pp. 428-434, (1997)
[2] Hotvedt M.O., Continuing medical education: actually learning rather than simply listening, JAMA, 275, 21, (1996)
[3] Health On the Net.
[4] Aronson A.R., Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program, AMIA Annu Symp Proc., pp. 17-21, (2001)
[5] Pratt W., Yetisgen-Yildiz M., A Study of Biomedical Concept Identification: MetaMap vs. People, AMIA Annu Symp Proc., pp. 529-533, (2003)
[6] Meystre S.M., Haug P.J., Comparing natural language processing tools to extract medical problems from narrative text, AMIA Annu Symp Proc., pp. 525-529, (2005)
[7] Hindle D., Noun classification from predicate argument structures, Proceedings of the 28th annual meeting of the Association for Computational Linguistics, pp. 268-275, (1990)
[8] Zhu J., Nie Z., Liu X., Zhang B., Wen J.R., StatSnowball: a statistical approach to extracting entity relationships, Proceedings of the 18th international conference on World Wide Web, (2009)
[9] Hearst M.A., Automatic Acquisition of Hyponyms from Large Text Corpora, Proceedings of the 14th conference on Computational Linguistics, pp. 539-545, (1992)
[10] Suchanek F.M., Ifrim G., Weikum G., Combining Linguistic and Statistical Analysis to Extract Relations from Web Documents, Proceedings of the 12th ACM SIGKDD international conference on Knowledge Discovery and Data Mining, pp. 712-717, (2006)