Biological relation extraction and query answering from MEDLINE abstracts using ontology-based text mining

Cited by: 26
Authors
Abulaish, Muhammad
Dey, Lipika
Affiliations
[1] Indian Inst Technol, Dept Math, New Delhi 110016, India
[2] Jamia Millia Islamia, Dept Math, New Delhi 110025, India
Keywords
text mining; ontology; biological relation extraction; biological query processing;
DOI
10.1016/j.datak.2006.06.007
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid growth of biological text repositories makes it difficult for researchers to access the information they need conveniently and effectively. The problem arises because most of this information is embedded in unstructured or semi-structured text that computers cannot easily interpret. In this paper we present an ontology-based Biological Information Extraction and Query Answering (BIEQA) system, which initiates text mining with a set of concepts stored in a biological ontology and then mines possible biological relations among those concepts using NLP techniques and co-occurrence-based analysis. The system extracts all frequently occurring biological relations between a pair of biological concepts. Each mined relation is associated with a fuzzy membership value proportional to its frequency of occurrence in the corpus, and is termed a fuzzy biological relation. The fuzzy biological relations extracted from a text corpus, along with other relevant information components such as the biological entities occurring within a relation, are stored in a database. The database is integrated with a query-processing module whose interface guides users in formulating biological queries at different levels of specificity. (c) 2006 Elsevier B.V. All rights reserved.
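As a rough illustration of the idea in the abstract, the sketch below counts how often each (concept, relation, concept) triple occurs and assigns it a membership value proportional to that count. It is not the BIEQA implementation: the function names `fuzzy_relations` and `query`, the `min_count` threshold, and the normalisation by the largest count are all assumptions made for this example, since the abstract states only that membership is proportional to frequency.

```python
from collections import Counter


def fuzzy_relations(triples, min_count=2):
    """Map each frequent (concept_a, relation, concept_b) triple to a membership in (0, 1].

    Assumption: membership = count / largest count among the frequent triples.
    """
    counts = Counter(triples)
    # Keep only relations that occur at least `min_count` times (hypothetical cutoff).
    frequent = {t: c for t, c in counts.items() if c >= min_count}
    if not frequent:
        return {}
    top = max(frequent.values())
    return {t: c / top for t, c in frequent.items()}


def query(memberships, concept_a=None, relation=None, concept_b=None):
    """Return matching triples, strongest first; a None argument acts as a wildcard."""
    pattern = (concept_a, relation, concept_b)
    hits = [
        (triple, mu)
        for triple, mu in memberships.items()
        if all(p is None or p == v for p, v in zip(pattern, triple))
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Leaving more arguments as `None` gives a more general query, and filling them in gives a more specific one, which loosely mirrors the "different levels of specificity" mentioned in the abstract.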
Pages: 228-262
Page count: 35