A shallow parser based on closed-class words to capture relations in biomedical text

Cited by: 64
Authors
Leroy, G [1]
Chen, HC [1]
Martinez, JD [1]
Affiliation
[1] Univ Arizona, Arizona Canc Ctr, Tucson, AZ 85721 USA
Keywords
natural language processing; shallow parsing; finite state automata; biomedicine; free text; bottom-up parser; NLP
DOI
10.1016/S1532-0464(03)00039-X
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Natural language processing for biomedical text currently focuses mostly on entity and relation extraction. These entities and relations are usually pre-specified entities, e.g., proteins, and pre-specified relations, e.g., inhibit relations. A shallow parser that captures the relations between noun phrases automatically from free text has been developed and evaluated. It uses heuristics and a noun phraser to capture entities of interest in the text. Cascaded finite state automata structure the relations between individual entities. The automata are based on closed-class English words and model generic relations not limited to specific words. The parser also recognizes coordinating conjunctions and captures negation in text, a feature usually ignored by others. Three cancer researchers evaluated 330 relations extracted from 26 abstracts of interest to them. There were 296 relations correctly extracted from the abstracts resulting in 90% precision of the relations and an average of 11 correct relations per abstract. (C) 2003 Elsevier Inc. All rights reserved.
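The idea in the abstract of linking noun phrases through closed-class words, while tracking negation, can be illustrated with a minimal sketch. This is not the authors' parser: it assumes the noun phrases have already been chunked, it uses a single pass rather than cascaded automata, and the names `Relation`, `extract_relations`, `PREPOSITIONS`, and `NEGATIONS`, as well as the small word lists, are hypothetical choices made only for illustration.

```python
from typing import List, NamedTuple, Tuple

# Hypothetical, deliberately tiny closed-class word lists (assumption,
# not the lists used in the paper).
PREPOSITIONS = {"of", "by", "in", "with", "to"}
NEGATIONS = {"not", "no"}


class Relation(NamedTuple):
    left: str        # noun phrase on the left of the connector
    connector: str   # words linking the two noun phrases
    right: str       # noun phrase on the right of the connector
    negated: bool    # True if a negation word appeared in between


def extract_relations(chunks: List[Tuple[str, str]]) -> List[Relation]:
    """Scan pre-chunked text and link adjacent noun phrases.

    Each chunk is (text, kind), where kind is "NP" for a noun phrase
    and "W" for any other single word. A relation is emitted only when
    the words between two noun phrases include a closed-class
    preposition.
    """
    relations: List[Relation] = []
    left = None                # most recent noun phrase, if any
    words: List[str] = []      # connector words collected so far
    negated = False
    seen_closed_class = False

    for text, kind in chunks:
        if kind == "NP":
            if left is not None and seen_closed_class:
                relations.append(
                    Relation(left, " ".join(words), text, negated)
                )
            # The current noun phrase starts the next candidate relation.
            left, words, negated, seen_closed_class = text, [], False, False
        elif left is not None:
            word = text.lower()
            if word in NEGATIONS:
                negated = True
            else:
                words.append(word)
                if word in PREPOSITIONS:
                    seen_closed_class = True
    return relations


example = [
    ("Hsp90", "NP"), ("does", "W"), ("not", "W"),
    ("bind", "W"), ("to", "W"), ("p53", "NP"),
]
for relation in extract_relations(example):
    print(relation)
```

Because the connector is stored as free text and only the preposition triggers a match, the sketch stays generic in the sense the abstract describes: it is not tied to specific relation words such as "inhibit".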
Pages: 145-158
Page count: 14