Information extraction from biomedical text

Cited by: 16
Authors
Hobbs, JR [1]
Affiliations
[1] Univ So Calif, Inst Informat Sci, Marina Del Rey, CA 90292 USA
Keywords
DOI
10.1016/S1532-0464(03)00015-7
Chinese Library Classification number
TP39 [Computer applications];
Subject classification code
081203; 0835;
Abstract
Information extraction is the process of scanning text for information relevant to some interest, including extracting entities, relations, and events. It requires deeper analysis than keyword searches, but its aims fall short of the very hard and long-term problem of full text understanding. Information extraction represents a midpoint on this spectrum, where the aim is to capture structured information without sacrificing feasibility. One of the key ideas in this technology is to separate processing into several stages, in cascaded finite-state transducers. The earlier stages recognize smaller linguistic objects and work in a largely domain-independent fashion. The later stages take these linguistic objects as input and find domain-dependent patterns among them. There are now initial efforts to apply this technology to biomedical text. In other domains, the technology plateaued at about 60% recall and precision. Even if applications to biomedical text do no better than this, they could still prove to be of immense help to curatorial activities. (C) 2003 Elsevier Science (USA). All rights reserved.
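The staged design described in the abstract can be illustrated with a minimal sketch. It is not the author's system: plain Python functions stand in for finite-state transducers, and the word lists (DETERMINERS, VERBS, INTERACTION_VERBS) and the event format are hypothetical choices made only for this illustration. The first two stages are domain-independent (tokens, then noun groups and verbs); the last stage applies a domain-dependent pattern over their output.

```python
import re

# Hypothetical toy lexicons, assumed for illustration only.
DETERMINERS = {"the", "a", "an"}
VERBS = {"inhibits", "activates", "contains", "is"}
# Domain-dependent knowledge: which verbs signal an event of interest.
INTERACTION_VERBS = {"inhibits": "INHIBIT", "activates": "ACTIVATE"}


def tokenize(text):
    """Stage 1 (domain-independent): split raw text into word tokens."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)


def chunk(tokens):
    """Stage 2 (domain-independent): group tokens into noun groups and verbs."""
    chunks, current = [], []

    def flush():
        # Close the noun group being built, if any.
        if current:
            chunks.append(("NG", " ".join(current)))
            current.clear()

    for tok in tokens:
        low = tok.lower()
        if low in VERBS:
            flush()
            chunks.append(("V", low))
        elif low in DETERMINERS:
            flush()  # a determiner starts a new noun group and is dropped
        else:
            current.append(tok)
    flush()
    return chunks


def extract_events(chunks):
    """Stage 3 (domain-dependent): match the pattern NG, interaction verb, NG."""
    events = []
    for i in range(len(chunks) - 2):
        (t1, agent), (t2, verb), (t3, target) = chunks[i:i + 3]
        if (t1, t2, t3) == ("NG", "V", "NG") and verb in INTERACTION_VERBS:
            events.append(
                {"type": INTERACTION_VERBS[verb], "agent": agent, "target": target}
            )
    return events


def extract(text):
    """Run the cascade: each stage consumes the previous stage's output."""
    return extract_events(chunk(tokenize(text)))
```

Calling extract("The protein kinase inhibits the receptor.") runs all three stages in order. Moving to a different domain would, under these assumptions, mean replacing only INTERACTION_VERBS and the stage 3 pattern, while the earlier stages stay unchanged.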
Pages: 260-264
Page count: 5