Term identification in the biomedical literature

Cited by: 166
Authors
Krauthammer, M
Nenadic, G
Affiliations
[1] Columbia Univ, Columbia Genome Ctr, Dept Biomed Informat, New York, NY 10027 USA
[2] Univ Manchester, Dept Computat, Manchester M60 1QD, Lancs, England
[3] Natl Ctr Text Min, Manchester, Lancs, England
Keywords
term identification; term recognition; term classification; term mapping; acronym recognition; biomedical literature;
DOI
10.1016/j.jbi.2004.08.004
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Sophisticated information technologies are needed for effective data acquisition and integration from a growing body of the biomedical literature. Successful term identification is key to getting access to the stored literature information, as it is the terms (and their relationships) that convey knowledge across scientific articles. Due to the complexities of a dynamically changing biomedical terminology, term identification has been recognized as the current bottleneck in text mining and, as a consequence, has become an important research topic in both the natural language processing and biomedical communities. This article gives an overview of state-of-the-art approaches to term identification. The process of identifying terms is analysed in three steps: term recognition, term classification, and term mapping. For each step, the main approaches and general trends, along with the major problems, are discussed. By assessing previous work in the context of the overall term identification process, the review also tries to delineate needs for future work in the field. (C) 2004 Published by Elsevier Inc.
Pages: 512-526
Page count: 15
Related papers
75 in total
  • [1] ADAR E, 2002, SIMPLE ROBUST ABBREV
  • [2] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [3] BASIC LOCAL ALIGNMENT SEARCH TOOL
    ALTSCHUL, SF
    GISH, W
    MILLER, W
    MYERS, EW
    LIPMAN, DJ
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) : 403 - 410
  • [4] Ananiadou S, 1994, P 15 INT C COMP LING, P1034, DOI DOI 10.3115/991250.991317
  • [5] ANANIADOU S, 2000, GENOME INFORMATICS S
  • [6] Automatic extraction of keywords from scientific text: application to the knowledge domain of protein families
    Andrade, MA
    Valencia, A
    [J]. BIOINFORMATICS, 1998, 14 (07) : 600 - 607
  • [7] [Anonymous], P 5 NLPRS
  • [8] [Anonymous], 2001, SPOTTING DISCOVERING
  • [9] [Anonymous], INT J DIGITAL LIB
  • [10] [Anonymous], P COLING