A survey of current work in biomedical text mining

Cited by: 421
Authors
Cohen, AM [1]
Hersh, WR [1]
Affiliation
[1] Oregon Hlth & Sci Univ, Sch Med, Dept Med Informat & Clin Epidemiol, Portland, OR USA
Keywords
text-mining; bioinformatics; natural language processing
DOI
10.1093/bib/6.1.57
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The volume of published biomedical research, and therefore the underlying biomedical knowledge base, is expanding at an increasing rate. Among the tools that can aid researchers in coping with this information overload are text mining and knowledge extraction. Significant progress has been made in applying text mining to named entity recognition, text classification, terminology extraction, relationship extraction and hypothesis generation. Several research groups are constructing integrated flexible text-mining systems intended for multiple uses. The major challenge of biomedical text mining over the next 5-10 years is to make these systems useful to biomedical researchers. This will require enhanced access to full text, better understanding of the feature space of biomedical literature, better methods for measuring the usefulness of systems to users, and continued cooperation with the biomedical research community to ensure that their needs are addressed.
Pages: 57-71
Page count: 15