Critical assessment of information extraction systems in biology

Cited by: 5
Authors
Blaschke, C [1]
Hirschman, L
Yeh, A
Valencia, A
Affiliations
[1] CSIC, CNB, Prot Design Grp, Madrid, Spain
[2] Mitre Corp, Bedford, MA 01730 USA
Source
COMPARATIVE AND FUNCTIONAL GENOMICS | 2003 / Vol. 4 / Issue 06
DOI
10.1002/cfg.337
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
An increasing number of groups are now working in the area of text mining, focusing on a wide range of problems and applying both statistical and linguistic approaches. However, it is not possible to compare the different approaches, because there are no common standards or evaluation criteria; in addition, the various groups are addressing different problems, often using private datasets. As a result, it is impossible to determine how well the existing systems perform, and particularly what performance level can be expected in real applications. This is similar to the situation in text processing in the late 1980s, prior to the Message Understanding Conferences (MUCs). With the introduction of a common evaluation and standardized evaluation metrics as part of these conferences, it became possible to compare approaches, to identify those techniques that did or did not work and to make progress. This progress has resulted in a common pipeline of processes and a set of shared tools available to the general research community. The field of biology is ripe for a similar experiment. Inspired by this example, the BioLINK group (Biological Literature, Information and Knowledge [1]) is organizing a CASP-like evaluation for the text data-mining community applied to biology. The two main tasks specifically address two major bottlenecks for text mining in biology: (1) the correct detection of gene and protein names in text; and (2) the extraction of functional information related to proteins based on the GO classification system. For further information and participation details, see http://www.pdg.cnb.uam.es/BioLink/BioCreative.eval.html
Copyright (C) 2003 John Wiley & Sons, Ltd.
Pages: 674-677
Page count: 4
Related papers (9 items)
[1] [Anonymous], MOUS GEN INF
[2] *ASS COMP LING, 2002, NAT LANG PROC BIOM D
[3] *GO, GO ANN HUM
[4] Mancini M. E., 2018, A Critical Assessment of Spectral Energy Distribution Fitting Tech
[5] *U PA, 2001, LANG MOD BIOL DAT
[6] *U TOK, 2002, NAT LANG PROC ONT BU
[7] 2002, KNOWLEDGE DISCOVERY
[8] BIOL LIT INFORMATION
[9] 2002, HUM LANG TECHN WORKS