U-Compare: A modular NLP workflow construction and evaluation system

Cited by: 18
Authors
Kano, Y. [1]
Miwa, M. [1]
Cohen, K. B. [2]
Hunter, L. E. [2]
Ananiadou, S. [3, 4]
Tsujii, J. [1]
Affiliations
[1] Univ Tokyo, Dept Comp Sci, Bunkyo Ku, Tokyo 1130033, Japan
[2] Univ Colorado, Sch Med, Computat Biosci Program, Aurora, CO 80045 USA
[3] Univ Manchester, Sch Comp Sci, Manchester M1 7DN, Lancs, England
[4] Natl Ctr Text Min, Manchester M1 7DN, Lancs, England
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC)
DOI
10.1147/JRD.2011.2105691
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
During the development of natural language processing (NLP) applications, developers are often required to repeatedly perform certain tasks. Among these tasks, workflow comparison and evaluation are two of the most crucial because they help to discover the nature of NLP problems, which is important from both scientific and engineering perspectives. Although these tasks can potentially be automated, developers tend to perform them manually, repeatedly writing similar pieces of code. We developed tools to largely automate these subtasks. Promoting component reuse is another way to further increase NLP development efficiency. Building on the interoperability-enhancing Unstructured Information Management Architecture (UIMA) framework, we have collected a large library of interoperable resources, developed several workflow creation utilities, added a customizable comparison and evaluation system, and built visualization utilities. These tools are modularly designed to accommodate various use cases and potential reuse scenarios. By integrating all these features into our U-Compare system, we hope to increase NLP developer efficiency. Simple to use and directly runnable from a web browser, U-Compare has already found uses in a range of applications.
Pages: 10