Developing and evaluating an IT specification extraction system

Cited by: 4
Authors
Yang, Chyan [1]
Chen, Liang-Chu
Peng, Chun-Yen
Affiliations
[1] Natl Chiao Tung Univ, Inst Business & Management, Taipei, Taiwan
[2] Natl Chiao Tung Univ, Inst Informat Management, Taipei, Taiwan
[3] Taiwan Semicond Mfg Co, Hsinchu, Taiwan
Keywords
knowledge management; communication technologies
DOI
10.1108/02640470610714251
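Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501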
Abstract
Purpose - This paper seeks to establish an extraction system for information technology (IT) product specifications, named ITSIES, which combines natural language processing (NLP) with the ontology concept, and to evaluate the system's effectiveness in advance. Design/methodology/approach - The development of the system is based on a prototype design and performance validation. This study adopts four classes of IT specification (PC, Unix server, Monitor, and Printer) that follow IBM's and HP's product lines as the baseline information in order to construct the extraction system with the GATE (General Architecture for Text Engineering) tools and to test it on IT product specifications from other brands and in other patterns. Additionally, indices such as precision, recall, and F-measure are adopted as the metrics for evaluating system performance. Findings - The results show that the average recall, precision, and F-measure are all over 90 per cent, revealing that the JAPE (Java Annotation Patterns Engine) grammar rules in the IT domain are reasonably good and generally in line with expectations. Originality/value - The paper proposes an integrative framework to examine IT product specification information and demonstrates that the system is effective for IT applications.
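The abstract names precision, recall, and F-measure as its evaluation metrics but does not spell them out. The following is a minimal sketch of how these three values are commonly computed for an extraction task, assuming the balanced F-measure (F1) and exact matching of attribute/value pairs. The function name, the matching rule, and the sample specification data are illustrative assumptions and are not taken from the paper.

```python
def precision_recall_f1(extracted, gold):
    """Compare extracted items against a hand-labelled gold standard.

    Both arguments are iterables of hashable items; here each item is a
    hypothetical (attribute, value) pair from a product specification.
    """
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)  # items that match the gold standard exactly

    # Precision: share of extracted items that are correct.
    precision = correct / len(extracted) if extracted else 0.0
    # Recall: share of gold-standard items that were found.
    recall = correct / len(gold) if gold else 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented sample data, for illustration only.
gold = {("CPU", "3.0 GHz"), ("RAM", "512 MB"), ("HDD", "80 GB"), ("Monitor", "17 inch")}
extracted = {("CPU", "3.0 GHz"), ("RAM", "512 MB"), ("HDD", "160 GB")}

p, r, f = precision_recall_f1(extracted, gold)
print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

In this invented example two of the three extracted pairs are correct and two of the four gold pairs are found, so precision is 2/3, recall is 1/2, and the F-measure is 4/7.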
Pages: 832-846
Number of pages: 15