Product named entity recognition in Chinese text

Cited by: 19
Authors
Zhao, Jun [1]
Liu, Feifan [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100080, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
information extraction; product named entity recognition; hierarchical hidden Markov model
DOI
10.1007/s10579-008-9066-8
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
There are many expressive and structural differences between product names and general named entities such as person names, location names and organization names. To date, there has been little research on product named entity recognition (NER), which is crucial and valuable for information extraction in the field of market intelligence. This paper focuses on product NER (PRO NER) in Chinese text. First, we describe our efforts on data annotation, including well-defined specifications, data analysis and development of a corpus with annotated product named entities. Second, a hierarchical hidden Markov model-based approach to PRO NER is proposed and evaluated. Extensive experiments show that the proposed method outperforms the cascaded maximum entropy model and obtains promising results on the data sets of two different electronic product domains (digital and cell phone).
Pages: 197-217
Number of pages: 21