Extracting Named Entities and Relating Them over Time Based on Wikipedia

Cited by: 1

Authors:

Bhole, Abhijit [1]

Fortuna, Blaz [2]

Grobelnik, Marko [2]

Mladenic, Dunja [2]

Affiliations:

[1] Indian Inst Technol, Bombay 400076, Maharashtra, India

[2] Jozef Stefan Inst, Ljubljana 1000, Slovenia

Source:

INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS | 2007 / Vol. 31 / No. 04

Keywords:

text mining; document categorization; information extraction

DOI:

Not available

Chinese Library Classification number:

TP31 [Computer Software]

Discipline classification code:

081202; 0835

Abstract:

This paper presents an approach to mining information relating people, places, organizations and events extracted from Wikipedia and linking them on a time scale. The approach consists of two phases: (1) identifying relevant pages: categorizing the articles as containing people, places or organizations; (2) generating a timeline: linking named entities and extracting events and their time frames. We illustrate the proposed approach on 1.7 million Wikipedia articles.

Citation

Pages: 463-468

Page count: 6

9 references in total

[1] Bhole A., 2007, P 10 INT MULTI INF S, P177

[2] Grobelnik M., 2007, TEXT MINING RECIPES

[3] Joachims T., 1999, ADV KERNEL METHODS S

[4] Lenat Douglas B., 1995, 38 ACM, V38

[5] Mladenic, D.; Grobelnik, M. Feature selection on hierarchy of web documents [J]. DECISION SUPPORT SYSTEMS, 2003, 35 (01): 45-87

[6] Olston, Christopher; Chi, Ed H. ScentTrails: Integrating browsing and searching on the Web [J]. ACM Transactions on Computer-Human Interaction, 2003, 10 (03): 177-197

[7] Sebastiani, F. Machine learning in automated text categorization [J]. ACM COMPUTING SURVEYS, 2002, 34 (01): 1-47

[8] Shah P., 2006, P 19 INT FLAIRS C, P153

[9] Zaragoza H., 2007, SEMANTICALLY ANNOTAT
