An Adaptive Threshold Framework for Event Detection Using HMM-Based Life Profiles

Cited by: 19
Authors
Chen, Chien Chin [1]
Chen, Meng Chang [2]
Chen, Ming-Syan [3]
Affiliations
[1] Natl Taiwan Univ, Dept Informat Management, Taipei 106, Taiwan
[2] Acad Sinica, Inst Informat Sci, Nankang 115, Taiwan
[3] Natl Taiwan Univ, Dept Elect Engn, Taipei 106, Taiwan
Keywords
Algorithm; Design; Experimentation; Event detection; topic detection; TDT; life profiles; hidden Markov models; clustering
DOI
10.1145/1462198.1462201
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
When an event occurs, it attracts the attention of information sources, which publish related documents throughout its lifespan. The task of event detection is to automatically identify events and their related documents in a document stream, that is, a set of chronologically ordered documents collected from various information sources. Each event generally follows its own pattern of activeness, so its status changes continuously over its lifespan. When an event is active, many related documents appear from various information sources. In contrast, when it is inactive, there are very few documents, but they are focused. Previous work on event detection did not consider the characteristics of an event's activeness and used rigid detection thresholds. We propose the concept of a life profile, represented by a hidden Markov model, to capture the activeness trends of events. In addition, we present LIPED, a general event detection framework that uses the learned life profiles and the burst-and-diverse characteristic to adjust event detection thresholds adaptively and that can be incorporated into existing event detection methods. Evaluation on the official TDT corpus under the contest rules shows that existing detection methods achieve better cost and F1 scores with LIPED than without it.
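The idea of adapting a threshold to an event's activeness state can be illustrated with a minimal sketch. This is not the LIPED implementation: the two-state model, every probability, the "few"/"many" discretization of document counts, and the threshold values are hypothetical choices made only for illustration. The sketch also assumes that an active event, whose documents are diverse, calls for a looser similarity threshold than an inactive one, whose documents are focused.

```python
import math

# Hypothetical two-state life profile; all numbers are illustrative only.
STATES = ("inactive", "active")
START = {"inactive": 0.7, "active": 0.3}
TRANS = {
    "inactive": {"inactive": 0.8, "active": 0.2},
    "active": {"inactive": 0.3, "active": 0.7},
}
# Observations are per-period document counts discretized to "few" or "many".
EMIT = {
    "inactive": {"few": 0.9, "many": 0.1},
    "active": {"few": 0.2, "many": 0.8},
}
# Assumed policy: looser similarity threshold while the event is active.
THRESHOLD = {"inactive": 0.6, "active": 0.3}


def viterbi(observations):
    """Return the most likely state sequence for the observations."""
    if not observations:
        return []
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    back = []  # one dict of back-pointers per transition
    for obs in observations[1:]:
        prev = scores
        scores, pointers = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            scores[s] = (
                prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][obs])
            )
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def adaptive_thresholds(observations):
    """Map each period to the threshold of its decoded activeness state."""
    return [THRESHOLD[s] for s in viterbi(observations)]


if __name__ == "__main__":
    counts = ["few", "few", "many", "many", "many", "few", "few"]
    for obs, thr in zip(counts, adaptive_thresholds(counts)):
        print(obs, thr)
```

A detection method with a fixed similarity threshold would use one value for every period; here the threshold follows the decoded state, which is the general behavior the abstract describes.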
Pages: 35