An aging theory for event life-cycle modeling

Cited by: 29
Authors
Chen, Chien Chin [1]
Chen, Yao-Tsung [1]
Chen, Meng Chang [1]
Affiliation
[1] Acad Sinica, Inst Sci Informat, Taipei 115, Taiwan
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS | 2007 / Vol. 37 / No. 2
Keywords
clustering; knowledge life cycle; web mining;
DOI
10.1109/TSMCA.2006.886370
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
An event can be described by a sequence of chronological documents from several information sources that together describe a story or happening. The goal of event detection and tracking is to automatically identify events and their associated documents during their life cycles. Conventional document clustering and classification techniques cannot effectively detect and track sequential events, as they ignore the temporal relationships among documents related to an event. The life cycle of an event is analogous to that of a living being. With abundant nourishment (i.e., related documents for the event), the life cycle is prolonged; conversely, an event or living being fades away when nourishment is exhausted. Improper tracking algorithms often unnecessarily prolong or shorten the life cycle of detected events. In this paper, we propose an aging theory to model the life cycle of sequential events, which incorporates a traditional single-pass clustering algorithm to detect and track events. Our experimental results show that the proposed method achieves better overall performance for both long-running and short-term events than previous approaches. Moreover, we find that the aging parameters of the aging schemes are profile dependent and that using proper profile-specific aging parameters further improves detection and tracking performance.
Pages: 237 / 248
Number of pages: 12
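
The abstract's central idea, single-pass clustering in which each event gains "nourishment" from related documents and fades when none arrive, can be illustrated with a minimal Python sketch. This is not the paper's actual energy function or parameterization. The linear gain and decay, the names `track`, `Event`, `alpha`, `beta`, `sim_threshold`, and `initial_energy`, and the raw term-count vectors are all assumptions made for illustration only.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Event:
    centroid: Counter            # summed term counts of member documents
    energy: float                # hypothetical "life" value of the event
    docs: list = field(default_factory=list)
    alive: bool = True


def track(batches, sim_threshold=0.3, alpha=1.0, beta=0.5, initial_energy=1.0):
    """Single-pass clustering with a simple (assumed) linear aging rule.

    batches: one list of (doc_id, text) pairs per time slot.
    alpha:   assumed factor converting similarity into energy (nourishment).
    beta:    assumed energy lost by every living event per time slot.
    """
    events = []
    for batch in batches:
        for doc_id, text in batch:
            vec = Counter(text.lower().split())
            # Find the most similar event that is still alive.
            best, best_sim = None, 0.0
            for ev in events:
                if ev.alive:
                    s = cosine(vec, ev.centroid)
                    if s > best_sim:
                        best, best_sim = ev, s
            if best is not None and best_sim >= sim_threshold:
                # Related document: update the event and nourish it.
                best.centroid.update(vec)
                best.docs.append(doc_id)
                best.energy += alpha * best_sim
            else:
                # No living event is similar enough: start a new event.
                events.append(Event(Counter(vec), initial_energy, [doc_id]))
        # End of the time slot: every living event ages.
        for ev in events:
            if ev.alive:
                ev.energy -= beta
                if ev.energy <= 0:
                    ev.alive = False
    return events


batches = [
    [("d1", "quake hits city")],
    [("d2", "quake city rescue")],
    [],
    [],
    [("d3", "quake hits city")],
]
events = track(batches)
for ev in events:
    print(ev.docs, ev.alive)
```

In this toy run the first event absorbs `d1` and `d2`, then dies during the two empty time slots, so `d3` starts a new event even though its text matches `d1`. A plain single-pass clusterer without aging would instead attach `d3` to the old cluster, which corresponds to the unnecessarily prolonged life cycle the abstract warns about.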