Probabilistic fault diagnosis in communication systems through incremental hypothesis updating

Cited by: 75
Authors
Steinder, M
Sethi, AS
Affiliations
[1] IBM Corp, TJ Watson Res Ctr, Hawthorne, NY 10532 USA
[2] Univ Delaware, Newark, DE 19716 USA
Keywords
fault localization; probabilistic reasoning; event correlation
DOI
10.1016/j.comnet.2004.01.007
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a probabilistic event-driven fault localization technique, which uses a probabilistic symptom-fault map as a fault propagation model. The technique isolates the most probable set of faults through incremental updating of a symptom-explanation hypothesis. At any time, it provides a set of alternative hypotheses, each of which is a complete explanation of the set of symptoms observed thus far. The hypotheses are ranked according to a measure of their goodness. The technique allows multiple simultaneous independent faults to be identified and incorporates both negative and positive symptoms in the analysis. As shown in a simulation study, the technique offers close-to-optimal accuracy and is resilient both to noise in the symptom data and to inaccuracies of the probabilistic fault propagation model. (C) 2004 Elsevier B.V. All rights reserved.
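The incremental updating described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the map `CAUSES`, the priors `PRIOR`, the noisy-OR `goodness` score, and the pruning limit `keep` are all hypothetical choices made for illustration, and negative symptoms are ignored. Each newly observed symptom extends any hypothesis that cannot yet explain it, after which the hypotheses are re-ranked.

```python
from math import prod

# Hypothetical symptom-fault map: CAUSES[fault][symptom] = P(symptom | fault).
CAUSES = {
    "link_a": {"s1": 0.9, "s2": 0.5},
    "link_b": {"s2": 0.8, "s3": 0.7},
    "router": {"s1": 0.3, "s3": 0.4},
}
# Hypothetical prior fault probabilities.
PRIOR = {"link_a": 0.01, "link_b": 0.02, "router": 0.005}


def goodness(hypothesis, observed):
    """Assumed score: fault priors times a noisy-OR likelihood per observed symptom."""
    score = prod(PRIOR[f] for f in hypothesis)
    for s in observed:
        # Probability that no fault in the hypothesis produces symptom s.
        miss = prod(1.0 - CAUSES[f].get(s, 0.0) for f in hypothesis)
        score *= 1.0 - miss
    return score


def update(hypotheses, observed, symptom, keep=5):
    """Extend each hypothesis so it explains `symptom`, then rank and prune."""
    observed = observed | {symptom}
    candidates = set()
    for h in hypotheses:
        if any(symptom in CAUSES[f] for f in h):
            candidates.add(h)  # already explains the new symptom
        else:
            for f, effects in CAUSES.items():
                if symptom in effects:
                    candidates.add(h | {f})  # add one fault that can cause it
    ranked = sorted(candidates, key=lambda h: goodness(h, observed), reverse=True)
    return ranked[:keep], observed


# Start from the empty hypothesis and process symptoms one at a time.
hypotheses, observed = [frozenset()], frozenset()
for s in ["s1", "s3"]:
    hypotheses, observed = update(hypotheses, observed, s)
```

With these made-up numbers, the single-fault hypothesis `{"router"}` ranks first after both symptoms arrive, ahead of the two-fault alternatives, because its higher joint prior outweighs its lower per-symptom likelihoods.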
Pages: 537-562
Page count: 26