A survey of fault localization techniques in computer networks

Cited by: 232
Authors
Steinder, M [1]
Sethi, AS [1]
Affiliation
[1] IBM Corp, TJ Watson Res Ctr, Hawthorne, NY USA
Keywords
fault localization; event correlation; root cause analysis
DOI
10.1016/j.scico.2004.01.010
Chinese Library Classification number
TP31 [Computer Software]
Subject classification code
081202; 0835
Abstract
Fault localization, a central aspect of network fault management, is a process of deducing the exact source of a failure from a set of observed failure indications. It has been a focus of research activity since the advent of modern communication systems, which produced numerous fault localization techniques. However, as communication systems evolved, becoming more complex and offering new capabilities, the requirements imposed on fault localization techniques have changed as well. It is fair to say that despite this research effort, fault localization in complex communication systems remains an open research problem. This paper discusses the challenges of fault localization in complex communication systems and presents an overview of solutions proposed in the course of the last ten years, while discussing their advantages and shortcomings. The survey is followed by the presentation of potential directions for future research in this area. (C) 2004 Elsevier B.V. All rights reserved.
Pages: 165-194
Page count: 30