Quantitative analysis of faults and failures in a complex software system

Cited by: 360
Authors
Fenton, NE
Ohlsson, N
Affiliations
[1] Queen Mary Univ London, Fac Informat & Math Sci, Dept Comp Sci, RADAR Grp, London E1 4NS, England
[2] GratisTel Int AB, S-10026 Stockholm, Sweden
Funding
Engineering and Physical Sciences Research Council (UK)
Keywords
software faults and failures; software metrics; empirical studies;
DOI
10.1109/32.879815
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
The dearth of published empirical data on major industrial systems has been one of the reasons that software engineering has failed to establish a proper scientific basis. In this paper, we hope to provide a small contribution to the body of empirical knowledge. We describe a number of results from a quantitative study of faults and failures in two releases of a major commercial system. We tested a range of basic software engineering hypotheses relating to: The Pareto principle of distribution of faults and failures; the use of early fault data to predict later fault and failure data; metrics for fault prediction; and benchmarking fault data. For example, we found strong evidence that a small number of modules contain most of the faults discovered in prerelease testing and that a very small number of modules contain most of the faults discovered in operation. However, in neither case is this explained by the size or complexity of the modules. We found no evidence to support previous claims relating module size to fault density nor did we find evidence that popular complexity metrics are good predictors of either fault-prone or failure-prone modules. We confirmed that the number of faults discovered in prerelease testing is an order of magnitude greater than the number discovered in 12 months of operational use. We also discovered fairly stable numbers of faults discovered at corresponding testing phases. Our most surprising and important result was strong evidence of a counter-intuitive relationship between pre- and postrelease faults: Those modules which are the most fault-prone prerelease are among the least fault-prone postrelease, while conversely, the modules which are most fault-prone postrelease are among the least fault-prone prerelease. This observation has serious ramifications for the commonly used fault density measure. 
Not only is it misleading to use it as a surrogate quality measure, but its previous extensive use in metrics studies is shown to be flawed. Our results provide data points in building up an empirical picture of the software development process. However, even the strong results we have observed are not generally valid as software engineering laws because they fail to take account of basic explanatory data, notably testing effort and operational usage. After all, a module which has not been tested or used will reveal no faults, irrespective of its size, complexity, or any other factor.
Pages: 797-814
Page count: 18