Empirical assessment of machine learning based software defect prediction techniques

Cited by: 47
Authors
Challagulla, VUB [1]
Bastani, FB [1]
Yen, IL [1]
Paul, RA [1]
Affiliations
[1] Univ Texas, Dept Comp Sci, Richardson, TX 75083 USA
Source
WORDS 2005: 10TH IEEE INTERNATIONAL WORKSHOP ON OBJECT-ORIENTED REAL-TIME DEPENDABLE SYSTEMS, PROCEEDINGS | 2005
DOI
10.1109/WORDS.2005.32
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
A wide variety of real-time software systems, including telecontrol/telepresence systems, robotic systems, and mission planning systems, can entail dynamic code synthesis based on runtime mission-specific requirements and operating conditions. This necessitates dynamic dependability assessment to ensure that these systems will perform as specified and will not fail in catastrophic ways. One approach to achieving this is to dynamically assess the modules in the synthesized code using software defect prediction techniques. Statistical models, such as Stepwise Multi-linear Regression models and multivariate models, and machine learning approaches, such as Artificial Neural Networks, Instance-based Reasoning, Bayesian-Belief Networks, Decision Trees, and Rule Induction, have been investigated for predicting software quality. However, there is still no consensus about the best predictor model for software defects. In this paper, we evaluate different predictor models on four different real-time software defect data sets. The results show that a combination of 1R and Instance-based Learning along with the Consistency-based Subset Evaluation technique provides relatively better consistency in accuracy prediction compared to other models. The results also show that "size" and "complexity" metrics are not sufficient for accurately predicting real-time software defects.
Pages: 263-270
Page count: 8