How well does test case prioritization integrate with statistical fault localization?

Cited by: 49
Authors
Jiang, Bo [2]
Zhang, Zhenyu [3]
Chan, W. K. [1]
Tse, T. H. [4]
Chen, Tsong Yueh [5]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
[2] Beihang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China
[3] Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing, Peoples R China
[4] Univ Hong Kong, Dept Comp Sci, Pokfulam, Hong Kong, Peoples R China
[5] Swinburne Univ Technol, Ctr Software Anal & Testing, Melbourne, Vic, Australia
Funding
Australian Research Council;
Keywords
Software process integration; Continuous integration; Test case prioritization; Statistical fault localization; Adaptive random testing; Coverage; FAMILY;
DOI
10.1016/j.infsof.2012.01.006
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Context: Effective test case prioritization shortens the time to detect failures, and yet the use of fewer test cases may compromise the effectiveness of subsequent fault localization.
Objective: The paper aims to determine whether several previously identified effectiveness factors of test case prioritization techniques, namely strategy, coverage granularity, and time cost, have observable consequences on the effectiveness of statistical fault localization techniques.
Method: This paper uses a controlled experiment to examine these factors. The experiment includes 16 test case prioritization techniques and four statistical fault localization techniques, using the Siemens suite of programs as well as grep, gzip, sed, and flex as subjects. The experiment studies the effects on the percentage of code examined to locate faults in these benchmark subjects after a given number of failures have been observed.
Results: We find that if testers have a budgetary concern about the number of test cases for regression testing, the use of test case prioritization can save up to 40% of test case executions for commit builds without significantly affecting the effectiveness of fault localization. A statistical fault localization technique using a smaller fraction of a prioritized test suite is found to compromise its effectiveness seriously. Despite the presence of some variations, the inclusion of more failed test cases will generally improve the fault localization effectiveness during the integration process. Interestingly, during the variation periods, adding more failed test cases actually deteriorates the fault localization effectiveness. In terms of strategies, Random is found to be the most effective, followed by the ART and Additional strategies, while the Total strategy is the least effective. We do not observe sufficient empirical evidence to conclude that using different coverage granularity levels has different overall effects.
Conclusion: The paper empirically identifies that strategy and time cost of test case prioritization techniques are key factors affecting the effectiveness of statistical fault localization, while coverage granularity is not a significant factor. It also identifies a mid-range deterioration in fault localization effectiveness when adding more test cases to facilitate debugging. © 2012 Elsevier B.V. All rights reserved.
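The Total and Additional strategies named in the abstract are commonly described as greedy, coverage-based orderings. The following is a minimal sketch of that usual description, not the authors' implementation: the function names `total_order` and `additional_order`, the statement identifiers, the toy coverage data, and the alphabetical tie-breaking are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical coverage map: test name -> set of covered statement ids.
Coverage = Dict[str, Set[str]]


def total_order(cov: Coverage) -> List[str]:
    """Total strategy: order tests by how many statements each covers."""
    # Ties are broken by test name so the result is deterministic (assumption).
    return sorted(cov, key=lambda t: (-len(cov[t]), t))


def additional_order(cov: Coverage) -> List[str]:
    """Additional strategy: repeatedly pick the test that covers the most
    statements not yet covered by the tests already selected."""
    all_stmts: Set[str] = set().union(*cov.values())
    uncovered = set(all_stmts)
    remaining = set(cov)
    order: List[str] = []
    while remaining:
        best = min(remaining, key=lambda t: (-len(cov[t] & uncovered), t))
        if not cov[best] & uncovered:
            if uncovered != all_stmts:
                # Nothing new can be gained: reset and continue greedily.
                uncovered = set(all_stmts)
                continue
            # The remaining tests cover nothing at all; append them by name.
            order.extend(sorted(remaining))
            break
        order.append(best)
        remaining.remove(best)
        uncovered -= cov[best]
    return order


# Toy data, invented for this example only.
cov = {
    "t1": {"s1", "s2", "s3"},
    "t2": {"s1", "s2", "s3", "s4"},
    "t3": {"s5"},
    "t4": {"s4", "s5"},
}
print(total_order(cov))       # ['t2', 't1', 't4', 't3']
print(additional_order(cov))  # ['t2', 't3', 't1', 't4']
```

On this toy data the two orderings differ after the first test: Total places `t1` second because it covers three statements, while Additional places `t3` second because it is the first remaining test (by name) that adds new coverage. A fault localization step fed a truncated prefix of each ordering would therefore see different tests.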
Pages: 739-758
Page count: 20