Using mutation analysis for assessing and comparing testing coverage criteria

Cited by: 297
Authors
Andrews, James H. [1 ]
Briand, Lionel C.
Labiche, Yvan
Namin, Akbar Siami
Affiliations
[1] Univ Western Ontario, Dept Comp Sci, London, ON N6A 5B7, Canada
[2] Simula Res Lab, Dept Software Engn, Fornebu, N-1325 Lysaker, Norway
[3] Carleton Univ, Software Qual Engn Lab, Ottawa, ON K1S 5B6, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
testing and debugging; testing strategies; test coverage of code; experimental design;
DOI
10.1109/TSE.2006.83
CLC Number
TP31 [Computer Software];
Subject Classification
081202; 0835;
Abstract
The empirical assessment of test techniques plays an important role in software testing research. One common practice is to seed faults in subject software, either manually or by using a program that generates all possible mutants based on a set of mutation operators. The latter allows the systematic, repeatable seeding of large numbers of faults, thus facilitating the statistical analysis of the fault detection effectiveness of test suites; however, we do not know whether empirical results obtained this way lead to valid, representative conclusions. Focusing on four common control and data flow criteria (Block, Decision, C-Use, and P-Use), this paper investigates this important issue based on a medium-size industrial program with a comprehensive pool of test cases and known faults. Based on the data available thus far, the results are very consistent across the investigated criteria as they show that the use of mutation operators yields trustworthy results: generated mutants can be used to predict the detection effectiveness of real faults. Applying such a mutation analysis, we then investigate the relative cost and effectiveness of the above-mentioned criteria by revisiting fundamental questions regarding the relationships between fault detection, test suite size, and control/data flow coverage. Although such questions have been partially investigated in previous studies, we can use a large number of mutants, which helps decrease the impact of random variation in our analysis and allows us to use a different analysis approach. Our results are then compared with published studies, plausible reasons for the differences are provided, and the research leads us to suggest a way to tune the mutation analysis process to possible differences in fault detection probabilities in a specific environment.
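To make the core idea of the abstract concrete, the following is a minimal, hypothetical sketch of mutation analysis: mutants of a program are generated by small syntactic changes, each mutant is run against a test suite, and the fraction of mutants "killed" (detected by at least one test) gives the mutation score used as a proxy for fault detection effectiveness. The toy function, mutants, and test suite here are illustrative assumptions, not the paper's actual subjects (the study used an industrial C program and standard mutation operators).

```python
# Illustrative mutation-analysis sketch (toy example, not the paper's setup).

def original(a, b):
    return a + b

# Hypothetical mutants, as an arithmetic-operator-replacement and a
# constant-perturbation operator might produce them:
mutants = [
    lambda a, b: a - b,      # '+' replaced by '-'
    lambda a, b: a * b,      # '+' replaced by '*'
    lambda a, b: a + b + 1,  # result perturbed by a constant
]

# A test suite: (inputs, expected output) pairs.
test_suite = [((0, 0), 0), ((2, 2), 4)]

def kills(mutant, suite):
    """A mutant is 'killed' if some test observes a wrong output."""
    return any(mutant(*args) != expected for args, expected in suite)

killed = sum(kills(m, test_suite) for m in mutants)
score = killed / len(mutants)  # mutation score = killed / generated
print(f"mutation score: {score:.2f}")  # prints "mutation score: 0.67"
```

Note that the `a * b` mutant survives this particular suite (both tests happen to produce the same output as the original), which is exactly the kind of signal the paper exploits: a suite's mutation score can stand in for its ability to detect real faults.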
Pages: 608-624
Page count: 17