Where Should We Fix This Bug? A Two-Phase Recommendation Model

Citations: 139
Authors
Kim, Dongsun [1]
Tao, Yida [1]
Kim, Sunghun [1]
Zeller, Andreas [2]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
[2] Univ Saarlandes Informat, D-66123 Saarbrucken, Germany
Keywords
Bug reports; machine learning; patch file prediction; probabilistic ranking; predicting faults; impact analysis; errors; tool
DOI
10.1109/TSE.2013.24
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
To support developers in debugging and locating bugs, we propose a two-phase prediction model that uses bug reports' contents to suggest the files likely to be fixed. In the first phase, our model checks whether the given bug report contains sufficient information for prediction. If so, the model proceeds to predict files to be fixed, based on the content of the bug report. In other words, our two-phase model "speaks up" only if it is confident of making a suggestion for the given bug report; otherwise, it remains silent. In the evaluation on the Mozilla "Firefox" and "Core" packages, the two-phase model was able to make predictions for almost half of all bug reports; on average, 70 percent of these predictions pointed to the correct files. In addition, we compared the two-phase model with three other prediction models: the Usual Suspects, the one-phase model, and BugScout. The two-phase model manifests the best prediction performance.
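The "speak up or stay silent" behavior described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the class name `TwoPhaseSketch`, the `min_known_tokens` threshold, the token-overlap scoring, and the file paths are all hypothetical. In particular, phase 1 is reduced to a simple threshold on how many report tokens were seen in previously fixed reports, whereas the paper trains a model to decide whether a report is predictable.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Lowercase a report and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


class TwoPhaseSketch:
    """Toy two-phase recommender; both phases are illustrative stand-ins."""

    def __init__(self, min_known_tokens=2):
        # Hypothetical threshold: how many report tokens must match a
        # file's history before the model is willing to "speak up".
        self.min_known_tokens = min_known_tokens
        self.file_tokens = defaultdict(Counter)

    def fit(self, history):
        """history: iterable of (report_text, [fixed_file, ...]) pairs."""
        for text, files in history:
            tokens = tokenize(text)
            for path in files:
                self.file_tokens[path].update(tokens)
        return self

    def _scores(self, text):
        # Score each file by the number of distinct report tokens that
        # appeared in earlier reports fixed in that file.
        tokens = set(tokenize(text))
        return {
            path: sum(1 for t in tokens if t in counts)
            for path, counts in self.file_tokens.items()
        }

    def predict(self, text, top_k=1):
        scores = self._scores(text)
        # Phase 1: stay silent when the report carries too little signal.
        if not scores or max(scores.values()) < self.min_known_tokens:
            return None
        # Phase 2: recommend the best-matching files (ties broken by path).
        ranked = sorted(scores, key=lambda p: (-scores[p], p))
        return ranked[:top_k]


# Hypothetical training history; the reports and paths are made up.
history = [
    ("crash when opening bookmark toolbar", ["browser/bookmarks.js"]),
    ("bookmark toolbar icons missing", ["browser/bookmarks.js"]),
    ("layout engine crash on table reflow", ["layout/table.cpp"]),
]
model = TwoPhaseSketch().fit(history)
print(model.predict("bookmark toolbar is broken"))  # a file list
print(model.predict("something is wrong"))          # None: model stays silent
```

Returning `None` instead of a low-confidence guess mirrors the trade-off the abstract reports: fewer bug reports receive a prediction, but the predictions that are made are more likely to point to the correct files.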
Pages: 1597-1610
Page count: 14