Learning to Rank Relevant Files for Bug Reports using Domain Knowledge

Cited by: 226
Authors
Ye, Xin [1]
Bunescu, Razvan [1]
Liu, Chang [1]
Affiliations
[1] Ohio Univ, Sch Elect Engn & Comp Sci, Athens, OH 45701 USA
Source
22ND ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (FSE 2014) | 2014
Keywords
bug reports; software maintenance; learning to rank; probabilistic ranking; predicting faults; location
DOI
10.1145/2635868.2635874
Chinese Library Classification
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
When a new bug report is received, developers usually need to reproduce the bug and perform code reviews to find the cause, a process that can be tedious and time-consuming. A tool for ranking all the source files of a project with respect to how likely they are to contain the cause of the bug would enable developers to narrow down their search and could potentially lead to a substantial increase in productivity. This paper introduces an adaptive ranking approach that leverages domain knowledge through functional decompositions of source code files into methods, API descriptions of library components used in the code, the bug-fixing history, and the code change history. Given a bug report, the ranking score of each source file is computed as a weighted combination of an array of features encoding domain knowledge, where the weights are trained automatically on previously solved bug reports using a learning-to-rank technique. We evaluated our system on six large-scale open-source Java projects, using the before-fix version of the project for every bug report. The experimental results show that the newly introduced learning-to-rank approach significantly outperforms two recent state-of-the-art methods in recommending relevant files for bug reports. In particular, our method makes correct recommendations within the top 10 ranked source files for over 70% of the bug reports in the Eclipse Platform and Tomcat projects.
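The scoring scheme described in the abstract, a weighted combination of features per source file followed by a ranking, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the file names, the three feature values per file, and the hand-picked weights are hypothetical, and in the paper the weights are learned from previously solved bug reports with a learning-to-rank technique rather than set by hand. The `hit_at_k` helper mirrors the "correct recommendation within the top 10" style of evaluation, again only as a sketch.

```python
from typing import Dict, List, Set


def score(features: List[float], weights: List[float]) -> float:
    """Weighted combination of feature values for one (bug report, file) pair."""
    return sum(w * f for w, f in zip(weights, features))


def rank_files(file_features: Dict[str, List[float]],
               weights: List[float]) -> List[str]:
    """Return file paths sorted from highest to lowest ranking score."""
    return sorted(file_features,
                  key=lambda path: score(file_features[path], weights),
                  reverse=True)


def hit_at_k(ranked: List[str], relevant: Set[str], k: int) -> bool:
    """True if at least one relevant file appears in the top k of the ranking."""
    return any(path in relevant for path in ranked[:k])


# Hypothetical feature vectors for one bug report. The three positions stand in
# for, e.g., text similarity, API-description similarity, and bug-fixing
# history; these names and values are assumptions, not the paper's data.
features = {
    "A.java": [0.2, 0.1, 0.0],
    "B.java": [0.8, 0.5, 1.0],
    "C.java": [0.4, 0.9, 0.0],
}
weights = [0.5, 0.3, 0.2]  # hand-picked here; learned from data in the paper

ranking = rank_files(features, weights)
```

With these assumed values, `B.java` receives the highest score, so a bug report whose fix touched `B.java` would count as a hit at any k of 1 or more.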
Pages: 689-699
Page count: 11