Predicting source code changes by mining change history

Cited by: 292
Authors
Ying, ATT
Murphy, GC
Ng, R
Chu-Carroll, MC
Affiliations
[1] IBM TJ Watson Res Ctr, Hawthorne, NY 10532 USA
[2] Univ British Columbia, Dept Comp Sci, Vancouver, BC V6T 1Z4, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
enhancement; maintainability; clustering; classification; association rules; data mining;
DOI
10.1109/TSE.2004.52
Chinese Library Classification
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
Software developers are often faced with modification tasks that involve source which is spread across a code base. Some dependencies between source code, such as those between source code written in different languages, are difficult to determine using existing static and dynamic analyses. To augment existing analyses and to help developers identify relevant source code during a modification task, we have developed an approach that applies data mining techniques to determine change patterns (sets of files that were changed together frequently in the past) from the change history of the code base. Our hypothesis is that the change patterns can be used to recommend potentially relevant source code to a developer performing a modification task. We show that this approach can reveal valuable dependencies by applying the approach to the Eclipse and Mozilla open source projects and by evaluating the predictability and interestingness of the recommendations produced for actual modification tasks on these systems.
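The idea of a change pattern can be illustrated with a minimal sketch. The code below is not the authors' algorithm: it assumes the change history has already been grouped into transactions (lists of files changed together), counts only pairs of files, and uses a hypothetical `min_support` threshold. The function names, file paths, and sample history are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_co_change_pairs(transactions, min_support=2):
    """Count how often each pair of files was changed in the same transaction.

    Simplifying assumption: only pairs are mined, not larger file sets.
    """
    pair_counts = Counter()
    for files in transactions:
        # Sort so that each unordered pair has a single canonical form.
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    # Keep only pairs that co-changed at least min_support times.
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


def recommend(changed_file, patterns):
    """Return files that frequently co-changed with changed_file."""
    scores = Counter()
    for (a, b), n in patterns.items():
        if a == changed_file:
            scores[b] += n
        elif b == changed_file:
            scores[a] += n
    # Most frequent first; ties broken by file name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked]


# Hypothetical change history: each inner list is one transaction.
history = [
    ["ui/Menu.java", "ui/menu.properties"],
    ["ui/Menu.java", "ui/menu.properties", "core/Action.java"],
    ["core/Action.java", "core/Registry.java"],
    ["ui/Menu.java", "core/Action.java"],
]
patterns = mine_co_change_pairs(history, min_support=2)
print(recommend("ui/Menu.java", patterns))
```

Given a file that a developer has started to modify, `recommend` returns the other files that co-changed with it at least `min_support` times. Because the input is only file names, a pairing such as a Java file with a properties file can surface even though no language-level analysis connects them.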
Pages: 574-586
Page count: 13