Improving Software Diagnosability via Log Enhancement

Cited by: 86
Authors
Yuan, Ding [1, 2]
Zheng, Jing [2 ]
Park, Soyeon [2 ]
Zhou, Yuanyuan [2 ]
Savage, Stefan [2 ]
Affiliations
[1] Univ Illinois, Urbana, IL USA
[2] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
Source
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 2012 / Vol. 30 / Issue 01
Funding
National Science Foundation (USA);
Keywords
Reliability; Languages; Log; failure diagnostics; debugging; software diagnosability; program analysis; EXECUTION;
DOI
10.1145/2110356.2110360
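Chinese Library Classification number
TP301 [Theory and Methods];
Discipline classification code
080201 [Mechanical Manufacturing and Automation];
Abstract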
Diagnosing software failures in the field is notoriously difficult, in part due to the fundamental complexity of troubleshooting any complex software system, but further exacerbated by the paucity of information that is typically available in the production setting. Indeed, for reasons of both overhead and privacy, it is common that only the run-time log generated by a system (e.g., syslog) can be shared with the developers. Unfortunately, because of their ad-hoc nature, such reports are frequently insufficient for detailed failure diagnosis. This paper seeks to improve this situation within the rubric of existing practice. We describe a tool, LogEnhancer, that automatically "enhances" existing logging code to aid in future post-failure debugging. We evaluate LogEnhancer on eight large, real-world applications and demonstrate that it can dramatically reduce the set of potential root failure causes that must be considered while imposing negligible overheads.
Pages: 28