Autonomous recovery in componentized Internet applications

Cited by: 14
Authors
Candea, G [1]
Kiciman, E [1]
Kawamoto, S [1]
Fox, A [1]
Affiliation
[1] Stanford Univ, Comp Syst Lab, Stanford, CA 94305 USA
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2006, Vol. 9, No. 2
DOI
10.1007/s10586-006-7562-4
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we show how to reduce downtime of J2EE applications by rapidly and automatically recovering from transient and intermittent software failures, without requiring application modifications. Our prototype combines three application-agnostic techniques: macroanalysis for fault detection and localization, microrebooting for rapid recovery, and external management of recovery actions. The individual techniques are autonomous and work across a wide range of componentized Internet applications, making them well-suited to the rapidly changing software of Internet services. The proposed framework has been integrated with JBoss, an open-source J2EE application server. Our prototype provides an execution platform that can automatically recover J2EE applications within seconds of the manifestation of a fault. Our system can provide a subset of a system's active end users with the illusion of continuous uptime, in spite of failures occurring behind the scenes, even when there is no functional redundancy in the system.
Pages: 175-190
Page count: 16