Building reliable mobile-aware applications using the Rover toolkit

Cited by: 3
Authors
Joseph, Anthony D. [1 ]
Kaashoek, M. Frans [1 ]
Affiliations
[1] MIT, Comp Sci Lab, Cambridge, MA 02139 USA
Keywords
Mobile Host; Server Failure; Client Application; Stable Variable; Failure Recovery
DOI
10.1023/A:1019142209023
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper discusses extensions to the Rover toolkit for constructing reliable mobile-aware applications. The extensions improve upon the existing failure model, which addresses client or communication failures and guarantees reliable message delivery from clients to server, but does not address server failures (e.g., the loss of an incoming message due to server failure) (Joseph et al., 1997). Due to the unpredictable, intermittent communication connectivity typically found in mobile client environments, it is inappropriate to make clients responsible for guaranteeing request completion at servers. The extensions discussed in this paper provide both system- and language-level support for reliable operation in the form of stable logging of each message received by a server, per-application stable variables, programmer-supplied failure recovery procedures, server process failure detection, and automatic server process restart. The design and implementation of fault-tolerance support is optimized for high performance in the normal case (network connectivity provided by a high latency, low bandwidth, wireless link): measurements show a best-case overhead of less than 7% for a reliable null RPC over wired and cellular dialup links. Experimental results from both micro-benchmarks and applications, such as the Rover Web Browser proxy, show that support for reliable operation can be provided at an overhead of only a few percent of execution time during normal operation.
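The stable-logging and recovery-procedure ideas named in the abstract can be illustrated with a minimal sketch. This is not Rover's API or implementation: the names `StableMessageLog`, `Server`, and `redo_all`, the JSON-lines log format, and the use of `fsync` are all assumptions chosen for illustration. The sketch only shows the general pattern of forcing each received message to stable storage before acknowledging it, then handing the logged messages to a programmer-supplied recovery procedure after a restart.

```python
import json
import os
import tempfile


class StableMessageLog:
    """Hypothetical append-only log, forced to disk on every append."""

    def __init__(self, path):
        self.path = path

    def append(self, message):
        # One JSON record per line; flush and fsync so the record
        # survives a crash of the server process.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(message) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self):
        # Return every logged message in arrival order.
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


class Server:
    """Hypothetical server that logs each message before processing it."""

    def __init__(self, log, recover=None):
        self.log = log
        self.processed = []  # volatile state, lost when the process dies
        if recover is not None:
            # Programmer-supplied recovery procedure (an assumed hook).
            recover(self, log.replay())

    def receive(self, message):
        self.log.append(message)        # make the message stable first
        self.processed.append(message)  # then act on it
        return "ack"                    # acknowledge only after logging


def redo_all(server, messages):
    # Simplest possible recovery procedure: redo every logged message.
    for message in messages:
        server.processed.append(message)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "server.log")
    first = Server(StableMessageLog(path))
    first.receive({"id": 1, "op": "put"})
    first.receive({"id": 2, "op": "get"})
    # Simulate a crash and restart: a fresh process reads the same log.
    restarted = Server(StableMessageLog(path), recover=redo_all)
    print(len(restarted.processed))
```

Because the acknowledgement is returned only after the log write completes, a client in this sketch never has to resend a request the server has already acknowledged, which matches the abstract's point that clients should not be responsible for guaranteeing request completion at servers.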
Pages: 405-419
Page count: 15
Related papers
26 entries in total
[1]   FAULT-TOLERANT ATOMIC COMPUTATIONS IN AN OBJECT-BASED DISTRIBUTED SYSTEM [J].
AHAMAD, M ;
DASGUPTA, P ;
LEBLANC, RJ .
DISTRIBUTED COMPUTING, 1990, 4 (02) :69-80
[2]  
[Anonymous], P 15 ACM S OP SYST P
[3]  
[Anonymous], 1990, 1144 RFC
[4]  
Arnold Ken., 1996, The Java Programming Language
[5]  
Avizienis Algirdas, 1989, P IFIP C 89, P491
[6]   RELIABLE COMMUNICATION IN THE PRESENCE OF FAILURES [J].
BIRMAN, KP ;
JOSEPH, TA .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1987, 5 (01) :47-76
[7]   IMPLEMENTING REMOTE PROCEDURE CALLS [J].
BIRRELL, AD ;
NELSON, BJ .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1984, 2 (01) :39-59
[8]  
Card R., 1994, PROCEEDINGS 1 DUTCH, P5
[9]  
DEMERS A, 1994, WORKSH MOB COMP SYST, P2
[10]  
DOUGLIS F, 1994, 1 S OP SYST DES IMPL, P25