Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining

Cited by: 228
Authors
Van Renesse, R [1 ]
Birman, KP [1 ]
Vogels, W [1 ]
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
Source
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 2003 / Volume 21 / Issue 02
Keywords
algorithms; design; management; performance; reliability; security
DOI
10.1145/762483.762485
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Scalable management and self-organizational capabilities are emerging as central requirements for a generation of large-scale, highly dynamic, distributed applications. We have developed an entirely new distributed information management system called Astrolabe. Astrolabe collects large-scale system state, permitting rapid updates and providing on-the-fly attribute aggregation. This latter capability permits an application to locate a resource, and also offers a scalable way to track system state as it evolves over time. The combination of features makes it possible to solve a wide variety of management and self-configuration problems. This paper describes the design of the system with a focus upon its scalability. After describing the Astrolabe service, we present examples of the use of Astrolabe for locating resources, publish-subscribe, and distributed synchronization in large systems. Astrolabe is implemented using a peer-to-peer protocol, and uses a restricted form of mobile code based on the SQL query language for aggregation. This protocol gives rise to a novel consistency model. Astrolabe addresses several security considerations using a built-in PKI. The scalability of the system is evaluated using both simulation and experiments; these confirm that Astrolabe could scale to thousands and perhaps millions of nodes, with information propagation delays in the tens of seconds.
Pages: 164-206
Page count: 43
Related papers
33 in total
[1]  
ADJIEWINOTO W, 1999, P 17 ACM S OP SYST P
[2]  
AGUILERA M, 1999, P 18 ACM S PRINC DIS
[3]   An investigation of factors affecting how engineers and scientists seek information [J].
Anderson, CJ ;
Glassman, M ;
McAfee, RB ;
Pinelli, T .
JOURNAL OF ENGINEERING AND TECHNOLOGY MANAGEMENT, 2001, 18 (02) :131-155
[4]  
[Anonymous], 2001, UCBCSD011141
[5]  
BALAZINSKA M, 2002, LNCS, V2414, P195
[6]  
Birman K., 1987, P 11 ACM S OP SYST P, P123
[7]   Bimodal multicast [J].
Birman, KP ;
Hayden, M ;
Ozkasap, O ;
Xiao, Z ;
Budiu, M ;
Minsky, Y .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1999, 17 (02) :41-88
[8]   GRAPEVINE - AN EXERCISE IN DISTRIBUTED COMPUTING [J].
BIRRELL, AD ;
LEVIN, R ;
NEEDHAM, RM ;
SCHROEDER, MD .
COMMUNICATIONS OF THE ACM, 1982, 25 (04) :260-274
[9]   SPACE/TIME TRADE/OFFS IN HASH CODING WITH ALLOWABLE ERRORS [J].
BLOOM, BH .
COMMUNICATIONS OF THE ACM, 1970, 13 (07) :422-&
[10]  
Bonnet P., 2001, P 2 INT C MOB DAT MA