A two-level directory architecture for highly scalable cc-NUMA multiprocessors

Cited by: 27
Authors
Acacio, ME
González, J
García, JM
Duato, J
Affiliations
[1] Univ Murcia, Dept Ingn & Tecnol Comp, Fac Informat, E-30071 Murcia, Spain
[2] Intel Labs Barcelona, Intel Barcelona Res Ctr, Barcelona 08034, Spain
[3] Univ Politecn Valencia, Dept Informat Sistemas & Comp, Valencia 46010, Spain
Keywords
scalability; directory memory overhead; two-level directory architecture; compressed sharing codes; unnecessary coherence messages; cc-NUMA multiprocessor;
DOI
10.1109/TPDS.2005.4
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
One important issue the designer of a scalable shared-memory multiprocessor must deal with is the amount of extra memory required to store the directory information. It is desirable that the directory memory overhead be kept as low as possible, and that it scales very slowly with the size of the machine. Unfortunately, current directory architectures provide scalability at the expense of performance. This work presents a scalable directory architecture that significantly reduces the size of the directory for large-scale configurations of a multiprocessor without degrading performance. First, we propose multilayer clustering as an effective approach to reduce the width of directory entries. Based on this concept, we derive three new compressed sharing codes, some of them with a space complexity of O(log_2(log_2(N))) for an N-node system. Then, we present a novel two-level directory architecture to eliminate the penalty caused by compressed directories in general. The proposed organization consists of a small full-map first-level directory (which provides precise information for the most recently referenced lines) and a compressed second-level directory (which provides in-excess information for all the lines). The proposals are evaluated based on extensive execution-driven simulations (using RSIM) of a 64-node cc-NUMA multiprocessor. Results demonstrate that a system with a two-level directory architecture achieves the same performance as a multiprocessor with a big and nonscalable full-map directory, with a very significant reduction of the memory overhead.
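The organization described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class `TwoLevelDirectory`, the helpers `tree_level`, `decode`, and `code_bits`, the LRU replacement policy, and the specific compressed code (storing only the level of the smallest aligned power-of-two cluster that contains the home node and all sharers) are assumptions chosen for illustration. The abstract does not specify the three sharing codes it proposes. The sketch only shows why such a code needs on the order of log_2(log_2(N)) bits, and how a precise first level avoids the unnecessary coherence messages that a compressed second level would otherwise cause.

```python
def code_bits(n_nodes: int) -> int:
    """Bits needed to store a cluster level in 0..log2(n_nodes)."""
    levels = (n_nodes - 1).bit_length()  # log2(N) when N is a power of two
    return levels.bit_length()           # bits to encode the largest level


def tree_level(home: int, sharers: set[int]) -> int:
    """Smallest level L such that the aligned cluster of 2**L nodes
    containing `home` also contains every sharer (assumed code)."""
    level = 0
    for s in sharers:
        # Number of low-order bits in which home and s may differ.
        level = max(level, (home ^ s).bit_length())
    return level


def decode(home: int, level: int) -> set[int]:
    """Expand a stored level into the (possibly in-excess) sharer set."""
    base = (home >> level) << level
    return set(range(base, base + (1 << level)))


class TwoLevelDirectory:
    """Hypothetical model: small precise L1 plus compressed L2 for all lines."""

    def __init__(self, home: int, l1_entries: int):
        self.home = home
        self.l1_entries = l1_entries
        self.l1: dict[int, set[int]] = {}  # line -> exact sharers (LRU order)
        self.l2: dict[int, int] = {}       # line -> cluster level only

    def record(self, line: int, sharers: set[int]) -> None:
        """Store the sharer set for `line` in both levels."""
        self.l1.pop(line, None)
        self.l1[line] = set(sharers)
        if len(self.l1) > self.l1_entries:
            del self.l1[next(iter(self.l1))]  # evict least recently recorded
        self.l2[line] = tree_level(self.home, sharers)

    def invalidation_targets(self, line: int) -> set[int]:
        """Precise set on an L1 hit, in-excess superset on an L1 miss."""
        if line in self.l1:
            return set(self.l1[line])
        return decode(self.home, self.l2[line])


if __name__ == "__main__":
    d = TwoLevelDirectory(home=0, l1_entries=1)
    d.record(10, {1, 5})
    d.record(11, {2})  # evicts line 10 from the first level
    print(code_bits(64), "bits per compressed entry vs 64 for full-map")
    print("line 11 (L1 hit): ", sorted(d.invalidation_targets(11)))
    print("line 10 (L1 miss):", sorted(d.invalidation_targets(10)))
```

Under these assumptions, a 64-node system needs 3 bits per compressed entry instead of 64 for a full bit vector. Line 11 is still in the first level, so only node 2 would be invalidated, whereas line 10 falls back to the second level and would target all of nodes 0 to 7 although only nodes 1 and 5 hold copies.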
Pages: 67-79
Number of pages: 13
Related papers
34 in total
[1] A new scalable directory architecture for large-scale multiprocessors [J]. Acacio, ME; González, J; García, JM; Duato, J. HPCA: SEVENTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTING ARCHITECTURE, PROCEEDINGS, 2001: 97-106
[2] AGARWAL A, 1995, ACM COMP AR, P2, DOI 10.1109/ISCA.1995.524544
[3] AGARWAL A, 1988, P 15 INT S COMP ARCH, P280
[4] Barroso LA, 2000, PROCEEDING OF THE 27TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, P282, DOI [10.1109/ISCA.2000.854398, 10.1145/342001.339696]
[5] CENSIER LM, 1978, IEEE T COMPUT, V27, P1112
[6] CHAIKEN D, 1991, P 4 INT C ARCH SUPP, P224
[7] An efficient tree cache coherence protocol for distributed shared memory multiprocessors [J]. Chang, YK; Bhuyan, LN. IEEE TRANSACTIONS ON COMPUTERS, 1999, 48 (03): 352-360
[8] Choi JH, 1999, IPPS PROC, P258, DOI 10.1109/IPPS.1999.760473
[9] *CONV COMP CORP, 1993, DHW014 CONV COMP COR
[10] Culler D. E., 1993, Proceedings SUPERCOMPUTING '93, P262