Design alternatives for virtual interface architecture and an implementation on IBM netfinity NT cluster

Cited by: 4

Authors:

Banikazemi, M [1]

Abali, B

Herger, L

Panda, DK

Affiliations:

[1] Ohio State Univ, Dept Comp & Informat Sci, Columbus, OH 43210 USA

[2] IBM Corp, TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA

Source:

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING | 2001 / Vol. 61 / No. 11

Keywords:

cluster computing; virtual interface architecture; performance evaluation; scalable architecture;

DOI:

10.1006/jpdc.2001.1745

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

The Virtual Interface Architecture (VIA) specification has been developed to standardize user-level network interfaces that provide low-latency, high-bandwidth communications. Few hardware and software implementations of VIA exist. Since the VIA specification is flexible, different choices exist for implementing various components of VIA such as doorbells, address translation methods, and completion queues. Although previous studies have evaluated the overall performance of different VIA implementations, there has not been a comparative study on the performance of VIA components. In this paper, we evaluate and compare the performance of different implementations of essential VIA components. We discuss the pros and cons of each design approach and describe the required support for implementing each of them. Then, we discuss an experimental implementation of the Virtual Interface Architecture for the IBM SP Switch-Connected NT cluster, one of the newest clustering platforms available. We discuss different design issues involved in this implementation. In particular, we explain how the virtual-to-physical address translation is implemented efficiently with a minimum Network Interface Card (NIC) memory requirement. We show how caching the VIA descriptors on the NIC can reduce the communication latency. We also present an efficient scheme for implementing the VIA doorbells without any hardware support. We provide a comprehensive performance evaluation study and discuss the impact of several hardware improvements on the performance of our implementation. The performance of the implemented VIA surpasses that of other existing software implementations of the VIA and is comparable to that of a hardware VIA implementation. The peak measured bandwidth for our system is 101.4 MBytes/s and the one-way latency for short messages is 18.2 μs. © 2001 Academic Press.

Pages: 1512-1545

Page count: 34
