A system for evaluating performance and cost of SIMD array designs

Cited by: 8
Authors
Herbordt, MC [1]
Cravy, J [1]
Sam, R [1]
Kidwai, O [1]
Lin, C [1]
Affiliations
[1] Univ Houston, Dept Elect & Comp Engn, Houston, TX 77204 USA
Funding
National Science Foundation (USA)
Keywords
SIMD architecture; parallel computer architecture; computer design evaluation; computer simulation; domain specific systems; virtual prototyping
DOI
10.1006/jpdc.1999.1602
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
SIMD arrays are likely to become increasingly important as coprocessors in domain specific systems as architects continue to leverage RAM technology in their design. The problem this work addresses is the efficient evaluation of SIMD arrays with respect to complex applications while accounting for operating frequency and chip area. The underlying issues include the size of the architecture space, the lack of portability of the test programs, and the inherent complexity of simulating up to hundreds of thousands of processing elements. The overall method we use is to combine architecture level and Electronic Design Automation (EDA) level modeling by using an EDA-based tool to calibrate architectural simulations. The resulting system retains much of the high throughput of the architecture level simulator, but it also has accuracy similar to that of an early pass EDA synthesis and circuit simulation. The particular problem of computational cost of the architectural level simulation is addressed with a novel approach to trace-based simulation (we call it trace compilation), which we find to be one to two orders of magnitude faster than instruction level simulation while still retaining much of the accuracy of the model. Furthermore, traces must be generated for only a small fraction of the possible parameter combinations. Using trace compilation also addresses program portability by allowing the user to code in a single data parallel language with a single compiler, regardless of the target architecture. We have used our system to evaluate thousands of potential SIMD array designs with respect to real applications and present some sample results. © 2000 Academic Press.
Pages: 217-246
Page count: 30