GAARP: A power-aware GALS architecture for real-time algorithm-specific tasks

Cited by: 7
Authors
Bhunia, S [1]
Datta, A [1]
Banerjee, N [1]
Roy, K [1]
Affiliation
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47906 USA
Keywords
asynchronous/synchronous operations; algorithms implemented in hardware; fault tolerance; energy-aware systems
DOI
10.1109/TC.2005.99
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Reducing the energy consumption of a real-time system has emerged as an important design concern. In this paper, we propose GAARP, an adaptive scalable architecture targeted toward algorithm-specific tasks for just-in-time performance using the right amount of power. The architecture consists of Globally Asynchronous and Locally Synchronous (GALS) building blocks, where the processing hardware is realized by a set of smaller slices of similar structure, each running synchronously with independent clocks. We demonstrate that, for different real-time commercial applications with algorithm-specific jobs like online transaction processing, digital filtering, Fourier transform, etc., the proposed architecture allows dynamic load-balancing and adaptive intertask voltage scaling based on the load in each of the processing units. Compared to a synchronous implementation of the same functionality, we show that the proposed hardware can achieve higher efficiency in terms of power and performance by exploiting the flexibility to balance the load and change the supply voltage. The architecture also lends itself to process tolerance since it can detect process-shifts for the individual processing units and determine the appropriate operating voltage/frequency for each unit. Simulation results for two representative applications show that, for a modest system configuration and random job distribution, we obtain up to 67 percent improvement in MOPS/W (millions of operations per second per watt) over a fully synchronous implementation.
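The abstract's idea of balancing load across slices and then running each slice at the lowest voltage that still meets its deadline can be illustrated with a minimal sketch. This is not the paper's algorithm: the operating-point table, the function names, the one-operation-per-cycle assumption, and the normalized V^2 energy model are all assumptions made here for illustration.

```python
# Hypothetical operating points: (supply voltage in volts, max clock in MHz),
# sorted by ascending voltage. Illustrative numbers, not taken from the paper.
OPERATING_POINTS = [(0.8, 100.0), (1.0, 200.0), (1.2, 300.0)]


def balance_load(total_ops, n_slices):
    """Split total_ops as evenly as possible across n_slices."""
    base, extra = divmod(total_ops, n_slices)
    return [base + (1 if i < extra else 0) for i in range(n_slices)]


def pick_operating_point(pending_ops, deadline_us):
    """Return the lowest-voltage point whose clock finishes the work in time.

    Assumes one operation per clock cycle, so cycles per microsecond
    equals the required frequency in MHz.
    """
    required_mhz = pending_ops / deadline_us
    for voltage, max_mhz in OPERATING_POINTS:
        if max_mhz >= required_mhz:
            return voltage, max_mhz
    raise ValueError("deadline cannot be met at any operating point")


def relative_dynamic_energy(ops, voltage):
    """Dynamic switching energy, taken as ops * V^2 with capacitance normalized to 1."""
    return ops * voltage ** 2


if __name__ == "__main__":
    deadline_us = 10.0
    # Balanced case: 3000 operations spread over four slices.
    balanced = balance_load(3000, 4)
    balanced_energy = sum(
        relative_dynamic_energy(ops, pick_operating_point(ops, deadline_us)[0])
        for ops in balanced
    )
    # Unbalanced case: the same work forced onto a single slice.
    single_voltage, _ = pick_operating_point(3000, deadline_us)
    single_energy = relative_dynamic_energy(3000, single_voltage)
    print(balanced, balanced_energy, single_energy)
```

Under these assumed numbers, spreading the work lets every slice meet the deadline at the lowest voltage, while the single-slice case needs the highest one, so the balanced configuration spends less switching energy for the same operation count.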
Pages: 752-766
Page count: 15