Series approximation methods for divide and square root in the Power3™ processor

Cited by: 18
Authors
Agarwal, RC [1]
Gustavson, FG [1]
Schmookler, MS [1]
Affiliations
[1] IBM Corp, Div Res, Yorktown, NY USA
Source
14th IEEE Symposium on Computer Arithmetic, Proceedings | 1999
DOI
10.1109/ARITH.1999.762836
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The Power3 processor is a 64-bit implementation of the PowerPC™ architecture and is the successor to the Power2™ processor for workstations and servers that require high-performance floating-point capability. The previous processors used Newton-Raphson algorithms for their implementations of divide and square root. The Power3 processor has a longer pipeline latency, which would substantially increase the latency of these instructions. Instead, new algorithms based on power series approximations were developed, which provide significantly better performance than the Newton-Raphson algorithm on this processor. This paper describes the algorithms and then shows how both the series-based algorithms and the Newton-Raphson algorithms are affected by pipeline length. For the Power3, the power series algorithms reduce the divide latency by over 20% and the square root latency by 35%.
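As a rough illustration of the two families of methods the abstract contrasts, the sketch below approximates a reciprocal 1/b from an initial estimate y0 in two ways. It is not the algorithm from the paper: the function names `reciprocal_newton` and `reciprocal_series`, the seed value, and the Horner-style evaluation are all assumptions chosen for clarity. With e = 1 - b*y0, the identity 1/b = y0 / (1 - e) = y0 * (1 + e + e^2 + ...) gives the series form, while each Newton-Raphson step squares the remaining error.

```python
def reciprocal_newton(b, y0, iterations):
    """Newton-Raphson refinement of y ~ 1/b (illustrative sketch only).

    Each step computes y <- y * (2 - b*y) and needs the result of the
    previous step before it can start.
    """
    y = y0
    for _ in range(iterations):
        y = y * (2.0 - b * y)
    return y


def reciprocal_series(b, y0, terms):
    """Truncated power series for 1/b around the seed y0 (illustrative sketch only).

    With e = 1 - b*y0, 1/b = y0 * (1 + e + e^2 + ...). The sum is cut off
    after `terms` terms, leaving a relative error of e**terms.
    """
    e = 1.0 - b * y0
    s = 1.0
    # Horner-style evaluation of 1 + e + e^2 + ... + e^(terms - 1).
    for _ in range(terms - 1):
        s = 1.0 + e * s
    return y0 * s


if __name__ == "__main__":
    b, y0 = 3.0, 0.3  # hypothetical operand and seed; here e = 0.1
    print(reciprocal_newton(b, y0, 2))
    print(reciprocal_series(b, y0, 4))
```

Two Newton-Raphson steps and a four-term series are the same approximant algebraically, since y0 * (1 + e) * (1 + e^2) = y0 * (1 + e + e^2 + e^3). The two methods differ in how the operations are ordered and how they depend on one another, which is where pipeline length enters.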
Pages: 116-123
Page count: 8