Parallel computing experiences with CUDA

Cited by: 330
Authors
Garland, Michael [1]
Le Grand, Scott [1]
Nickolls, John [1]
Anderson, Joshua [2]
Hardwick, Jim [3]
Morton, Scott
Phillips, Everett
Zhang, Yao [4]
Volkov, Vasily [5]
Affiliations
[1] NVIDIA, Santa Clara, CA 95050 USA
[2] Iowa State Univ, Ames, IA 50011 USA
[3] TechniScan Med Syst, Salt Lake City, UT USA
[4] Univ Calif Davis, Dept Elect & Comp Engn, Davis, CA 95616 USA
[5] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
DOI
10.1109/MM.2008.57
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The CUDA programming model provides a straightforward means of describing inherently parallel computations, and NVIDIA's Tesla GPU architecture delivers high computational throughput on massively parallel problems. This article surveys experiences gained in applying CUDA to a diverse set of problems and the parallel speedups over sequential codes running on traditional CPU architectures attained by executing key computations on the GPU.
Pages: 13-27
Page count: 15
Related papers
22 in total
[1]  
ANDERSON JA, 2008, J CHEM PHYS, V128
[2]   General purpose molecular dynamics simulations fully implemented on graphics processing units [J].
Anderson, Joshua A. ;
Lorenz, Chris D. ;
Travesset, A. .
JOURNAL OF COMPUTATIONAL PHYSICS, 2008, 227 (10) :5342-5359
[3]  
BRANDVIK T, 2008, P 48 AIAA AER SCI M, P607
[4]  
Catanzaro B, 2008, P 25 INT C MACH LEAR, P104, DOI 10.1145/1390156.1390170
[5]  
Frenkel D., 2000, Computational Science Series
[6]  
HE B, 2008, P ACM SIGMOD I008
[7]  
Johnson S. A., 2003, U.S. patent, Patent No. [6,636,584, 6636584]
[8]  
KASS M, 2006, 0601 PIX AN STUD
[9]   NVIDIA Tesla: A unified graphics and computing architecture [J].
Lindholm, Erik ;
Nickolls, John ;
Oberman, Stuart ;
Montrym, John .
IEEE MICRO, 2008, 28 (02) :39-55
[10]   CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment [J].
Manavski, Svetlin A. ;
Valle, Giorgio .
BMC BIOINFORMATICS, 2008, 9 (Suppl 2)