Performance evaluation of image processing algorithms on the GPU

Cited by: 75
Authors
Castano-Diez, Daniel [1]
Moser, Dominik [1]
Schoenegger, Andreas [1]
Pruggnaller, Sabine [1]
Frangakis, Achilleas S. [1]
Affiliations
[1] European Molecular Biology Laboratory (EMBL), D-69117 Heidelberg, Germany
Keywords
GPGPU; CUDA; electron tomography; high performance computing; image processing; multivariate statistical analysis; pattern recognition
DOI
10.1016/j.jsb.2008.07.006
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The graphics processing unit (GPU), which originally was used exclusively for visualization purposes, has evolved into an extremely powerful co-processor. Meanwhile, through the development of elaborate interfaces, the GPU can be used to process data and deal with computationally intensive applications. The speed-up factors attained compared to the central processing unit (CPU) depend on the particular application, as the GPU architecture gives the best performance for algorithms that exhibit high data parallelism and high arithmetic intensity. Here, we evaluate the performance of the GPU on a number of common algorithms used for three-dimensional image processing. The algorithms were developed on a new software platform called "CUDA", which allows a direct translation from C code to the GPU. The implemented algorithms include spatial transformations, real-space and Fourier operations, as well as pattern recognition procedures, reconstruction algorithms and classification procedures. In our implementation, the direct porting of C code to the GPU achieves typical acceleration values on the order of 10-20 times compared to a state-of-the-art conventional processor, but they vary depending on the type of algorithm. The gained speed-up comes at no additional cost, since the software runs on the GPU of the graphics card of common workstations. © 2008 Elsevier Inc. All rights reserved.
Pages: 153-160
Page count: 8