TY - JOUR
T1 - Fast multipole methods on a cluster of GPUs for the meshless simulation of turbulence
AU - Yokota, R.
AU - Narumi, T.
AU - Sakamaki, R.
AU - Kameoka, S.
AU - Obi, S.
AU - Yasuoka, K.
N1 - Funding Information:
This study was partially supported by the Core Research for Evolution Science and Technology (CREST) of the Japan Science and Technology Corporation (JST). We thank Dr. Hamada and Dr. Taiji for the fruitful discussions on GPU computing. The authors also thank the reviewers for their suggestive comments.
PY - 2009/11
Y1 - 2009/11
AB - Recent advances in the parallelizability of fast N-body algorithms and the programmability of graphics processing units (GPUs) have opened a new path for particle-based simulations. For the simulation of turbulence, vortex methods can now be considered an interesting alternative to finite difference and spectral methods. The present study focuses on the efficient implementation of the fast multipole method and pseudo-particle method on a cluster of NVIDIA GeForce 8800 GT GPUs, and applies this to a vortex method calculation of homogeneous isotropic turbulence. The results of the present vortex method agree quantitatively with those of the reference calculation using a spectral method. We achieved a maximum speed of 7.48 TFlops using 64 GPUs, and the cost performance was near $9.4/GFlops. The calculation of the present vortex method on 64 GPUs took 4120 s, while the spectral method on 32 CPUs took 4910 s.
KW - Fast multipole method
KW - Graphics processing unit
KW - Particle method
KW - Pseudo-particle method
UR - http://www.scopus.com/inward/record.url?scp=70149090358&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70149090358&partnerID=8YFLogxK
U2 - 10.1016/j.cpc.2009.06.009
DO - 10.1016/j.cpc.2009.06.009
M3 - Article
AN - SCOPUS:70149090358
SN - 0010-4655
VL - 180
SP - 2066
EP - 2078
JO - Computer Physics Communications
JF - Computer Physics Communications
IS - 11
ER -