SSE Vectorized and GPU Implementations of Arakawa's Formula for Numerical Integration of Equations of Fluid Motion
Evren Yurtesen, Matti Ropo, Mats Aspnäs, Jan Westerholm, SSE Vectorized and GPU Implementations of Arakawa's Formula for Numerical Integration of Equations of Fluid Motion. In: Yiannis Cotronis, Marco Danelutto, George Papadopoulos (Eds.), Euromicro International Conference on Parallel, Distributed and Network-Based Computing, 341–348, IEEE Computer Society, 2011.
Abstract:
The numerical method presented by Arakawa in 1966 implements a finite difference scheme of the Jacobian for solving the equation of motion for two-dimensional incompressible flows. The scheme diminishes nonlinear computational instability and permits long-term numerical integrations. This paper presents an efficient implementation of Arakawa’s formula using vectorized Streaming SIMD Extensions (SSE) and Advanced Vector Extensions (AVX) instructions. Additionally, we have improved the performance of memory access in the code. Performance measurements show that the vectorized implementation is close to twice as efficient as an implementation without SSE. The AVX version will in the near future further improve the vectorized performance by an estimated factor of up to 1.8. Finally, we compare our results to an implementation on a general-purpose graphics processing unit (GPGPU) and to auto-vectorization by two compilers.
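For readers unfamiliar with the formula named in the abstract, the following is a minimal scalar sketch in C of Arakawa's nine-point Jacobian J(zeta, psi), taken as the average of the three discretizations usually written J++, J+x and Jx+. It is not the authors' code. The function names `arakawa_point` and `arakawa_jacobian`, the helper `at`, the row-major grid layout, and the uniform grid spacing `h` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: row-major nx-by-ny grid, i = x index, j = y index. */
static double at(const double *f, size_t ny, size_t i, size_t j)
{
    return f[i * ny + j];
}

/* Arakawa Jacobian at one interior point (1 <= i <= nx-2, 1 <= j <= ny-2). */
double arakawa_point(const double *zeta, const double *psi,
                     size_t ny, size_t i, size_t j, double h)
{
    /* J++: centered differences of both fields. */
    double jpp =
        (at(zeta, ny, i + 1, j) - at(zeta, ny, i - 1, j)) *
        (at(psi, ny, i, j + 1) - at(psi, ny, i, j - 1)) -
        (at(zeta, ny, i, j + 1) - at(zeta, ny, i, j - 1)) *
        (at(psi, ny, i + 1, j) - at(psi, ny, i - 1, j));

    /* J+x: zeta at the edge neighbours, psi differences at the corners. */
    double jpx =
        at(zeta, ny, i + 1, j) * (at(psi, ny, i + 1, j + 1) - at(psi, ny, i + 1, j - 1)) -
        at(zeta, ny, i - 1, j) * (at(psi, ny, i - 1, j + 1) - at(psi, ny, i - 1, j - 1)) -
        at(zeta, ny, i, j + 1) * (at(psi, ny, i + 1, j + 1) - at(psi, ny, i - 1, j + 1)) +
        at(zeta, ny, i, j - 1) * (at(psi, ny, i + 1, j - 1) - at(psi, ny, i - 1, j - 1));

    /* Jx+: zeta at the corners, psi differences between edge neighbours. */
    double jxp =
        at(zeta, ny, i + 1, j + 1) * (at(psi, ny, i, j + 1) - at(psi, ny, i + 1, j)) -
        at(zeta, ny, i - 1, j - 1) * (at(psi, ny, i - 1, j) - at(psi, ny, i, j - 1)) -
        at(zeta, ny, i - 1, j + 1) * (at(psi, ny, i, j + 1) - at(psi, ny, i - 1, j)) +
        at(zeta, ny, i + 1, j - 1) * (at(psi, ny, i + 1, j) - at(psi, ny, i, j - 1));

    /* Each term carries a factor 1/(4 h^2); averaging the three gives 1/(12 h^2). */
    return (jpp + jpx + jxp) / (12.0 * h * h);
}

/* Fill jac at all interior points; boundary entries are left untouched. */
void arakawa_jacobian(const double *zeta, const double *psi, double *jac,
                      size_t nx, size_t ny, double h)
{
    assert(nx >= 3 && ny >= 3 && h > 0.0);
    for (size_t i = 1; i + 1 < nx; i++) {
        /* The inner loop walks contiguous memory in this layout. */
        for (size_t j = 1; j + 1 < ny; j++) {
            jac[i * ny + j] = arakawa_point(zeta, psi, ny, i, j, h);
        }
    }
}
```

A caller would allocate three arrays of `nx * ny` doubles, fill `zeta` and `psi`, and call `arakawa_jacobian(zeta, psi, jac, nx, ny, h)`. In this assumed layout the inner loop over `j` reads neighbouring values from three adjacent rows at unit stride, which is the kind of loop that hand-written SSE or AVX code, or a compiler's auto-vectorizer, would typically target.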
BibTeX entry:
@INPROCEEDINGS{inpYuRoAsWe11a,
title = {SSE Vectorized and GPU Implementations of Arakawa's Formula for Numerical Integration of Equations of Fluid Motion},
booktitle = {Euromicro International Conference on Parallel, Distributed and Network-Based Computing},
author = {Yurtesen, Evren and Ropo, Matti and Aspnäs, Mats and Westerholm, Jan},
editor = {Cotronis, Yiannis and Danelutto, Marco and Papadopoulos, George},
publisher = {IEEE Computer Society},
pages = {341--348},
year = {2011},
keywords = {SSE, AVX, numerical integration, vectorization, GPGPU},
}
Belongs to TUCS Research Unit(s): High Performance Computing and Communication
Publication Forum rating of this publication: level 1