
Auto-tuned Krylov methods on cluster of graphics processing unit
Authors:Frédéric Magoulès  Abal-Kassim Cheik Ahamed  Roman Putanowicz
Affiliation:1. Ecole Centrale Paris, Paris, France;2. Institute for Computational Civil Engineering, Cracow University of Technology, Cracow, Poland
Abstract:Exascale computers are expected to have highly hierarchical architectures, with nodes composed of multicore processors (CPU; central processing unit) and accelerators (GPU; graphics processing unit). The different programming levels raise difficult new algorithmic issues. In particular, when solving extremely large linear systems, new programming paradigms for Krylov methods should be defined and evaluated against the current state of the art in scientific methods. Iterative Krylov methods involve linear algebra operations such as dot product, norm, addition of vectors and sparse matrix–vector multiplication. These operations are computationally expensive for large matrices. In this paper, we focus on the best way to perform these operations effectively, in double precision, on GPU, in order to make iterative Krylov methods more robust and therefore reduce the computing time. The performance of our algorithms is evaluated on several matrices arising from engineering problems. Numerical experiments illustrate the robustness and accuracy of our implementation compared to existing libraries. We deal with different preconditioned Krylov methods: Conjugate Gradient for symmetric positive-definite matrices, and Generalized Conjugate Residual, Bi-Conjugate Gradient Conjugate Residual, transpose-free Quasi Minimal Residual, Stabilized BiConjugate Gradient and Stabilized BiConjugate Gradient (L) for the solution of sparse linear systems with nonsymmetric matrices. We consider and compare several compressed sparse formats, and propose a way to implement Krylov methods effectively on GPU and on multicore CPU. Finally, we give strategies for obtaining faster algorithms by auto-tuning the threading design according to the problem characteristics and the hardware. In conclusion, we propose and analyse hybrid sub-structuring methods that should pave the way to exascale hybrid methods.
Keywords:Krylov methods  iterative methods  linear algebra  sparse matrix–vector product  GPU  CUDA  auto-tuning  compressed sparse row (CSR) format  ELLPACK (ELL) format  hybrid (HYB) format  coordinate (COO) format  Cusp  CUSPARSE  CUBLAS
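
The abstract names the dot product, vector addition and the sparse matrix–vector product as the kernels that dominate the cost of a Krylov iteration, and lists compressed sparse row among the storage formats compared. As a rough illustration only, the sketch below shows how these kernels fit together in an unpreconditioned Conjugate Gradient loop on a CSR matrix. It is plain single-threaded Python and not the authors' GPU implementation; the function names (`csr_spmv`, `dot`, `conjugate_gradient`), the tolerance and the iteration cap are all assumptions made for this example.

```python
from math import sqrt


def csr_spmv(values, col_idx, row_ptr, x):
    """Return y = A x for a matrix A stored in compressed sparse row form.

    values  : nonzero entries, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def dot(u, v):
    """Dot product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Unpreconditioned CG for a symmetric positive-definite CSR matrix.

    Hypothetical helper for illustration: starts from x = 0 and stops when
    the residual 2-norm drops to `tol` or after `max_iter` iterations.
    """
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0
    p = list(r)          # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if sqrt(rs_old) <= tol:
            break
        ap = csr_spmv(values, col_idx, row_ptr, p)   # one SpMV per iteration
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

For example, the matrix with rows (4, 1, 0), (1, 3, 0), (0, 0, 2) is stored as `values = [4, 1, 1, 3, 2]`, `col_idx = [0, 1, 0, 1, 2]`, `row_ptr = [0, 2, 4, 5]`. Each iteration performs one sparse matrix–vector product, two dot products and three vector updates, which is why the abstract concentrates on making exactly these operations fast on the GPU.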