Parallel iterative refinement linear least squares solvers based on all-reduce operations
Affiliation: 1. Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States; 2. UltraScale Systems Research Center, Los Alamos National Laboratory, Los Alamos, NM 87544, United States; 3. Applied Computer Science Group, Los Alamos National Laboratory, Los Alamos, NM 87544, United States; 1. Illinois Institute of Technology, Chicago, IL, USA; 2. Argonne National Laboratory, Argonne, IL, USA; 3. Northern Illinois University, DeKalb, IL, USA
Abstract: We present the novel parallel linear least squares solvers ARPLS-IR and ARPLS-MPIR for dense overdetermined linear systems. All internode communication of our ARPLS solvers arises in the context of all-reduce operations across the parallel system and therefore they benefit directly from efficient implementations of such operations. Our approach is based on the semi-normal equations, which are in general not backward stable. However, the method is stabilised by using iterative refinement. We show that performing iterative refinement in mixed precision also increases the parallel performance of the algorithm. We consider different variants of the ARPLS algorithm depending on the conditioning of the problem and in this context also evaluate the method of normal equations with iterative refinement. For ill-conditioned systems, we demonstrate that the semi-normal equations with standard iterative refinement achieve the best accuracy compared to other parallel solvers. We discuss the conceptual advantages of ARPLS-IR and ARPLS-MPIR over alternative parallel approaches based on QR factorisation or the normal equations. Moreover, we analytically compare the communication cost to an approach based on communication-avoiding QR factorisation. Numerical experiments on a high performance cluster illustrate speed-ups up to 3820 on 2048 cores for ill-conditioned tall and skinny matrices over state-of-the-art solvers from DPLASMA or ScaLAPACK.
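The idea of semi-normal equations stabilised by iterative refinement can be illustrated with a minimal serial sketch. This is not the ARPLS implementation: the function name `seminormal_ir`, the row-block splitting, and the use of `numpy.linalg.qr` to obtain the triangular factor R are assumptions made for illustration only. The summation over row blocks stands in for the all-reduce that a distributed code would perform, and all arithmetic here is in double precision (no mixed precision).

```python
import numpy as np

def seminormal_ir(A, b, n_blocks=4, max_iter=5, tol=1e-14):
    """Solve min ||Ax - b||_2 via R^T R x = A^T b plus iterative refinement.

    Hypothetical illustration; names and parameters are not from the paper.
    """
    # Only the triangular factor R is needed (Q is never formed or stored).
    R = np.linalg.qr(A, mode="r")

    # Emulate a row-wise distribution of A over n_blocks "processes".
    blocks = np.array_split(np.arange(A.shape[0]), n_blocks)

    def allreduce_At(v):
        # Each block contributes A_i^T v_i; the sum plays the role of an all-reduce.
        return sum(A[idx].T @ v[idx] for idx in blocks)

    def solve_sne(rhs):
        # Solve R^T R z = rhs with two solves against the triangular factor.
        y = np.linalg.solve(R.T, rhs)
        return np.linalg.solve(R, y)

    # Initial solution from the semi-normal equations.
    x = solve_sne(allreduce_At(b))

    # Iterative refinement: correct x using the current residual.
    for _ in range(max_iter):
        r = b - A @ x
        dx = solve_sne(allreduce_At(r))
        x = x + dx
        if np.linalg.norm(dx) <= tol * np.linalg.norm(x):
            break
    return x
```

In this sketch the only quantities that would cross process boundaries are small vectors of length n (the number of columns), which is why a tall and skinny matrix suits an all-reduce based formulation.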
Keywords:
This article is indexed in ScienceDirect and other databases.