首页 | 本学科首页   官方微博 | 高级检索  
     

一种改进的适合并行计算的TFQMR算法
引用本文:刘杰,迟利华,胡庆丰,李晓梅.一种改进的适合并行计算的TFQMR算法[J].计算机研究与发展,2005,42(7):1235-1240.
作者姓名:刘杰  迟利华  胡庆丰  李晓梅
作者单位:1. 国防科学技术大学计算机学院,长沙,410073
2. 装备指挥技术学院,北京,101416
基金项目:国家自然科学基金项目(40245023),计算物理国家重点实验室基金项目(51479040103KG0201)
摘    要:TFQMR算法是一种Krylov子空间算法,常用来求解大型稀疏线性方程组.通过改变TFQMR算法的计算次序,提出了一种改进的TFQMR(ITFQMR)算法.对比TFQMR算法,ITFQMR算法的数值稳定性和TFQMR算法相同,几乎没有增加计算量,但考虑了在MIMD并行机上实现时并行算法的性能,其同步开销减少为TFQMR算法的一半,并且所有内积计算以及矩阵向量乘是独立的,没有数据相关性,可以进行计算与通信的重叠.从理论和实验两个角度来讨论ITFQMR算法的性能,当处理机台数较多时,ITFQMR算法的计算速度快于TFQMR算法.实验说明了在有64台处理机机群上进行,最快的并行ITFQMR算法的计算速度大约比TFQMR算法快20%.

关 键 词:TFQMR算法  同步开销  并行计算  机群  大型稀疏线性方程组

An Improved TFQMR Algorithm for Large Linear Systems Suited to Parallel Computing
Liu Jie,Chi Lihua,Hu Qingfeng,Li Xiaomei.An Improved TFQMR Algorithm for Large Linear Systems Suited to Parallel Computing[J].Journal of Computer Research and Development,2005,42(7):1235-1240.
Authors:Liu Jie  Chi Lihua  Hu Qingfeng  Li Xiaomei
Affiliation:Liu Jie1,Chi Lihua1,Hu Qingfeng1,and Li Xiaomei2 1
Abstract:The transpose-free QMR(TFQMR) algorithm is a Krylov subspace algorithm that can be used to obtain fast solutions for linear systems with very large and very sparse coefficient matrices. In this paper, by changing the computation sequence in the TFQMR algorithm, an improved transpose-free QMR (ITFQMR) algorithm is proposed. The numerical stability of the ITFQMR algorithm is the same as TFQMR algorithm, but the synchronization overhead that represents the bottleneck of the parallel performance is effectively reduced by a factor of two. And all inner products of a single iteration step are independent and communication time required for inner product can be overlapped efficiently with computation time of vector updates. From the theoretical and experimental analysis it is found that the ITFQMR algorithm is faster than the TFQMR algorithm as the number of processors increases. The experiments performed on a 64-processor cluster indicate that the ITFQMR is approximately 20% faster than the TFQMR.
Keywords:transpose-free QMR algorithm  synchronization overhead  parallel computing  cluster  large sparse linear systems
本文献已被 CNKI 维普 万方数据 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号