首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 140 毫秒
1.
求解大型稀疏线性代数方程组 AX=C (1)常采用的直接解法有二种,最普遍的是LU分解法(即Gauss消去法,或称非正交型消元法).为了提高精度,常采用部分主元素或全部主元素法.但是,这一措施对一个非零元有确定分布的方程组,每次分解而形成的三角阵L和U其稀疏结构却是不同的.因此,每次分解用于寻找主元以及修改阵中非零元连接表的额外工作量很大,严重地降低了求解速度.文的作者做过一些试验,其结果见表1.他们的试验是在111机上用BCY语言实现的.试验表明,按主容限法(为兼顾稀疏性而对全主元降低了要求的全主元素法)作LU分解,总工作量往往正比于n~q(n为方程组阶数),其中q>1.为了避免选主元  相似文献   

2.
秦忠国  姜弘道 《计算机工程》1998,24(3):21-22,40
ScaLapack是一个并行计算软件包,适用于分布存储的MIMD并行机。ScaLapack提供若干线性代数求解功能,具有高效,可 移植,可伸缩,高可靠性的优点。  相似文献   

3.
计算空气动力学的高阶面元法中,将原来位流升力面理论中求解积分方程的问题近似改成求解一组线性代数方程组。针对系数矩阵的特点,采用与所分网络块对应的数据分配方式,并用部分选主元的Gauss-Jordan算法求逆。分别在4台和8台Pentium166微机组成的并行虚拟机上运行。当矩阵阶达到2100时,并行效率分别为95.4%和91%。  相似文献   

4.
求解正则式方程式集合的面向矩阵高斯主元消去法   总被引:1,自引:0,他引:1  
本文在论述利用系数矩阵进行消元变换求解正则表达式方程式集合的高斯消去法的基础上,提出了一种选取系数矩阵中主元素进行消元变换求解正则表达式方程式集合的高斯主元素消去法,并给出易编程的算法。  相似文献   

5.
1.引言ScaLAPACK是村卫fableLinearAlgebraMage的缩写,是为在基于消息传递的MIMD并行计算机系统上解数值线性代数问题,并由美国橡树岭国家实验室和田纳西大学等联合开发.它支持对()线性代数方程组问题()最小二乘问题(3特征值问题(4)奇异值分解等问题的求解(参见文献[1—31).这些问题由于在科学与工程计算中经常出现,它们的高效求解成了应用程序获得高性能的关键.随着计算机的发展,相继开发了LinPack;EisPack;LAPACK和ScaLAPACK等数值软件包,利…  相似文献   

6.
高斯消去法,又称高斯消元法,实际上就是我们俗称的加减消元法。数学上,高斯消去法或称高斯-约当消去法,由高斯和约当得名(很多人将高斯消去作为完整的高斯-约当消去的前半部分),它是线性代数中的一个算法,用于决定线性方程组的解,决定矩阵的秩,以及决定可逆方矩阵的逆。当用于一个矩阵时,高斯消去产生行消去梯形形式。用高斯消去法求解线性方程组的解是一种比较常见的解线性方程组的方法,这种方法尤其在利用计算机求解线性方程组时是更是常用。但大多数情况下都是用串行的算法来解方程组,该文介绍了利用高斯消去法并行求解线性方程组的方法。  相似文献   

7.
UML在分布式系统中的应用与研究   总被引:5,自引:0,他引:5  
UML(Unified  Modeling  Language)是一种标准的、功能强大的建模语言。ISO RM-ODP(ISO开放分布式处理参考模型)为开放、灵活的分布式系统提供了主框架。文章提出了一种用UML为ODP系统建模的方法,它以ODP的概念和UML的符号为基础。  相似文献   

8.
本文介绍一个CIMS工程数据库的分布式在线事务处理控制系统DOTS,它是一个基于客户/服务器结构的、能与CIM全局协调的集成数据分布式在线事务处理控制系统的原型  相似文献   

9.
在阵列机或细胞结构向量机上用高斯消去法求解线性代数方程组的基本操作是进行大量的行向量变换。若并行处理机台数S远超过方程组的阶数N,则因在行变换时至少有S-N台处理机不工作而造成系统效率极低。本文提出一种可以在“虚共存细胞结构纵横加工向量机”(以下简称“虚共存机系统”)上实现的高效并行算法,使系统效率大大提高。  相似文献   

10.
二阶椭圆型微分方程边值问题的数值求解在实践中具有重要的意义。当用差分法解这类问题时,结果就要求解一类线性代数方程组,这类方程组的系数矩阵具有一些特殊的结构和性质。以矩形区域上的二维问题为例,若用矩形网格,节点按自然次序编号,用通常的五点格式所得方程组的系数矩阵是块三对角的。用“矩阵追赶法”解这类问题效果很差,即计算量和存储量相当大而精度差。问题在于,这种解法中有许多矩阵求逆运算,而这些矩阵中有些可能是病态的。矩阵追赶法的一些变形(见[2]、[3]等),结果也常归结到一个病态方程组的求解,因而大大影响精度。同时,仍有要求存储量大和计算过程不稳定等缺点。用Gauss主元消去法或Crout方法等,由于非零元素的大量充入,破坏原来矩阵的稀疏性,使存储量增大。  相似文献   

11.
PeiZong Lee 《Parallel Computing》1995,21(12):1895-1923
It is widely accepted that distributed memory parallel computers will play an important role in solving computation-intensive problems. However, the design of an algorithm in a distributed memory system is time-consuming and error-prone, because a programmer is forced to manage both parallelism and communication. In this paper, we present techniques for compiling programs on distributed memory parallel computers. We will study the storage management of data arrays and the execution schedule arrangement of Do-loop programs on distributed memory parallel computers. First, we introduce formulas for representing data distribution of specific data arrays across processors. Then, we define communication cost for some message-passing communication operations. Next, we derive a dynamic programming algorithm for data distribution. After that, we show how to improve the communication time by pipelining data, and illustrate how to use data-dependence information for pipelining data. Jacobi's iterative algorithm and the Gauss elimination algorithm for linear systems are used to illustrate our method. We also present experimental results on a 32-node nCUBE-2 computer.  相似文献   

12.
New books     
Two variants of the Gauss-Jordan type methods, the Purcell method and the Gauss -Huard method, solve a linear system with a similar number of flop counts to the Gauss elimination method (i.e., less than the basic Gauss-Jordan method). In this paper, we first show that the unpivoted versions of the two variants are actually equivalent and their pivoted versions are in general different. Then we demonstrate how one can reproduce a Gauss-Huard method with partial pivoting by modifying the Purcell method and conversely recover the Purcell method by modifying the Gauss-Huard method. It turns out that the latter relationship gives rise to an efficient variant of the Purcell algorithm in terms of memory and an improved variant of the Gauss-Huard method in terms of a smaller growth factor (stability). Some parallel algorithms and results are presented for the new variant  相似文献   

13.
This paper presents a parallel algorithm for computing the inversion of a dense matrix based on modified Jordan's elimination which requires fewer calculation steps than the standard one. The algorithm is proposed for the implementation on the linear array with a small to moderate number of processors which operate in a parallel-pipeline fashion. A communication between neighboring processors is achieved by a common memory module implemented as a FIFO memory module. For the proposed algorithm we define a task scheduling procedure and prove that it is time optimal. In order to compute the speedup and efficiency of the system, two definitions (Amdahl's and Gustafson's) were used. For the proposed architecture, involving two to 16 processors, estimated Gustafson's (Amdahl's) speedups are in the range 1.99 to 13.76 (1.99 to 9.69).  相似文献   

14.
总线互连机群系统上的静态任务调度   总被引:1,自引:1,他引:0  
与大规模并行处理(MPP)系统相比,基于总线互连的机群系统是一种较为廉价的并行计算环境,文中提出了一个基于总线互连机群系统上的静态任务调度算法。在该算法具有3个主要特点:(1)由于不同处理机之间的通信都必须通过共享总线,故在调度时将总线与处理机一些看成是资源加以分配;(2)针对总线适合于广播的特点,在调度中考虑了广播,地于某些应用而言可以大大通信次数,(3)在确定任务在某个处理机上的开始执行时间以  相似文献   

15.
This paper considers elimination methods to solve dense linear systems, in particular a variant of Gaussian elimination due to Huard [13]. This variant reduces the system to an equivalent diagonal system just like Gauss-Jordan elimination, but does not require more floating-point operations than Gaussian elimination. To preserve stability, a pivoting strategy using column interchanges, proposed by Hoffmann [10], is incorporated in the original algorithm. An error analysis is given showing that Huard’s elimination method is as stable as Gauss-Jordan elimination with the appropriate pivoting strategy. This result is proven in a similar way as the proof of stability for Gauss-Jordan elimination given in [4]. Numerical experiments are reported which verify the theoretical error analysis of the Gauss-Huard algorithm.  相似文献   

16.
基于PVM的稠密线性方程组网上并行求解   总被引:4,自引:1,他引:3  
将求解线性方程组的Gauss-Jordan消去法与Gauss列主元消去法结合起来,提出了利用并行计算支撑软件PVM在局域网上高效并行求解稠密线性方程组的算法.该算法处理机间的通信开销较少,实现了负载平衡和各处理机间的全并行工作.用1~24台桌面PC机按两种网络布局方式连接成的局域网,在PVM3.4 on Windows2000、VC 6.0并行计算平台上编程对该算法进行了数值试验,得到了正确的结果.  相似文献   

17.
Gaussian elimination is a canonical linear algebra procedure for solving linear systems of equations. In the last few years, the algorithm has received a lot of attention in an attempt to improve its parallel performance. This article surveys recent developments in parallel implementations of Gaussian elimination for shared memory architecture. Five different flavors are investigated. Three of them are based on different strategies for pivoting: partial pivoting, incremental pivoting, and tournament pivoting. The fourth one replaces pivoting with the Partial Random Butterfly Transformation, and finally, an implementation without pivoting is used as a performance baseline. The technique of iterative refinement is applied to recover numerical accuracy when necessary. All parallel implementations are produced using dynamic, superscalar, runtime scheduling and tile matrix layout. Results on two multisocket multicore systems are presented. Performance and numerical accuracy is analyzed. Copyright © 2014 John Wiley & Sons, Ltd.  相似文献   

18.
Molecular dynamics simulation is a class of applications that require reducing the execution time of fixed-size problems. This reduction in execution time is important to drug design and protein interaction studies. Many implementations of parallel molecular dynamics have been developed, but very little work has addressed issues related to the use of machines with 50,000 processors for modest-sized problems in the range of 50,000 atoms. Current massively parallel machines present a major obstacle to achieving good performance:communication overhead. In this paper we quantify the communication latency and network bandwidth necessary to achieve 30–40% efficiency on future message-passing machines with sizes on the order of tens of thousands of processors, for executing molecular dynamics problems with the same order of atoms. We derive an analytical model of a benchmark application that simulates a system of helium atoms executing on the Intel Touchstone Delta using an interaction decomposition method. This model is validated and used to extrapolate information on the startup time and network bandwidth. The results indicate that for an MPP with a four-dimensional mesh topology using 400 MHz processors, the communication startup time must be at most 30 clock cycles and the network bandwidth at least 2.3 GB/s. This configuration results in 30–40% efficiency of the MPP for a problem with 50,000 atoms executing on 50,000 processors.  相似文献   

19.
为了解决串行部分选主元的高斯消去算法不能充分利用多核处理器的问题,提出并实现了并行多线程的部分选主元的高斯消去算法,并将整个算法进行了分析和优化,使数据的存储布局和算法的访存模式匹配,从而大幅提高了程序的性能。通过对本地Linux服务器以及美国亚马逊EC2云的多种平台上的实验结果的比较和分析,确定了部分选主元的高斯消去算法受缓存影响较大,所以在CPU和内存/缓存配置较为均衡的平台上运行性能最好。文中展现了一种高效率、扩展性好的多线程并行部分选主元的高斯消去算法以及将一般性串行算法进行并行化和优化的方法。  相似文献   

20.
Task graph pre-scheduling, using Nash equilibrium in game theory   总被引:1,自引:1,他引:0  
Prescheduling algorithms are targeted at restructuring of task graphs for optimal scheduling. Task graph scheduling is a NP-complete problem. This article offers a prescheduling algorithm for tasks to be executed on the networks of homogeneous processors. The proposed algorithm merges tasks to minimize their earliest start time while reducing the overall completion time. To this end, considering each task as a player attempting to reduce its earliest time as much as possible, we have applied the idea of Nash equilibrium in game theory to determine the most appropriate merging. Also, considering each level of a task graph as a player, seeking for distinct parallel processors to execute each of its independent tasks in parallel with the others, the idea of Nash equilibrium in game theory can be applied to determine the appropriate number of processors in a way that the overall idle time of the processors is minimized and the throughput is maximized. The communication delay will be explicitly considered in the comparisons. Our experiments with a number of known benchmarks task graphs and also two well-known problems of linear algebra, LU decomposition and Gauss–Jordan elimination, demonstrate the distinguished scheduling results provided by applying our algorithm. In our study, we consider ten scheduling algorithms: min–min, chaining, A ?, genetic algorithms, simulated annealing, tabu search, HLFET, ISH, DSH with task duplication, and our proposed algorithm (PSGT).  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号