1.
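A strategy is proposed for the parallel solution of large-scale heat conduction equations. Distributed memory and compressed matrix storage are used to store and operate on very large sparse matrices; several Krylov subspace methods and preconditioners are integrated to solve the large linear systems in parallel; and an object-oriented design keeps the specific application loosely coupled from the algorithms. Performance tests on a Linux cluster show that the program achieves good speedup and computational performance.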
2.
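The heat conduction equation is widely used in engineering computations such as numerical simulation of groundwater flow and reservoir simulation. Parallel implementation is an important means of accelerating the solution and increasing the tractable problem size, so the parallel solution of the heat conduction equation is of great significance. The CG and GMRES algorithms among the Krylov subspace methods are analyzed for parallelism, and different preconditioned CG algorithms are compared. Numerical experiments are carried out on a Linux cluster using a three-dimensional heat conduction model. The results show that CG is better suited than GMRES to the parallel solution of the three-dimensional heat conduction model. Moreover, when CG is combined with the BJACOBI preconditioner to solve this model, the parallel program achieves good speedup and efficiency. CG with BJACOBI preconditioning is therefore a good parallel scheme for solving the three-dimensional heat conduction model.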
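The combination named in this abstract, CG with a block Jacobi preconditioner, can be illustrated with a small serial sketch. This is not the authors' parallel implementation: the 1D Laplacian test matrix, the block size of 8, and the helper names `block_jacobi_inverse`, `pcg`, and `laplacian_1d` are all assumptions chosen for illustration.

```python
import numpy as np

def block_jacobi_inverse(A, block_size):
    """Build a function that applies the inverse of A's diagonal blocks."""
    n = A.shape[0]
    blocks = []
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        # Each diagonal block is inverted independently of the others.
        blocks.append((start, stop, np.linalg.inv(A[start:stop, start:stop])))

    def apply(r):
        z = np.empty_like(r)
        for start, stop, inv in blocks:
            z[start:stop] = inv @ r[start:stop]
        return z

    return apply

def pcg(A, b, apply_M_inv, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for symmetric positive definite A."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_M_inv(r)
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = apply_M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

def laplacian_1d(n):
    """Hypothetical test matrix: 1D discrete Laplacian (SPD, tridiagonal)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

A = laplacian_1d(64)
b = np.ones(64)
x, iters = pcg(A, b, block_jacobi_inverse(A, 8))
print("iterations:", iters, "residual:", np.linalg.norm(b - A @ x))
```

Block Jacobi is attractive in a distributed setting because each diagonal block can be factored and applied by the process that owns it, with no communication inside the preconditioner step.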
3.
4.
5.
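A survey of preconditioning techniques for solving sparse linear systems  Total citations: 1, self-citations: 0, citations by others: 1
Efficient solution of sparse linear systems is one of the active research topics in numerical computing, and it includes the study of preconditioning techniques. This paper summarizes preconditioning techniques for sparse linear systems from the viewpoint of technique classification. First, fill-in reduction strategies are introduced, which aim to reduce storage during the solve while preserving the sparse structure of the matrix. Second, various matching techniques for coefficient matrices with different structures are introduced, which aim to obtain diagonal dominance. Finally, construction methods for factorized approximate inverse preconditioners, which are naturally parallel, and parallel solution techniques for incomplete factorization preconditioners are introduced.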
6.
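Each half-bridge submodule of a modular multilevel converter (MMC) consists of two switch groups (each one IGBT with one antiparallel diode). When an MMC contains a large number of submodules, simulating the submodules obtained by partitioning the MMC with the circuit-model partitioning method still consumes considerable resources and is inefficient. To address this, a simulation and verification method for MMC half-bridge submodules based on a numerical computation model is proposed. First, by analyzing the operating mechanism of the three-phase MMC and its half-bridge submodule (HBSM), the two switch groups in the half-bridge submodule are represented as equivalent resistors that switch between high- and low-resistance states, and the equivalent circuit is given. Then, for the discretization of the capacitor branch, the numerical solution formula of the MMC half-bridge submodule is derived using the trapezoidal integration rule, and the numerical computation circuit model is given. Finally, a simulation and verification model of the half-bridge submodule based on the numerical computation model is built on the MATLAB simulation platform. Comparison with the simulated waveforms of the detailed submodule model shows that the proposed numerical computation model is feasible.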
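The trapezoidal discretization of a capacitor branch mentioned in this abstract can be sketched on a much simpler circuit. The example below is an assumption for illustration only, not the paper's submodule model: a single resistor charging a capacitor from a DC source, with the capacitor replaced at each step by an equivalent conductance `g_eq = 2C/h` and a history current. The function name and component values are hypothetical.

```python
import math

def simulate_rc_trapezoidal(v_src, r, c, h, steps):
    """Charge a capacitor through a resistor using the trapezoidal rule.

    The capacitor law i = C dv/dt is discretized as
        i[n+1] = g_eq * v[n+1] - i_hist,  with  i_hist = g_eq * v[n] + i[n],
    so each step reduces to solving a purely resistive circuit.
    """
    g_eq = 2.0 * c / h          # equivalent conductance of the capacitor
    v = 0.0                     # capacitor starts uncharged
    i = v_src / r               # initial current through the capacitor
    voltages = [v]
    for _ in range(steps):
        i_hist = g_eq * v + i
        # Node equation: (v_src - v_next) / r = g_eq * v_next - i_hist
        v_next = (v_src / r + i_hist) / (1.0 / r + g_eq)
        i = g_eq * v_next - i_hist
        v = v_next
        voltages.append(v)
    return voltages

# Hypothetical values: 1 V source, 1 kOhm, 1 uF, 10 us step, one time constant.
volts = simulate_rc_trapezoidal(1.0, 1e3, 1e-6, 1e-5, 100)
exact = 1.0 - math.exp(-1.0)
print(volts[-1], exact)
```

Modelling a switch group as a resistor that toggles between a high and a low value, as the abstract describes, would change `r` from step to step while leaving this update unchanged in form.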
7.
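Chip-level interconnect capacitance extraction is implemented using a bidirectional overlapping-region combination method based on the three-dimensional hierarchical block boundary element method. The method partitions the chip into a large number of small regions and computes the capacitance matrix of each subregion with a global field solver, so that the capacitance matrix of the whole chip can be assembled easily. Its computational cost and accuracy are analyzed, and parallel computing experiments are carried out. Numerical experiments on real layout structures confirm the analysis and show that the method is efficient, reliable, and parallelizes well.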
8.
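To address the low efficiency and single-machine memory limits of traditional serial iterative methods for the Helmholtz equation with large wavenumbers, a parallel preconditioned iterative method based on the Message Passing Interface (MPI) is proposed. The algorithm preconditions the Helmholtz equation with the complex shifted Laplacian operator and combines the stabilized biconjugate gradient method with a matrix-based multigrid method to solve the large linear system that arises from discretizing the preconditioned equation. The solver is parallelized in an MPI environment on a Linux cluster, with emphasis on the parallel partitioning of the multigrid, the message passing, and the construction of the multigrid components. Numerical experiments show that for large-wavenumber problems the proposed algorithm achieves good parallel speedup and greatly improves computational efficiency compared with the serial algorithm.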
9.
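Research on parallel algorithms for very large-scale nonlinear systems of equations on the Dawning (曙光) parallel computer  Total citations: 8, self-citations: 0, citations by others: 8
This paper discusses the parallel performance of a class of algorithms for solving large-scale nonlinear systems of equations and their implementation on the Dawning parallel computer. Unlike traditional algorithms, a block diagonal matrix is used as the iteration matrix, and this matrix can be computed easily from a recurrence that involves only vector inner products and matrix-vector products. After describing the algorithm, the paper analyzes its parallel speedup and storage requirements and discusses its implementation flow in the message-passing MPI parallel environment. Numerical computations show that the theoretical analysis agrees with the numerical results: in a distributed parallel environment the algorithm has good parallel performance and low storage requirements, and it is suitable for large-scale high-performance scientific and engineering computing.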
10.
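A parallelization method based on domain decomposition is proposed for building the stiffness matrix and right-hand-side vector in isogeometric analysis. The computational domain is decomposed into several disjoint subdomains, and each subdomain is assigned to one processor. All processors compute on their subdomains in parallel; once they have finished, a fast merge algorithm assembles the linear system. Experiments show that the proposed method achieves a speedup of 6.46 on an 8-core machine and computes 6.8 million matrix entries in about 4 seconds. Using the Intel MKL sparse solver for the linear system, the isogeometric analysis solver can solve a system with 520,000 degrees of freedom in about 10 seconds, and the method is on the order of ten thousand times faster than ISOGAT.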
11.
Moment stability for linear systems with a nonwhite parametric noise is considered. A method of reduction of the study of this stability to the study of stability for large-scale matrices is proposed. Mean square stability diagrams for a random harmonic oscillator are presented.
12.
13.
《Computer Methods in Applied Mechanics and Engineering》1986,54(1):75-91
The formulation for the dynamic analysis of undamped linear structural systems using the finite element method results in two element matrices, the mass and stiffness matrices, that describe the element inertia and stiffness properties. However, these matrices are not sufficient to describe the dynamics of structures that undergo large rigid-body motion. Other element matrices, in addition to the mass and stiffness matrices, are required to account for the inertia coupling between gross motion and elastic deformation. These matrices are time-invariant and can be generated and assembled in the same manner as the mass and stiffness matrices are assembled in linear structural dynamics. An inherent relation between these matrices and the deformable body mean axes exists. This paper is the first of two parts. It presents the two-dimensional and three-dimensional formulation of the system equations of motion of inertia-variant flexible bodies. In particular, Euler parameters are employed to describe the rotations of the body reference in the spatial analysis. In Part II [13], this formulation is applied to the impact analysis of a large-scale constrained flexible aircraft, which is modeled as a multi-body system consisting of interconnected rigid and flexible components.
14.
Martin S. Andersen Joachim Dahl Lieven Vandenberghe 《Optimization methods & software》2013,28(3):396-423
Algorithms are presented for evaluating gradients and Hessians of logarithmic barrier functions for two types of convex cones: the cone of positive semidefinite matrices with a given sparsity pattern and its dual cone, the cone of sparse matrices with the same pattern that have a positive semidefinite completion. Efficient large-scale algorithms for evaluating these barriers and their derivatives are important in interior-point methods for nonsymmetric conic formulations of sparse semidefinite programs. The algorithms are based on the multifrontal method for sparse Cholesky factorization.
15.
Xiaoyan LIU Yi LIU Bohong YIN Hailong YANG Zhongzhi LUAN Depei QIAN 《Frontiers of Computer Science》2023,17(4):174104
Although matrix multiplication plays an essential role in a wide range of applications, previous works only focus on optimizing dense or sparse matrix multiplications. The Sparse Approximate Matrix Multiply (SpAMM) is an algorithm to accelerate the multiplication of decay matrices, the sparsity of which is between dense and sparse matrices. In addition, large-scale decay matrix multiplication is performed in scientific applications to solve cutting-edge problems. To optimize large-scale decay matrix multiplication using SpAMM on supercomputers such as Sunway Taihulight, we present swSpAMM, an optimized SpAMM algorithm that adapts the computation characteristics to the architecture features of Sunway Taihulight. Specifically, we propose both intra-node and inter-node optimizations to accelerate swSpAMM for large-scale execution. For intra-node optimizations, we explore algorithm parallelization and a block-major data layout that are tailored to better utilize the architectural advantages of the Sunway processor. For inter-node optimizations, we propose a matrix organization strategy for better distributing sub-matrices across nodes and a dynamic scheduling strategy for improving load balance across nodes. We compare swSpAMM with the existing GEMM library on a single node as well as large-scale matrix multiplication methods on multiple nodes. The experiment results show that swSpAMM achieves a speedup of up to 14.5× and 2.2× when compared to the xMath library on a single node and the 2D GEMM method on multiple nodes, respectively.
16.
Parallel factor analysis (PARAFAC) is a tensor (multiway array) factorization method that makes it possible to find hidden factors (component matrices) in multidimensional data. Most of the existing algorithms for PARAFAC, especially the alternating least squares (ALS) algorithm, need to compute Khatri-Rao products of tall factors and multiplications of large matrices; as a result they require high computational cost and large memory and are not suitable for very large-scale problems. Hence, PARAFAC for large-scale data tensors is still a challenging problem. In this paper, we propose a new approach based on a modified ALS algorithm which computes Hadamard products instead of Khatri-Rao products and employs relatively small matrices. The new algorithms are able to process extremely large-scale tensors with billions of entries. Extensive experiments confirm the validity and high performance of the developed algorithm in comparison with other well-known algorithms.
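One standard identity shows why Hadamard products can stand in for Khatri-Rao products in ALS-type updates: the Gram matrix of a Khatri-Rao product equals the Hadamard product of the factors' Gram matrices. The sketch below checks this numerically. It is not taken from the paper and may differ from the modification the authors propose; the helper `khatri_rao` and the matrix sizes are assumptions.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: column r equals kron(A[:, r], B[:, r])."""
    rows_a, rank = A.shape
    rows_b, rank_b = B.shape
    assert rank == rank_b, "factors must have the same number of columns"
    # Entry [i, j, r] is A[i, r] * B[j, r]; flattening (i, j) gives row i*rows_b + j.
    return (A[:, None, :] * B[None, :, :]).reshape(rows_a * rows_b, rank)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))   # hypothetical factor matrices
B = rng.standard_normal((5, 3))

KR = khatri_rao(A, B)                     # tall matrix of shape (20, 3)
gram_direct = KR.T @ KR                   # formed from the tall product
gram_hadamard = (A.T @ A) * (B.T @ B)     # formed from small 3 x 3 matrices only

print(np.allclose(gram_direct, gram_hadamard))
```

The right-hand form never materializes the tall product, whose row count grows as the product of the tensor's mode sizes, and that is where the memory saving comes from.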
17.
This note presents a general method which reduces the computational requirements in the state feedback design of large-scale multivariable systems. The given system is first transformed into a general block canonical form by using a simple equivalent transformation. The state feedback problem is then reformulated in terms of a Sylvester equation. Finally, the transformed system matrices along with certain assumed block forms for unknown matrices enable the Sylvester equation to be decomposed and solved effectively.
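For readers unfamiliar with Sylvester equations, the sketch below solves the standard form AX + XB = C by Kronecker vectorization. The abstract does not state the exact form of its equation, so the standard form is an assumption, and this dense approach is only practical for small matrices; it is not the block decomposition the note describes. The function name `solve_sylvester_dense` and the test matrices are hypothetical.

```python
import numpy as np

def solve_sylvester_dense(A, B, C):
    """Solve A X + X B = C using vec(A X + X B) = (I (x) A + B^T (x) I) vec(X).

    vec() stacks columns, hence order="F" in the reshapes below.
    A unique solution exists when A and -B share no eigenvalue.
    """
    n, m = C.shape
    K = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))
    x = np.linalg.solve(K, C.reshape(-1, order="F"))
    return x.reshape((n, m), order="F")

rng = np.random.default_rng(0)
# Shifted matrices keep the spectra of A and -B well separated.
A = 5.0 * np.eye(3) + 0.1 * rng.standard_normal((3, 3))
B = 5.0 * np.eye(2) + 0.1 * rng.standard_normal((2, 2))
C = rng.standard_normal((3, 2))

X = solve_sylvester_dense(A, B, C)
print("residual:", np.linalg.norm(A @ X + X @ B - C))
```

The Kronecker system has size nm by nm, which is the cost that structured approaches such as the block canonical form in the abstract aim to avoid.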
18.
19.
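The theory and applications of large-scale systems have developed considerably over the past decade or so. This paper studies the stability of such systems. First, a simple geometric criterion is given for the stability of time-varying linear systems; then a stability criterion for large-scale systems is established. Finally, the structure of large-scale systems is taken into account, yielding a simplified stability criterion.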