期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

杨林峰李陶深李捷陈燕《计算机科学》2011,38(9):204-207

基于内点算法((Interior Point Method,IPM)框架,导出具有分块带边结构系数矩阵的线性规划(Linear Programming, I_P)问题的简化和最简修正方程,并证明最简修正方程的对角分块具有正定性。结合正定矩阵的Cholcsky分解和解藕技术设计了修正方程的并行求解方法,给出了LP的并行内点算法结构。集群环境下的数值实验表明,所提算法具有很好的加速比和可扩展性,适合求解大规模结构化工尹问题。相似文献

2.

An Efficient Algorithm for Matrix-Valued and Vector-Valued Optimal Mass Transport

Yongxin Chen Eldad Haber Kaoru Yamamoto Tryphon T. Georgiou Allen Tannenbaum 《Journal of scientific computing》2018,77(1):79-100

We present an efficient algorithm for recent generalizations of optimal mass transport theory to matrix-valued and vector-valued densities. These generalizations lead to several applications including diffusion tensor imaging, color image processing, and multi-modality imaging. The algorithm is based on sequential quadratic programming. By approximating the Hessian of the cost and solving each iteration in an inexact manner, we are able to solve each iteration with relatively low cost while still maintaining a fast convergence rate. The core of the algorithm is solving a weighted Poisson equation, where different efficient preconditioners may be employed. We utilize incomplete Cholesky factorization, which yields an efficient and straightforward solver for our problem. Several illustrative examples are presented for both the matrix and vector-valued cases. 相似文献

3.

Large capacity equation solver for structural analysis

Digambar P. Mondkar Graham H. Powell 《Computers & Structures》1974,4(4):699-728

This paper extends the procedures and ideas presented in an earlier paper [1] to the out-of-core solution of sets of equations of almost unlimited size. Block partitioning of the coefficient matrix and load vectors is considered and an illustrative example presented. The data transfer procedures, for input to or output from low speed storage, which are essential for an equation solver of this type, are explained in detail. Based on these procedures, a listing of a large capacity general purpose equation solver is presented. Example problems are solved to demonstrate the computational efficiency of the equation solver. 相似文献

4.

A solution algorithm for linear constraint equations in finite element analysis

J.I. Curiskis S. Valliappan 《Computers & Structures》1978,8(1):117-124

A general solution algorithm is presented for the incorporation of a general set of linear constraint equations into a linear algebraic system; such situations arise in the application of the finite element method to a variety of physical problems. Implementation of the algorithm, without need for pre-arranging the equations, into an equation solver using Gauss elimination is developed. The method is most attractive as compared to other approaches for constrained systems. 相似文献

5.

A multilevel parallel solver for block tridiagonal and banded linear systems

Ibrahim N. Hajj Stig Skelboe 《Parallel Computing》1990,15(1-3):21-45

This paper describes an efficient algorithm for the parallel solution of systems of linear equations with a block tridiagonal coefficient matrix. The algorithm comprises a multilevel LU-factorization based on block cyclic reduction and a corresponding solution algorithm.

The paper includes a general presentation of the parallel multilevel LU-factorization and solution algorithms, but the main emphasis is on implementation principles for a message passing computer with hypercube topology. Problem partitioning, processor allocation and communication requirement are discussed for the general block tridiagonal algorithm.

Band matrices can be cast into block tridiagonal form, and this special but important problem is dealt with in detail. It is demonstrated how the efficiency of the general block tridiagonal multilevel algorithm can be improved by introducing the equivalent of two-way Gaussian elimination for the first and the last partitioning and by carefully balancing the load of the processors. The presentation of the multilevel band solver is accompanied by detailed complexity analyses.

The properties of the parallel band solver were evaluated by implementing the algorithm on an Intel iPSC hypercube parallel computer and solving a larger number of banded linear equations using 2 to 32 processors. The results of the evaluation include speed-up over a sequential processor, and the measure values are in good agreement with the theoretical values resulting from complexity analysis. It is found that the maximum asymptotic speed-up of the multilevel LU-factorization using p processors and load balancing is approximated well by the expression (p +6)/4.

Finally, the multilevel parallel solver is compared with solvers based on row and column interleaved organization. 相似文献

6.

EM algorithms for Gaussian mixtures with split-and-merge operation

Zhihua ZhangAuthor Vitae Chibiao ChenAuthor VitaeJian SunAuthor Vitae Kap Luk ChanAuthor Vitae 《Pattern recognition》2003,36(9):1973-1983

In order to alleviate the problem of local convergence of the usual EM algorithm, a split-and-merge operation is introduced into the EM algorithm for Gaussian mixtures. The split-and-merge equations are first presented theoretically. These equations show that the merge operation is a well-posed problem, whereas the split operation is an ill-posed problem because it is the inverse procedure of the merge. Two methods for solving this ill-posed problem are developed through the singular value decomposition and the Cholesky decomposition. Accordingly, a new modified EM algorithm is constructed. Our experiments demonstrate that this algorithm is efficient for unsupervised color image segmentation. 相似文献

7.

Towards optimal in-core equation solving

Digambar P. Mondkar Graham H. Powell 《Computers & Structures》1974,4(3):531-548

A number of equation solving algorithms, all based on Gauss elimination, are reviewed. A modified Crout reduction procedure of almost optimal efficiency is presented for the sparsely populated coefficient matrices commonly arising in stiffness analysis of structural systems. A detailed discussion on how this procedure differs from the commonly used Gauss elimination algorithm is given. Operation count formulas of the procedure are derived for various phases of the solution process. An extension of the modified Crout procedure for partial reduction, for use in substructure analysis, is also presented Listings of two general purpose in-core equation solvers using two different storage schemes are given. Examples are presented to demonstrate the computational efficiency of the procedure. 相似文献

8.

求解二维对流扩散方程的投影迭代法

袁冬芳曹富军葛永斌《计算机工程与应用》2013,(4):39-42,86

鉴于目前流行的求解大型稀疏代数方程组的投影迭代法中,为提高迭代效率,在迭代前通常需要对稀疏矩阵进行预处理,改善迭代矩阵的条件数,从而减少迭代次数,这使得发展稀疏矩阵的存储技术变得尤为关键。基于二维对流扩散方程的四阶紧致差分格式,将其转化为代数方程组,得到其三对角块形式的系数矩阵,利用稀疏矩阵存储技术和预条件迭代法进行求解,并与传统的中心差分格式所得数值解进行比较,充分说明了方法的高效性和可靠性。相似文献

9.

Finite iterative algorithms for the generalized Sylvester-conjugate matrix equation {AX+BY=E\overline{X}F+S}

Ai-Guo Wu Guang-Ren Duan Yan-Ming Fu Wei-Jun Wu 《Computing》2010,89(3-4):147-170

This paper investigates the generalized Sylvester-conjugate matrix equation, which includes the normal Sylvester-conjugate, Kalman–Yakubovich-conjugate and generalized Sylvester matrix equations as its special cases. An iterative algorithm is presented for solving such a kind of matrix equations. This iterative method can give an exact solution within finite iteration steps for any initial values in the absence of round-off errors. Another feature of the proposed algorithm is that it is implemented by original coefficient matrices. By specifying the proposed algorithm, iterative algorithms for some special matrix equations are also developed. Two numerical examples are given to illustrate the effectiveness of the proposed methods. 相似文献

10.

On a general optimal algorithm for multirate output feedbackcontrollers for linear stochastic periodic systems

Nie-Zen Yen Yung-Chun Wu 《Automatic Control, IEEE Transactions on》1993,38(6):939-943

A modified optimal algorithm for multirate output feedback controllers of linear stochastic periodic systems is developed. By combining the discrete-time linear quadratic regulation (LQR) control problem and the discrete-time stochastic linear quadratic regulation (SLQR) control problem to obtain an extended linear quadratic regulation (ELQR) control problem, one derives a general optimal algorithm to balance the advantages of the optimal transient response of the LQR control problem and the optimal steady-state regulation of the SLQR control problem. In general, the solution of this algorithm is obtained by solving a set of coupled matrix equations. Special cases for which the coupled matrix equations can be reduced to a discrete-time algebraic Riccati equation are discussed. A reducable case is the optimal algorithm derived by H.M. Al-Rahmani and G.F. Franklin (1990), where the system has complete state information and the discrete-time quadratic performance index is transformed from a continuous-time one 相似文献

11.

A simple equation solver and its application to financial modelling

Emanuel Derman Christopher J. Van Wyk 《Software》1984,14(12):1169-1181

We discuss a simple algorithm for solving sets of simultaneous equations. The algorithm can solve systems of linear and some kinds of non-linear equations, although it has nowhere near the power of a general non-linear equation solver. Its principal advantages over more general algorithms are simplicity and speed. Versions of the algorithm have been used in a graphics language and in a system for interactively modifying the equations that constitute financial models. We discuss the second application in more detail here. 相似文献

12.

二维三温能量方程的Krylov子空间迭代求解 总被引：3，自引：0，他引：3

莫则尧符尚武《数值计算与计算机应用》2003,24(2):133-143

§1.引言惯性约束聚变(ICF)是实现热核聚变的一条重要途径,在ICF的研究中,数值模拟是非常重要的研究手段之一,为了全面、深入和细致地研究ICF的物理过程,特别是内爆动力学过程,文[1]的作者研制了求解二维三温(电子温度、离子温度和光子温度)辐射流体动相似文献

13.

大规模结构化二次规划并行算法

下载免费PDF全文

李捷李陶深杨林峰《计算机工程》2011,37(16):48-50

在内点算法(IPM)框架基础上,分析具有分块带边结构系数矩阵与箭形结构二次项的二次规划(QP)问题,导出其既约与最简既约修正方程.对既约修正方程系数矩阵进行置换,使其具有箭形分块结构,并结合该结构与解耦技术给出修正方程的并行求解算法,设计QP问题的并行IPM结构.在集群环境下的数值实验结果表明,该算法具有较好的加速比和... 相似文献

14.

Gradient-based maximal convergence rate iterative method for solving linear matrix equations

《国际计算机数学杂志》2012,89(3):515-527

This paper is concerned with numerical solutions to general linear matrix equations including the well-known Lyapunov matrix equation and Sylvester matrix equation as special cases. Gradient based iterative algorithm is proposed to approximate the exact solution. A necessary and sufficient condition guaranteeing the convergence of the algorithm is presented. A sufficient condition that is easy to compute is also given. The optimal convergence factor such that the convergence rate of the algorithm is maximized is established. The proposed approach not only gives a complete understanding on gradient based iterative algorithm for solving linear matrix equations, but can also be served as a bridge between linear system theory and numerical computing. Numerical example shows the effectiveness of the proposed approach. 相似文献

15.

Forward and back substitution algorithms on GPU: a case study on modified incomplete Cholesky Preconditioner for three-dimensional finite difference method

Yigitcan Aksari Harun Artuner 《The Journal of supercomputing》2012,62(1):550-572

Forward and back substitution algorithms are widely used for solving linear systems of equations after performing LU decomposition on the coefficient matrix. They are also essential in the implementation of high performance preconditioners which improve the convergence properties of the various iterative methods. In this paper, we describe an efficient approach to implementing forward and back substitution algorithms on a GPU and provide the implementation details of these algorithms on a Modified Incomplete Cholesky Preconditioner for the Conjugate Gradient (CG) algorithm. The resulting forward and back substitution algorithms are then used on a Modified Incomplete Cholesky Preconditioned Conjugate Gradient method to solve the sparse, symmetric, positive definite and linear systems of equations arising from the discretization of three dimensional finite difference ground-water flow models. By utilizing multiple threads, the proposed method yields speedups up to 60 times on GeForce GTX 280 compared to CPU implementation and up to 4.8 times speedup compared to cuSPARSE library function optimized for GPU by NVIDIA. 相似文献

16.

Supernodal sparse Cholesky factorization on graphics processing units

Dan Zou Yong Dou Song Guo Rongchun Li Lin Deng 《Concurrency and Computation》2014,26(16):2713-2726

Sparse Cholesky factorization is the most computationally intensive component in solving large sparse linear systems and is the core algorithm of numerous scientific computing applications. A large number of sparse Cholesky factorization algorithms have previously emerged, exploiting architectural features for various computing platforms. The recent use of graphics processing units (GPUs) to accelerate structured parallel applications shows the potential to achieve significant acceleration relative to desktop performance. However, sparse Cholesky factorization has not been explored sufficiently because of the complexity involved in its efficient implementation and the concerns of low GPU utilization. In this paper, we present a new approach for sparse Cholesky factorization on GPUs. We present the organization of the sparse matrix supernode data structure for GPU and propose a queue‐based approach for the generation and scheduling of GPU tasks with dense linear algebraic operations. We also design a subtree‐based parallel method for multi‐GPU system. These approaches increase GPU utilization, thus resulting in substantial computational time reduction. Comparisons are made with the existing parallel solvers by using problems arising from practical applications. The experiment results show that the proposed approaches can substantially improve sparse Cholesky factorization performance on GPUs. Relative to a highly optimized parallel algorithm on a 12‐core node, we were able to obtain speedups in the range 1.59× to 2.31× by using one GPU and 1.80× to 3.21× by using two GPUs. Relative to a state‐of‐the‐art solver based on supernodal method for CPU‐GPU heterogeneous platform, we were able to obtain speedups in the range 1.52× to 2.30× by using one GPU and 2.15× to 2.76× by using two GPUs. Concurrency and Computation: Practice and Experience, 2013. Copyright © 2013 John Wiley & Sons, Ltd. 相似文献

17.

B样条曲线的降阶公式及近似降阶方法 总被引：8，自引：0，他引：8

潘日晶姚志强潘日红《计算机学报》2003,26(10):1255-1260

已有的B样条曲线降阶方法，由于无降阶公式可循，对于可降阶曲线常要通过解一系列线性方程组来实现降阶．该文给出了B样条曲线的降阶公式，使得可直接用降阶公式对可降阶曲线进行降阶．利用降阶公式和约束优化方法，文中进一步给出了B样条曲线的一种近似降阶方法和相应的算法，使得在用约束优化方法求出可降阶的近似曲线后，就可直接用降阶公式求出降阶曲线，简化了降阶过程．该方法应用范围广且简单实用．相似文献

18.

基于GPU-CUDA的共轭斜量法实现及性能对比

彭土有《计算机时代》2014,(4):4-6

偏微分方程数值解法(包括有限差分法、有限元法)以及大量的数学物理方程数值解法最终都会演变成求解大型线性方程组。因此,探讨快速、稳定、精确的大型线性方程组解法一直是数值计算领域不断深入研究的课题且具有特别重要的意义。在迭代法中,共轭斜量法(又称共轭梯度法)被公认为最好的方法之一。但是,该方法最大缺点是仅适用于线性方程组系数矩阵为对称正定矩阵的情况,而且常规的CPU算法实现非常耗时。为此,通过将线性方程组系数矩阵作转换成对称矩阵后实施基于GPU-CUDA的快速共轭斜量法来解决一般性大型线性方程组的求解问题。试验结果表明:在求解效率方面,基于GPU-CUDA的共轭斜量法运行效率高,当线性方程组阶数超过3000时,其加速比将超过14;在解的精确性与求解过程的稳定性方面,与高斯列主元消去法相当。基于GPU-CUDA的快速共轭斜量法是求解一般性大型线性方程组快速而非常有效的方法。相似文献

19.

An implicit compact scheme solver for two-dimensional multicomponent flows

Mikhail Noskov Michele Benzi Mitchell D. Smooke 《Computers & Fluids》2007,36(2):376-397

A 2D implicit compact scheme solver has been implemented for the vorticity-velocity formulation in the case of nonreacting, multicomponent, axisymmetric, low Mach number flows. To stabilize the discrete boundary value problem, two sets of boundary closures are introduced to couple the velocity and vorticity fields. A Newton solver is used for solving steady-state and time-dependent equations. In this solver, the Jacobian matrix is formulated and stored in component form. To solve the system of linearized equations within each iteration of Newton’s method, preconditioned Bi-CGSTAB is used in combination with a matrix-vector product computed in component form. The almost dense Jacobian matrix is approximated by a partial Jacobian. For the preconditioner equation, the partial Jacobian is approximately factored using several methods. In a detailed study of several preconditioning techniques, a promising method based on ILUT preconditioning in combination with reordering and double scaling using the MC64 algorithm by Duff and Koster is selected. To validate the implicit compact scheme solver, several nonreacting model problems have been considered. At least third order accuracy in space is recovered on nonuniform grids. A comparison of the results of the implicit compact scheme solver with the results of a traditional implicit low order solver shows an order of magnitude reduction of computer memory and time using the compact scheme solver in the case of time-dependent mixing problems. 相似文献

20.

Quadratic programming solver for structural optimisation using SQP algorithm

《Advances in Engineering Software》2002,33(7-10):669-674

The dual quadratic programming algorithm of Goldfarb and Idnani is implemented as a solver for a sequential quadratic programming algorithm. Initially the algorithm is briefly described. As the algorithm requires the inverse of the Cholesky factor of the Hessian matrix at each iteration a procedure is presented to directly obtain a matrix that multiplied by its transpose gives the BFGS update of the Hessian. A procedure is then presented to triangularise the updated factor using two series of Givens rotations. In order to increase efficiency a ‘warm start’ strategy is proposed whereby the choice of constraints to enter the active set is based on information of previous SQP iterations. Finally two examples are given to demonstrate the efficiency and robustness of the implementation. 相似文献