
1.

A massively parallel geometric multigrid solver on hierarchically distributed grids

Sebastian Reiter Andreas Vogel Ingo Heppner Martin Rupp Gabriel Wittum 《Computing and Visualization in Science》2013,16(4):151-164


2.

A highly parallel Black–Scholes solver based on adaptive sparse grids

《International Journal of Computer Mathematics》2012,89(9):1212-1238

In this paper, we present a highly efficient approach for numerically solving the Black–Scholes equation in order to price European and American basket options. To this end, hardware features of contemporary high-performance computer architectures, such as non-uniform memory access and hardware threading, are exploited by a hybrid parallelization using MPI and OpenMP, which drastically reduces the computing time. In this way, we achieve very good speed-ups and are able to price baskets with up to six underlyings. Our approach is based on a sparse grid discretization with finite elements and makes use of sophisticated adaptivity. The resulting linear system is solved by a conjugate gradient method that uses a parallel operator for applying the system matrix implicitly. Since we exploit all levels of the operator's parallelism, we are able to benefit from the compute power of more than 100 cores. Several numerical examples as well as an analysis of the performance for different computer architectures are provided.
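The idea of a conjugate gradient solver that applies the system matrix implicitly can be illustrated with a minimal serial sketch. It is not the authors' sparse grid operator: the function names `conjugate_gradient` and `apply_laplacian_1d` are hypothetical, and a 1D tridiagonal stencil stands in for the real operator purely for illustration.

```python
def conjugate_gradient(apply_A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A, given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # stop once the residual norm is small
            break
        Ap = apply_A(p)              # the only place the operator is needed
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def apply_laplacian_1d(v):
    """Matrix-free action of tridiag(-1, 2, -1); the matrix is never stored."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


# Stand-in problem (an assumption for illustration): the exact solution is all ones.
x = conjugate_gradient(apply_laplacian_1d, [1.0, 0.0, 0.0, 1.0])
```

Because the solver only ever calls `apply_A`, any parallelization effort can be concentrated in that one operator, which is the property the abstract relies on.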

3.

A MIMD implementation of a parallel Euler solver for unstructured grids

V. Venkatakrishnan Horst D. Simon Timothy J. Barth 《The Journal of supercomputing》1992,6(2):117-137

A mesh-vertex finite volume scheme for solving the Euler equations on triangular unstructured meshes is implemented on a MIMD (multiple instruction/multiple data stream) parallel computer. Three partitioning strategies for distributing the work load onto the processors are discussed. Issues pertaining to the communication costs are also addressed. We find that the spectral bisection strategy yields the best performance. The performance of this unstructured computation on the Intel iPSC/860 compares very favorably with that on a one-processor CRAY Y-MP/1 and an earlier implementation on the Connection Machine. The authors are employees of Computer Sciences Corporation. This work was funded under contract NAS 2-12961.

4.

Assessment of grid interface treatments for multi-block incompressible viscous flow computation

J. Liu W. Shyy 《Computers & Fluids》1996,25(8):719-740

In the multi-block computation of the Navier-Stokes equations, the interface treatment is a key issue. In the present work, we investigate this issue in the context of a pressure-based method using a non-orthogonal grid. For the momentum equations, a straightforward bilinear interpolation seems satisfactory as the interface treatment; on the other hand, because the pressure field depends on the satisfaction of the mass continuity equation, a conservative interface treatment has been found necessary for the pressure-correction equation. Two alternative interface treatments for the pressure-correction equation are assessed in the present work: one employs the Neumann boundary condition in both grid blocks, based on explicit local, cell-by-cell mass flux conservation; the other uses Neumann-Dirichlet boundary conditions, allowing the interface condition in one block to be derived by interpolating the pressure field from the adjacent block. To evaluate these interface schemes, the laminar flow inside a lid-driven cavity and the turbulent flow around cascades of multiple airfoils have been investigated. For the cases tested, both interface treatments give comparable accuracy. The finding that more than one type of interface treatment can work well allows one to devise a flexible multi-block strategy for complex flow computations.

5.

A parallel linear system solver

《International Journal of Computer Mathematics》2012,89(3):227-238

A new factorisation method for the solution of a linear system is proposed. The method is similar to an LU type factorisation of the coefficient matrix A where the factors are interlocking matrix quadrants and can be applied on a single-instruction stream parallel machine.

6.

A FV-TD electromagnetic solver using adaptive Cartesian grids

Z.J. Wang A.J. Przekwas Yen Liu 《Computer Physics Communications》2002,148(1):17-29

A second-order finite-volume (FV) method has been developed to solve the time-domain (TD) Maxwell equations, which govern the dynamics of electromagnetic waves. The computational electromagnetic (CEM) solver is capable of handling arbitrary grids, including structured, unstructured, and adaptive Cartesian grids, which are topologically arbitrary. It is argued in this paper that the adaptive Cartesian grid is better than a tetrahedral grid for complex geometries considering both efficiency and accuracy. A cell-wise linear reconstruction scheme is employed to achieve second-order spatial accuracy. Second-order time accuracy is obtained through a two-step Runge-Kutta scheme. Issues on automatic adaptive Cartesian grid generation such as cell-cutting and cell-merging are discussed. A multi-dimensional characteristic absorbing boundary condition (MDC-ABC) is developed at the truncated far-field boundary to reduce reflected waves from this artificial boundary. The CEM solver is demonstrated with several test cases with analytical solutions.

7.

A cost-optimal parallel tridiagonal system solver

Ferng-Ching Lin Kuo-Liang Chung 《Parallel Computing》1990,15(1-3):189-199

We first show how to transform the solution of an n × n tridiagonal system into suffix computations of continued fractions. Then a parallel substitution scheme is introduced to compute the suffix values. The derived parallel algorithm allows the tridiagonal system to be solved in O(log n) time on an unshuffle network with Θ(n / log n) processors. It is cost-optimal in the sense that the product of processor count and execution time is minimized. Our solver is conceptually simple and easy to implement.
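As a quick check of the cost-optimality claim, the processor count and running time quoted in the abstract can be multiplied out. The symbols p(n), T(n) and C(n) are introduced here for illustration only. The comparison with the linear cost of sequential tridiagonal elimination is standard background, not a statement taken from the abstract.

```latex
C(n) \;=\; p(n)\cdot T(n)
     \;=\; \Theta\!\left(\frac{n}{\log n}\right)\cdot O(\log n)
     \;=\; O(n)
```

Since any solver must at least read the Θ(n) nonzero coefficients, a processor-time product of O(n) cannot be improved asymptotically.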

8.

A parallel first-order linear recurrence solver

《Journal of Parallel and Distributed Computing》1987,4(2):117-132


9.

A massively parallel Eikonal solver on unstructured meshes

Ganellari Daniel Haase Gundolf Zumbusch Gerhard 《Computing and Visualization in Science》2018,19(5-6):3-18

Algorithms for the numerical solution of the Eikonal equation discretized with tetrahedra are discussed. Several massively parallel algorithms for GPU computing are developed. This includes domain decomposition concepts for tracking the moving wave fronts in sub-domains and over the sub-domain boundaries. Furthermore a low memory footprint implementation of the solver is introduced which reduces the number of arithmetic operations and enables improved memory access schemes. The numerical tests for different meshes originating from the geometry of a human heart document the decreased runtime of the new algorithms.


10.

A block LU-SGS implicit unsteady incompressible flow solver on hybrid dynamic grids for 2D external bio-fluid simulations

L.P. Zhang X.H. Chang X.P. Duan Z.Y. Wang H.X. Zhang 《Computers & Fluids》2009,38(2):290-7114

A hybrid dynamic grid generation technique for two-dimensional (2D) morphing bodies and a block lower-upper symmetric Gauss-Seidel (BLU-SGS) implicit dual-time-stepping method for unsteady incompressible flows are presented for external bio-fluid simulations. To discretize the complicated computational domain around 2D morphing configurations such as fishes and insect/bird wings, the initial grids are first generated by a hybrid grid strategy. Body-fitted quadrilateral (quad) grids are generated first near solid bodies. An adaptive Cartesian mesh is then generated to cover the entire computational domain. Cartesian cells which overlap the quad grids are removed from the computational domain, and a gap is produced between the quad grids and the adaptive Cartesian grid. Finally, triangular grids are used to fill this gap. During the unsteady movement of morphing bodies, the dynamic grids are generated by a coupling strategy of the interpolation method based on the ‘Delaunay graph’ and a local remeshing technique. With the motion of moving/morphing bodies, the grids are first deformed according to the motion of the morphing body boundaries with the interpolation strategy based on the ‘Delaunay graph’ proposed by Liu and Qin. Then the quality of the deformed grids is checked. If the grids become too skewed, or even intersect each other, they are regenerated locally. After the local remeshing, the flow solution is interpolated from the old to the new grid. Based on the hybrid dynamic grid technique, an efficient implicit finite volume solver is also set up to solve the unsteady incompressible flows for external bio-fluid dynamics. The fully implicit equation is solved using a dual-time-stepping approach, coupled with the artificial compressibility method (ACM) for incompressible flows. In order to accelerate the convergence in each sub-iteration, a block lower-upper symmetric Gauss-Seidel implicit method is also introduced into the solver.
The hybrid dynamic grid generator is tested on a group of cases of morphing bodies, while the implicit unsteady solver is validated on typical unsteady incompressible flow cases, and the results demonstrate the accuracy and efficiency of the present solver. Finally, some applications to fish swimming and insect wing flapping are carried out to demonstrate the capability for 2D external bio-fluid simulations.

11.

A parallel FPGA SAT solver based on an incomplete algorithm

Li Tiejun Ma Kefan Zhang Jianmin 《Computer Engineering & Science》2021,43(12):2126-2130

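The satisfiability problem is a core problem in the theory and application of computing. A parallel solver based on an incomplete algorithm, pprobSAT+, is proposed on an FPGA. A multi-threading strategy is used to reduce the waiting time of the relevant components, which improves the efficiency of the solver. In addition, the threads use a data storage structure that shares address and clause information, which reduces the resource overhead of the on-chip memory. The pprobSAT+ solver reaches its best performance when all data are stored in the on-chip memory of the FPGA. Experimental results show that, compared with a single-threaded solver, the proposed pprobSAT+ solver achieves a speedup of more than 2 times.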

12.

Accelerated convergence of the numerical simulation of incompressible flow in general curvilinear co-ordinates by discretizations on overset grids

A. Shklyar A. Arbel 《Mathematics and computers in simulation》2009

The convergence rate of a methodology for solving incompressible flow in general curvilinear co-ordinates is analyzed. Overset grids (of the double-staggered type), each defined by the same boundaries as the physical domain, are used for discretization. Both grids are Marker and Cell (MAC) quadrilateral meshes with scalar variables (pressure, temperature, etc.) arranged at the center and the Cartesian velocity components at the middle of the sides of the mesh. The problem was checked against benchmark solutions of natural convection in a squeezed cavity, heat transfer in concentric and eccentric horizontal cylindrical annuli, and a hot cylinder in a duct. Convergence properties of the Poisson pressure equations which arise from application of the SIMPLE-like procedure are analyzed by several methods: successive overrelaxation, symmetric successive overrelaxation, modified incomplete factorization, and conjugate gradient. A genetic algorithm was developed to solve problems of numerical optimization of calculation time, in a space of iteration numbers and relaxation factors. The application provides a means of making an unbiased comparison between the double-staggered grids method and the standard interpolation method. Furthermore, the convergence rate was demonstrated with the well-known calculation of natural convection heat transfer in concentric horizontal cylindrical annuli. Calculation times when double-staggered grids were used were 6–10 times shorter than those achieved by interpolation.

13.

A GPU-enabled Finite Volume solver for global magnetospheric simulations on unstructured grids

Andrea Lani Mehmet Sarp Yalim Stefaan Poedts 《Computer Physics Communications》2014


14.

Ship motions using single-phase level set with dynamic overset grids

Pablo M. Carrica Robert V. Wilson Ralph W. Noack Fred Stern 《Computers & Fluids》2007,36(9):1415-1433

The problem of surface ships free to pitch and heave in regular head waves is analyzed numerically with an unsteady Reynolds-averaged Navier-Stokes (URANS) approach. The unsteady single-phase level set method previously developed by the authors was extended to include six degrees of freedom (6DOF) motions. The method uses rigid overset grids that move with relative motion during the computation, and the interpolation coefficients between the grids are recomputed dynamically every time the grids move. The motions in each time step are integrated implicitly using a predictor-corrector approach. An earth-based reference system is used for the solution of the fluid flow, while a ship-based reference system is used to compute the rigid-body equations of motion. Predicted results for sinkage, trim and resistance at two Froude numbers (medium, Fr = 0.28, and large, Fr = 0.41) were compared against experimental data, showing good agreement. Pitch and heave motions were computed for near-resonant cases at Fr = 0.28 and 0.41, with regular linear head waves with slope ak = 0.025 and wavelength λ = 1.5L, with L the ship length. The predicted motions compare favorably with existing experimental data. A solution for a large amplitude head wave case (ak = 0.075) was also obtained, in which the transom wave breaks and extreme motions are observed. The medium Froude number case was subject to a verification and validation analysis. A problem with two ships pitching and heaving one behind the other is demonstrated.

15.

A tridiagonal solver for massively parallel computer systems

Fabio Reale 《Parallel Computing》1990,16(2-3):361-368

This paper describes a parallel solver of tridiagonal systems appropriate for distributed memory computers and implemented on an array of chain-connected T800 Transputers. Each processor in the chain uses the same program to solve its own subset of equations. This implementation is suited, for instance, for solving the heat conduction equation in one-dimensional hydrodynamic codes. The procedure performs a parallel cyclic reduction, a recursive Gaussian elimination on a reduced number of equations and a parallel backward unfolding scheme, with a direct substitution in the reduced equations. The code has been written in the Occam2 language. A one-way communication of values between adjacent processors is required at each cycle of both the reduction and the unfolding steps. Due to the number of floating point operations and the amount of communication, the implementation described here works efficiently on arrays with more than 4 processors and for more than 50 equations per processor.
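The reduce, solve, unfold pattern named in the abstract can be sketched with textbook cyclic reduction. This is a serial illustration under stated assumptions, not the Occam2 Transputer code: the function name `cyclic_reduction` and the coefficient layout (`a` sub-diagonal with `a[0] = 0`, `b` diagonal, `c` super-diagonal with `c[-1] = 0`, `d` right-hand side) are choices made here.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] by cyclic reduction.

    Assumes a[0] == 0, c[-1] == 0 and that no pivot b[i] becomes zero
    (for example, a diagonally dominant system).
    """
    n = len(d)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: eliminate the even-indexed unknowns from each odd equation.
    # Every odd row is updated independently of the other odd rows.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        new_b = b[i] - alpha * c[i - 1]
        new_c = 0.0
        new_d = d[i] - alpha * d[i - 1]
        if i + 1 < n:
            gamma = c[i] / b[i + 1]
            new_b -= gamma * a[i + 1]
            new_c = -gamma * c[i + 1]
            new_d -= gamma * d[i + 1]
        ra.append(-alpha * a[i - 1])
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)

    # Solve the half-size tridiagonal system for the odd-indexed unknowns.
    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Unfolding: substitute back to recover the even-indexed unknowns.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


# Stand-in test system tridiag(-1, 2, -1) whose exact solution is [1, 2, 3, 4, 5].
x = cyclic_reduction([0.0, -1.0, -1.0, -1.0, -1.0],
                     [2.0, 2.0, 2.0, 2.0, 2.0],
                     [-1.0, -1.0, -1.0, -1.0, 0.0],
                     [0.0, 0.0, 0.0, 0.0, 6.0])
```

Each odd row touches only its two neighbours during reduction, and each even row only its two neighbours during unfolding, which is consistent with the neighbour-to-neighbour communication the abstract describes.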

16.

A parallel Navier-Stokes solver: The Meiko implementation

Michael Prestin Leonid Shtilman 《The Journal of supercomputing》1995,9(4):347-364

A mixed spectral element, pseudospectral, and finite-difference scheme for solving the Navier-Stokes equations is implemented on a Meiko parallel supercomputer. The code for the solution of Navier-Stokes equations for jetlike flows is implemented with a spectral scheme in cross-flow directions, a spectral element scheme in the stream-wise direction, and finite-difference marching in time. Several strategies for distributing the workload onto the processors are discussed. Special attention is paid to using the flexible topology of the Meiko.

17.

A parallel solver for huge dense linear systems

J.M. Badia J.L. Movilla J.I. Climente M. Castillo M. Marqués R. Mayo E.S. Quintana-Ortí J. Planelles 《Computer Physics Communications》2011,182(11):2441-2442

HDSS (Huge Dense Linear System Solver) is a Fortran Application Programming Interface (API) that facilitates the parallel solution of very large dense systems for scientists and engineers. The API makes use of parallelism to yield an efficient solution of the systems on a wide range of parallel platforms, from clusters of processors to massively parallel multiprocessors. It exploits out-of-core strategies to leverage the secondary memory in order to solve huge linear systems of order O(100 000). The API is based on the parallel linear algebra library PLAPACK, and on its Out-Of-Core (OOC) extension POOCLAPACK. Both PLAPACK and POOCLAPACK use the Message Passing Interface (MPI) as the communication layer and BLAS to perform the local matrix operations. The API provides a friendly interface to the users, hiding almost all the technical aspects related to the parallel execution of the code and the use of the secondary memory to solve the systems. In particular, the API can automatically select the best way to store and solve the systems, depending on the dimension of the system, the number of processes and the main memory of the platform. Experimental results on several parallel platforms report high performance, reaching more than 1 TFLOP with 64 cores to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors.

New version program summary

Program title: Huge Dense System Solver (HDSS)
Catalogue identifier: AEHU_v1_1
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHU_v1_1.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 87 062
No. of bytes in distributed program, including test data, etc.: 1 069 110
Distribution format: tar.gz
Programming language: Fortran90, C
Computer: Parallel architectures: multiprocessors, computer clusters
Operating system: Linux/Unix
Has the code been vectorized or parallelized?: Yes, includes MPI primitives.
RAM: Tested for up to 190 GB
Classification: 6.5
External routines: MPI (http://www.mpi-forum.org/), BLAS (http://www.netlib.org/blas/), PLAPACK (http://www.cs.utexas.edu/~plapack/), POOCLAPACK (ftp://ftp.cs.utexas.edu/pub/rvdg/PLAPACK/pooclapack.ps) (code for PLAPACK and POOCLAPACK is included in the distribution).
Catalogue identifier of previous version: AEHU_v1_0
Journal reference of previous version: Comput. Phys. Comm. 182 (2011) 533
Does the new version supersede the previous version?: Yes
Nature of problem: Huge scale dense systems of linear equations, Ax=B, beyond standard LAPACK capabilities.
Solution method: The linear systems are solved by means of parallelized routines based on the LU factorization, using efficient secondary storage algorithms when the available main memory is insufficient.
Reasons for new version: In many applications we need to guarantee a high accuracy in the solution of very large linear systems and we can do it by using double-precision arithmetic.
Summary of revisions: Version 1.1
• Can be used to solve linear systems using double-precision arithmetic.
• New version of the initialization routine. The user can choose the kind of arithmetic and the values of several parameters of the environment.

Running time: About 5 hours to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors using double-precision arithmetic on an eight-node commodity cluster with a total of 64 Intel cores.

18.

Rapid meshing of turbomachinery rows using semi-unstructured multi-block conformal grids

Manuel A. Burgos Juan M. Chia Roque Corral Carlos López 《Engineering with Computers》2010,26(4):351-362

A semi-unstructured grid generation method especially tailored for the meshing of turbomachinery blade passages and their associated cavities is presented. The method is based on a smart combination of quasi-3D methods, an ad hoc block decomposition of the domain and a grid-based solid-model-less reconstruction of the computational domain.

19.

Calculating viscous incompressible flow past an elliptical cylinder on a parallel computer

N. A. Bik I. N. Molchanov M. F. Yakovlev 《Cybernetics and Systems Analysis》1990,26(1):78-84

An algorithm developed for a parallel computer is described. The algorithm has been simulated on an ES-1060 machine, and the results are compared with those obtained on a BÉSM-6 computer. Translated from Kibernetika, No. 1, pp. 64–68, January–February, 1990.

20.

A parallel algorithm for solving the three-dimensional Poisson equation based on the sine transform

Lin Shiwei Zhang Weimin Fang Minquan Li Song 《Computer Engineering & Science》2017,39(8):1419-1424

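Numerical methods for the Poisson equation are widely used in many physics and engineering problems. However, because most discretization schemes for the three-dimensional Poisson equation have no obvious parallelism, global iteration is used in practice, which limits computational efficiency and stability. Abandoning the global iteration of traditional numerical methods, this work combines discrete sine transform (DST) theory with a 27-point fourth-order difference scheme, modifies and parallelizes the three-dimensional Poisson solver at the algorithm level, and transforms the whole problem into multiple independent subproblems, which greatly improves stability and parallel performance. For a given discretization, the same set of parameters can be used to solve different Poisson equations, which greatly improves programming efficiency. The algorithm is implemented on a shared-memory parallel model. Experimental results show that, for the given examples, the new algorithm achieves a good speedup; the error of the computed results is about 10e-5, which is within the acceptable range, and the accuracy improves somewhat as the dimension increases.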
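How a sine transform turns one coupled system into independent problems can be seen in a much smaller setting. The sketch below is an assumption-laden stand-in, not the paper's method: it solves the 1D problem -u'' = f with zero boundary values using the second-order three-point stencil rather than the 27-point fourth-order 3D scheme, and it uses a naive O(n²) transform. The names `dst1` and `poisson_1d_dst` are hypothetical.

```python
import math


def dst1(v):
    """Naive type-I discrete sine transform (O(n^2), for illustration only)."""
    n = len(v)
    return [sum(v[j] * math.sin(math.pi * (j + 1) * (k + 1) / (n + 1))
                for j in range(n))
            for k in range(n)]


def poisson_1d_dst(f, h):
    """Solve tridiag(-1, 2, -1) u = h^2 f, the 3-point discretization of -u'' = f."""
    n = len(f)
    f_hat = dst1(f)
    # In sine space the stencil is diagonal, so each mode k is a separate
    # scalar equation that could be handled by a different thread.
    u_hat = [h * h * f_hat[k] / (2.0 - 2.0 * math.cos(math.pi * (k + 1) / (n + 1)))
             for k in range(n)]
    # The type-I DST is its own inverse up to the factor 2 / (n + 1).
    scale = 2.0 / (n + 1)
    return [scale * val for val in dst1(u_hat)]


# Stand-in test problem: f = pi^2 sin(pi x) has the exact solution u = sin(pi x).
n = 31
h = 1.0 / (n + 1)
grid = [(j + 1) * h for j in range(n)]
u = poisson_1d_dst([math.pi ** 2 * math.sin(math.pi * xj) for xj in grid], h)
max_error = max(abs(uj - math.sin(math.pi * xj)) for uj, xj in zip(u, grid))
```

No iteration is involved: the transform, the per-mode division and the inverse transform each consist of independent pieces of work, which is the kind of structure a shared-memory implementation can exploit.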