期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A refinement-tree based partitioning method for dynamic load balancing with adaptively refined grids

William F. Mitchell 《Journal of Parallel and Distributed Computing》2007

The partitioning of an adaptive grid for distribution over parallel processors is considered in the context of adaptive multilevel methods for solving partial differential equations. A partitioning method based on the refinement-tree is presented. This method applies to most types of grids in two and three dimensions. For triangular and tetrahedral grids, it is guaranteed to produce connected partitions; no other partitioning method makes this guarantee. The method is related to the OCTREE method and space filling curves. Numerical results comparing it with several popular partitioning methods show that it computes partitions in an amount of time similar to fast load balancing methods like recursive coordinate bisection, and with mesh quality similar to slower, more optimal methods like the multilevel diffusive method in ParMETIS. 相似文献

2.

Parallel generation of triangular and quadrilateral meshes 总被引：2，自引：0，他引：2

L. Lmmer M. Burghardt 《Advances in Engineering Software》2000,31(12):929-936

相似文献

3.

A hybrid ant colony optimization approach for finite element mesh decomposition

A.?Bahreininejad Email author 《Structural and Multidisciplinary Optimization》2004,28(5):307-316

This paper examines the application of the ant colony optimization algorithm to the partitioning of unstructured adaptive meshes for parallel explicit time-stepping finite element analysis. The concept of the ant colony optimization technique for finding approximate solutions to combinatorial optimization problems is described.The application of ant colony optimization for partitioning finite element meshes based on triangular elements is described.A recursive greedy algorithm optimization method is also presented as a local optimization technique to improve the quality of the solutions given by the ant colony optimization algorithm. The partitioning is based on the recursive bisection approach.The mesh decomposition is carried out using normal and predictive modes for which the predictive mode uses a trained multilayered feed-forward neural network which estimates the number of triangular elements that will be generated after finite elements mesh generation is carried out.The performance of the proposed hybrid approach for the recursive bisection of finite element meshes is examined by decomposing two mesh examples. 相似文献

4.

Subdomain generation for non-convex parallel finite element domains

B. H. V. Topping A. I. Khan 《Advances in Engineering Software》1996,25(2-3):253-266

In this paper the Subdomain Generation Method (SGM), originally formulated in Khan & Topping (1993; Khan, A. I. & Topping, B. H. V., Subdomain generation for parallel finite element analysis. Comput. Syst. Engng, 1993, 4(4/6), 473–488) for convex finite element domains, is generalized for arbitrary shaped domains. Modifications to the original SGM are described which allow partitioning of non-convex domains. These modifications have been made to the formulation of the optimization module and the predictive module. The examples presented in Khan & Topping (1993) have been re-worked and two more examples have been added which demonstrate the application of the method to arbitrary shaped domains. It is shown with the aid of the examples that the method provides well-balanced subdomains very efficiently and allows parallel adaptive mesh generation. The method in its present form may be used to partition unstructured graphs in two or three dimensions. Since the computational cost for the mesh partitioning with this method depends solely upon the initial coarse mesh, hence the computational cost does not increase with the increase in the mesh density of the final mesh. The method in its present form is unsuitable for relatively coarse grained parallel computers, however the modifications which would impart a greater degree of scalability to this method are discussed. 相似文献

5.

Adaptive Local Refinement with Octree Load Balancing for the Parallel Solution of Three-Dimensional Conservation Laws

J.E. Flaherty R.M. Loy M.S. Shephard B.K. Szymanski J.D. Teresco L.H. Ziantz 《Journal of Parallel and Distributed Computing》1997,47(2):717

Conservation laws are solved by a local Galerkin finite element procedure with adaptive space-time mesh refinement and explicit time integration. The Courant stability condition is used to select smaller time steps on smaller elements of the mesh, thereby greatly increasing efficiency relative to methods having a single global time step. Processor load imbalances, introduced at adaptive enrichment steps, are corrected by using traversals of an octree representing a spatial decomposition of the domain. To accommodate the variable time steps, octree partitioning is extended to use weights derived from element size. Partition boundary smoothing reduces the communications volume of partitioning procedures for a modest cost. Computational results comparing parallel octree and inertial partitioning procedures are presented for the three-dimensional Euler equations of compressible flow solved on an IBM SP2 computer. 相似文献

6.

基于多层k路划分的三维网格并行任务分配策略

于方郑晓薇孙晓鹏《计算机工程与设计》2010,31(2)

为解决传统任务划分方法在三维网格并行计算任务分配阶段产生的通信开销大的问题,提出了一种基于多层k路划分算法的并行任务分配策略.首先利用多层k路划分算法划分三维网格,将任务划分问题转化为图划分问题,然后基于图划分结果给出一个任务映射并行算法将计算任务分配到各计算结点.在深腾1800上求解三维网格模型最短路径问题的实验结果表明,相比于传统的行列划分任务分配策略,该策略在保证负裁平衡的同时有效地降低了通信开销,算法的运行时间减少,加速比得到提高. 相似文献

7.

A parallel FE–FV scheme to solve fluid flow in complex geologic media

Dim Coumou Stephan Matthäi Sebastian Geiger Thomas Driesner 《Computers & Geosciences》2008,34(12):1697-1707

相似文献

8.

Efficient mapping and implementation of matrix algorithms on a hypercube

Vladimir Cherkassky Ross Smith 《The Journal of supercomputing》1988,2(1):7-27

It is well known that parallelism by itself does not lead to higher speeds. This study shows how to put parallelism to best use, that is, how to find an optimal balance between communication and computation overheads for two parallel matrix algorithms. The problem graph for matrix algorithms analyzed in this paper is a two-dimensional grid (toroidal mesh) which is mapped onto a hypercube topology. To perform matrix operations on a hypercube, a matrix is partitioned into several submatrices which are stored and manipulated in the nodes. We seek to find an optimal matrix partitioning to minimize overall execution time. The NCUBE parallel machine is used for experimental performance evaluation. For matrix multiplication, we derive an exact analytical model to determine the optimal partitioning size and perform its experimental verification on the NCUBE parallel processor. For a parallel Gaussian elimination known as the balanced algorithm, we present performance measurements and an approximate analytical model for performance evaluation. Our analyses show that the optimal submatrix size is typically small and does not depend on the original matrix size. 相似文献

9.

Subdomain generation using emergent ant colony optimization 总被引：1，自引：0，他引：1

A. Bahreininejad P. Hesamfar 《Computers & Structures》2006,84(28):1719-1728

Finite elements mesh decomposition is a well known optimization problem and is used to split a computationally expensive finite elements mesh into smaller subdomains for parallel finite elements analysis.The ant colony optimization is a type of algorithm that seeks to model the emergent behaviour observed in ant colonies and utilize this behaviour to solve combinatorial problems. This technique has been applied to several problems, most of which are graph related because the ant colony metaphor can be most easily applied to such types of problems. This paper examines the application of ant colony optimization algorithm to the partitioning of unstructured adaptive meshes for parallel explicit time-stepping finite elements analysis.The concept of ant colony optimization technique in addition to the notion of swarm intelligence for finding approximate solutions to combinatorial optimization problems is described. This algorithm combines the features of the classical ant colony optimization technique with swarm intelligence to form a model which is an artificial system designed to perform a certain task.The application of the ant colony optimization for partitioning finite elements meshes based on triangular elements using the swarm intelligence concept is described. A recursive greedy algorithm optimization method is also presented as a local optimization technique to improve the quality of the solutions given by the ant colony optimization algorithm. The partitioning is based on the recursive bisection approach.The mesh partitioning is carried out using normal and predictive modes for which the predictive mode uses a trained multi-layered feedforward neural network that estimates the number of triangular elements that will be generated after finite elements mesh generation is carried out.The performance of the proposed hybrid approach for the recursive bisection of finite elements meshes is examined by decomposing two mesh examples and comparing them with a well known finite elements domain decomposer. 相似文献

10.

An Adaptive Moving Mesh Finite Element Solution of the Regularized Long Wave Equation

Changna Lu Weizhang Huang Jianxian Qiu 《Journal of scientific computing》2018,74(1):122-144

An adaptive moving mesh finite element method is proposed for the numerical solution of the regularized long wave (RLW) equation. A moving mesh strategy based on the so-called moving mesh PDE is used to adaptively move the mesh to improve computational accuracy and efficiency. The RLW equation represents a class of partial differential equations containing spatial-time mixed derivatives. For the numerical solution of those equations, a \(C^0\) finite element method cannot apply directly on a moving mesh since the mixed derivatives of the finite element approximation may not be defined. To avoid this difficulty, a new variable is introduced and the RLW equation is rewritten into a system of two coupled equations. The system is then discretized using linear finite elements in space and the fifth-order Radau IIA scheme in time. A range of numerical examples in one and two dimensions, including the RLW equation with one or two solitary waves and special initial conditions that lead to the undular bore and solitary train solutions, are presented. Numerical results demonstrate that the method has a second order convergence and is able to move and adapt the mesh to the evolving features in the solution. 相似文献

11.

A Prefix Code Matching Parallel Load-Balancing Method for Solution-Adaptive Unstructured Finite Element Graphs on Distributed Memory Multicomputers

Chung Yeh-Ching Liao Ching-Jung Yang Don-Lin 《The Journal of supercomputing》2000,15(1):25-49

In this paper, we propose a prefix code matching parallel load-balancing method (PCMPLB) to efficiently deal with the load imbalance of solution-adaptive finite element application programs on distributed memory multicomputers. The main idea of the PCMPLB method is first to construct a prefix code tree for processors. Based on the prefix code tree, a schedule for performing load transfer among processors can be determined by concurrently and recursively dividing the tree into two subtrees and finding a maximum matching for processors in the two subtrees until the leaves of the prefix code tree are reached. We have implemented the PCMPLB method on an SP2 parallel machine and compared its performance with two load-balancing methods, the directed diffusion method and the multilevel diffusion method, and five mapping methods, the AE/ORB method, the AE/MC method, the MLkP method, the PARTY library method, and the JOSTLE-MS method. An unstructured finite element graph Truss was used as a test sample. During the execution, Truss was refined five times. Three criteria, the execution time of mapping/load-balancing methods, the execution time of an application program under different mapping/load-balancing methods, and the speedups achieved by mapping/load-balancing methods for an application program, are used for the performance evaluation. The experimental results show that (1) if a mapping method is used for the initial partitioning and this mapping method or a load-balancing method is used in each refinement, the execution time of an application program under a load-balancing method is less than that of the mapping method. (2) The execution time of an application program under the PCMPLB method is less than that of the directed diffusion method and the multilevel diffusion method. 相似文献

12.

Parallel processing, neural networks and genetic algorithms 总被引：4，自引：0，他引：4

B.H.V. Topping J. Sziveri A. Bahreinejad J.P.B. Leite B. Cheng 《Advances in Engineering Software》1998,29(10):763-786

In an earlier paper[1] some recent developments in computational technology to structural engineering were described. The developments included: parallel and distributed computing; neural networks; and genetic algorithms. In this paper, the authors concentrate on parallel implementations of neural networks and genetic algorithms. In the final section of the paper the authors show how a parallel finite element analysis may be undertaken in an efficient manner by preprocessing of the finite element model using a genetic algorithm utilizing a neural network predictor. This preprocessing is the partitioning of the finite element mesh into sub-domains to ensure load balancing and minimum interprocessor communication during the parallel finite element analysis on a MIMD distributed memory computer. © 1998 Published by Elsevier Science Limited. All rights reserved. 相似文献

13.

A parallel linear system solver

《国际计算机数学杂志》2012,89(3):227-238

A new factorisation method for the solution of a linear system is proposed. The method is similar to an LU type factorisation of the coefficient matrix A where the factors are interlocking matrix quadrants and can be applied on a single-instruction stream parallel machine. 相似文献

14.

A parallel polynomial Jacobi–Davidson approach for dissipative acoustic eigenvalue problems

Tsung-Ming Huang Feng-Nan Hwang Sheng-Hong Lai Weichung Wang Zih-Hao Wei 《Computers & Fluids》2011,45(1):207-214

We consider a rational algebraic large sparse eigenvalue problem arising in the discretization of the finite element method for the dissipative acoustic model in the pressure formulation. The presence of nonlinearity due to the frequency-dependent impedance poses a challenge in developing an efficient numerical algorithm for solving such eigenvalue problems. In this article, we reformulate the rational eigenvalue problem as a cubic eigenvalue problem and then solve the resulting cubic eigenvalue problem by a parallel restricted additive Schwarz preconditioned Jacobi–Davidson algorithm (ASPJD). To validate the ASPJD-based eigensolver, we numerically demonstrate the optimal convergence rate of our discretization scheme and show that ASPJD converges successfully to all target eigenvalues. The extraneous root introduced by the problem reformulation does not cause any observed side effect that produces an undesirable oscillatory convergence behavior. By performing intensive numerical experiments, we identify an efficient correction-equation solver, an effective algorithmic parameter setting, and an optimal mesh partitioning. Furthermore, the numerical results suggest that the ASPJD-based eigensolver with an optimal mesh partitioning results in superlinear scalability on a distributed and parallel computing cluster scaling up to 192 processors. 相似文献

15.

非结构网格布点方法研究进展 总被引：2，自引：0，他引：2

聂玉峰刘莹《计算机工程与应用》2008,44(32):35-40

用有限元方法求解偏微分方程初边值问题首先要离散求解区域,即网格生成,并且网格质量的好坏直接影响着有限元解的收敛性和精度,所以关于网格生成有很多学者从各自的领域出发做了大量的研究工作。论文关注于非结构网格点的布置方法,对已有的具有代表性的布点方法的研究进展进行了分类综述。相似文献

16.

全局部分重复计算划分 总被引：1，自引：0，他引：1

王轶然陈莉冯晓兵张兆庆《计算机研究与发展》2006,43(12):2158-2165

并行化编译器常常采用拥有者计算规则来进行计算划分，为了提高性能和可扩展性，后来引入了部分重复计算划分的概念．这是一种针对并行程序节点间局部性的重要优化方法．以前的部分重复计算划分局限于一个循环套的范围，因此新提出了全局部分重复计算划分的问题，给出一个简化的性能模型和一个基于整数线性规划的全局部分重复计算划分框架．实验结果表明，其结果显著优于局限于单个循环套的部分重复计算划分，比以前提出的启发式方法有更好的适应性．相似文献

17.

时间-空间混和有限元 总被引：1，自引：0，他引：1

下载免费PDF全文

钟万勰高强《动力学与控制学报》2007,5(1):1-7

分析动力学与分析结构力学在数学理论上是一致的.振动与结构力学问题,其实只是一个符号之差.分析力学方法对两方面可通用.双曲型偏微分方程与椭圆型偏微分方程也是差一个符号.虽然性质不同,但分析上有共同之处.本文提出在有限元分析方面,不用对时间、空间分别离散而是组成混和的时空混和有限元网格.数值结果表明,时空混和有限元是有前途的. 相似文献

18.

Massively parallel adaptive mesh refinement and coarsening for dynamic fracture simulations

Andrei Alhadeff Sofie E. Leon Waldemar Celes Glaucio H. Paulino 《Engineering with Computers》2016,32(3):533-552

We use the graphical processing unit (GPU) to perform dynamic fracture simulation using adaptively refined and coarsened finite elements and the inter-element cohesive zone model. Due to the limited memory available on the GPU, we created a specialized data structure for efficient representation of the evolving mesh given. To achieve maximum efficiency, we perform finite element calculation on a nodal basis (i.e., by launching one thread per node and collecting contributions from neighboring elements) rather than by launching threads per element, which requires expensive graph coloring schemes to avoid concurrency issues. These developments made possible the parallel adaptive mesh refinement and coarsening schemes to systematically change the topology of the mesh. We investigate aspects of the parallel implementation through microbranching examples, which has been explored experimentally and numerically in the literature. First, we use a reduced-scale version of the experimental specimen to demonstrate the impact of variation in floating point operations on the final fracture pattern. Interestingly, the parallel approach adds some randomness into the finite element simulation on the structured mesh in a similar way as would be expected from a random mesh. Next, we take advantage of the speedup of the implementation over a similar serial implementation to simulate a specimen whose size matches that of the actual experiment. At this scale, we are able to make more direct comparisons to the original experiment and find excellent agreement with those results. 相似文献

19.

并行网络蠕虫模拟中任务优化划分的研究 总被引：1，自引：0，他引：1

王晓锋方滨兴云晓春张宏莉《计算机学报》2006,29(8):1367-1374

为提高并行网络蠕虫模拟的性能，需要对蠕虫模拟任务进行合理的划分．鉴于基于图划分工具的任务划分方法存在的不足，提出了并行网络蠕虫模拟任务的优化划分方法：以并行网络蠕虫模拟运行时间估计模型作为优化目标函数，采用改进的模拟退火算法实现对蠕虫模拟任务的划分．在PDNS上进行的Slammer蠕虫传播模拟实验表明，该优化划分方法较基于图划分工具的方法提高模拟性能20％以上．相似文献

20.

Development of a parallel Poisson's equation solver with adaptive mesh refinement and its application in field emission prediction

K.-H. Hsu 《Computer Physics Communications》2006,174(12):948-960

A parallel electrostatic Poisson's equation solver coupled with parallel adaptive mesh refinement (PAMR) is developed in this paper. The three-dimensional Poisson's equation is discretized using the Galerkin finite element method using a tetrahedral mesh. The resulting matrix equation is then solved through the parallel conjugate gradient method using the non-overlapping subdomain-by-subdomain scheme. A PAMR module is coupled with this parallel Poisson's equation solver to adaptively refine the mesh where the variation of potentials is large. The parallel performance of the parallel Poisson's equation is studied by simulating the potential distribution of a CNT-based triode-type field emitter. Results with ∼100 000 nodes show that a parallel efficiency of 84.2% is achieved in 32 processors of a PC-cluster system. The field emission properties of a single CNT triode- and tetrode-type field emitter in a periodic cell are computed to demonstrate their potential application in field emission prediction. 相似文献