1.

Explicit preconditioned domain decomposition schemes for solving nonlinear boundary value problems

《Computers & Mathematics with Applications》2003,45(1-3):263-272

A new class of inner-outer iterative procedures in conjunction with Picard-Newton methods, based on explicit preconditioning iterative methods for solving nonlinear systems, is presented. Explicit preconditioned iterative schemes, based on the explicit computation of a class of domain decomposition generalized approximate inverse matrix techniques, are presented for the efficient solution of nonlinear boundary value problems on multiprocessor systems. Applications of the new composite scheme to characteristic nonlinear boundary value problems are discussed and numerical results are given.

2.

A parallel finite element scheme for thermo-hydro-mechanical (THM) coupled problems in porous media

Wenqing Wang Georg Kosakowski Olaf Kolditz 《Computers & Geosciences》2009,35(8):1631-1641

Many applied problems in geoscience require knowledge about complex interactions between multiple physical and chemical processes in the sub-surface. As a direct experimental investigation is often not possible, numerical simulation is a common approach. The numerical analysis of coupled thermo-hydro-mechanical (THM) problems is computationally very expensive, and therefore the applicability of existing codes is still limited to simplified problems. In this paper we present a novel implementation of a parallel finite element method (FEM) for the numerical analysis of coupled THM problems in porous media. The computational task of the FEM is partitioned into sub-tasks by a priori domain decomposition. The sub-tasks are assigned to the CPU nodes concurrently. Parallelization is achieved by simultaneously establishing the sub-domain mesh topology, synchronously assembling linear equation systems in sub-domains and obtaining the overall solution with a sub-domain linear solver (parallel BiCGStab method with Jacobi pre-conditioner). The present parallelization method is implemented in an object-oriented way using MPI for inter-processor communication. The parallel code was successfully tested with a 2-D example from the international DECOVALEX benchmarking project. The achieved speed-up for a 3-D extension of the test example on different computers demonstrates the advantage of the present parallel scheme.
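The solver named in this abstract, BiCGStab with a Jacobi preconditioner, can be illustrated with a minimal serial sketch. This is not the paper's parallel MPI implementation: the function names (`bicgstab_jacobi`, `matvec`, `dot`), the dense list-of-lists storage, and the small test matrix are all assumptions made for illustration.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bicgstab_jacobi(A, b, tol=1e-10, max_iter=200):
    """Right-preconditioned BiCGStab with M = diag(A) (Jacobi), serial sketch."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # applying M^-1 is a scaling
    x = [0.0] * n
    r = list(b)          # residual b - A x for the zero initial guess
    r_hat = list(r)      # fixed shadow residual
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    for _ in range(max_iter):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        y = [d * pi for d, pi in zip(inv_diag, p)]   # y = M^-1 p
        v = matvec(A, y)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if dot(s, s) ** 0.5 <= tol * b_norm:
            return [xi + alpha * yi for xi, yi in zip(x, y)]
        z = [d * si for d, si in zip(inv_diag, s)]   # z = M^-1 s
        t = matvec(A, z)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * yi + omega * zi for xi, yi, zi in zip(x, y, z)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x
        rho = rho_new
    return x


# Hypothetical nonsymmetric, diagonally dominant test system with solution (1, ..., 1).
n = 5
A = [[4.0 if i == j else -1.0 if j == i + 1 else -2.0 if j == i - 1 else 0.0
      for j in range(n)] for i in range(n)]
b = matvec(A, [1.0] * n)
x = bicgstab_jacobi(A, b)
```

In a domain-decomposed setting the matrix-vector products and dot products are the steps that need inter-process communication, while the Jacobi scaling is purely local; the sketch does not model that distribution.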

3.

A domain splitting algorithm for parabolic problems

H. Blum S. Lisky R. Rannacher 《Computing》1992,49(1):11-23

In the parallel implementation of solution methods for parabolic problems one has to find a proper balance between the parallel efficiency of a fully explicit scheme and the need for stability and accuracy which requires some degree of implicitness. As a compromise a domain splitting scheme is proposed which is locally implicit on slightly overlapping subdomains but propagates the corresponding boundary data by a simple explicit process. The analysis of this algorithm shows that it has satisfactory stability and approximation properties and can be effectively parallelized. These theoretical results are confirmed by numerical tests on a transputer system.

4.

Distributed generic approximate sparse inverses

George A. Gravvanis Christos K. Filelis-Papadopoulos 《The Journal of supercomputing》2014,70(1):365-384

The need for accuracy in the solution of linear systems derived from the discretization of partial differential equations leads to large sparse linear systems. The solution of sparse linear systems requires efficient scalable methods. Iterative solvers require efficient parallel preconditioning methods to solve sparse linear systems effectively. Herewith, a new parallel algorithm for the generic approximate sparse inverse matrix method for distributed memory systems is proposed. The computation of the distributed generic approximate sparse inverse matrix is based on a column-wise approach, which allows the separation into independent problems that can be handled in parallel without synchronization points or intermediate communications. This is achieved by reforming the generic approximate sparse inverse matrix algorithm and its process of computation with a new partial solution method for the computation of the nonzero elements of each column dictated by the approximate inverse sparsity pattern. Moreover, an algorithmic scheme is proposed for the efficient distribution of data amongst the available workstations, along with a load balancing scheme for problems with large standard deviation in the number of nonzero elements per column. Numerical results are presented for the proposed schemes for various model problems.

5.

Parallel solution of contact shape optimization problems based on Total FETI domain decomposition method

Vít Vondrák Tomáš Kozubek Alexandros Markopoulos Zdeněk Dostál 《Structural and Multidisciplinary Optimization》2010,42(6):955-964

An application of a variant of the parallel domain decomposition method that we call Total FETI or TFETI (Total Finite Element Tearing and Interconnecting) for the solution of contact problems of elasticity to the parallel solution of contact shape optimization problems is described. A unique feature of the TFETI algorithm is its capability to solve large contact problems with optimal, i.e., asymptotically linear complexity. We show that the algorithm is even more efficient for the solution of the contact shape optimization problems as it can exploit effectively a specific structure of the auxiliary problems arising in the semi-analytic sensitivity analysis. Thus the triangular factorizations of the stiffness matrices of the subdomains are carried out in parallel only once for each design step, the evaluation of the components of the gradient of the cost function can be carried out in parallel, and even the evaluation of each component of the gradient itself can be further parallelized using the standard TFETI scheme. Theoretical results which prove asymptotically linear complexity of the solution are reported and documented by numerical experiments. The results of numerical solution of a 3D contact shape optimization problem confirm the high degree of parallelism of the algorithm.

6.

Parallel Solutions for Large-Scale General Sparse Nonlinear Systems of Equations

HU Chengyi 《计算机科学技术学报》1996,11(3):257-271

In solving application problems, many large-scale nonlinear systems of equations result in sparse Jacobian matrices. Such nonlinear systems are called sparse nonlinear systems. The irregularity of the locations of nonzero elements of a general sparse matrix makes it very difficult to map sparse matrix computations to multiprocessors for parallel processing in a well-balanced manner. To overcome this difficulty, we define a new storage scheme for general sparse matrices in this paper. With the new storage scheme, we develop parallel algorithms to solve large-scale general sparse systems of equations by interval Newton/generalized bisection methods, which reliably find all numerical solutions within a given domain. In Section 1, we provide an introduction to the addressed problem and the interval Newton methods. In Section 2, some currently used storage schemes for sparse systems are reviewed. In Section 3, new index schemes to store general sparse matrices are reported. In Section 4, we present a parallel algorithm to evaluate a general sparse Jacobian matrix. In Section 5, we present a parallel algorithm to solve the corresponding interval linear system by the all-row preconditioned scheme. Conclusions and future work are discussed in Section 6.
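For readers unfamiliar with sparse storage, the sketch below shows compressed sparse row (CSR), one widely used scheme. It is offered only as background: the abstract does not say which existing schemes the paper reviews, and this is not the new storage scheme the paper defines. The function names `to_csr` and `csr_matvec` are hypothetical.

```python
def to_csr(dense):
    """Convert a dense list-of-rows matrix to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)      # nonzero entries, stored row by row
                col_idx.append(j)     # column index of each stored entry
        row_ptr.append(len(values))   # where the next row starts in `values`
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x, touching only the stored nonzeros."""
    y = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        y.append(sum(values[k] * x[col_idx[k]] for k in range(start, end)))
    return y


dense = [[4, 0, 0],
         [0, 0, 5],
         [1, 0, 2]]
values, col_idx, row_ptr = to_csr(dense)
y = csr_matvec(values, col_idx, row_ptr, [1, 2, 3])
```

Because rows can hold very different numbers of nonzeros, a naive row-wise split of such a structure across processors can be unbalanced, which is the kind of difficulty the abstract describes.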

7.

A parallel Self Mesh-Adaptive N-body method based on approximate inverses

P. E. Kyziropoulos C. K. Filelis-Papadopoulos G. A. Gravvanis C. Efthymiopoulos 《The Journal of supercomputing》2017,73(12):5197-5220

A new parallel Self Mesh-Adaptive N-body method based on approximate inverses is proposed. The scheme is a three-dimensional Cartesian-based method that solves the Poisson equation directly in physical space, using modified multipole expansion formulas for the boundary conditions. Moreover, adaptive-mesh techniques are utilized to form a class of separate smaller N-body problems that can be solved in parallel and increase the total resolution of the system. The solution method is based on a multigrid method in conjunction with the symmetric factored approximate sparse inverse matrix as a smoother. The design of the parallel Self Mesh-Adaptive method, along with a discussion of implementation issues for shared memory computer systems, is presented. The new parallel method is evaluated through a series of benchmark simulations using N-body models of isolated galaxies or galaxies interacting with dwarf companions. Furthermore, numerical results on the performance and the speedups of the scheme are presented.

8.

Hybrid parallel multimethod hyperheuristic for mixed-integer dynamic optimization problems in computational systems biology

González Patricia Argüeso-Alejandro Pablo Penas David R. Pardo Xoan C. Saez-Rodriguez Julio Banga Julio R. Doallo Ramón 《The Journal of supercomputing》2019,75(7):3471-3498

This paper describes and assesses a parallel multimethod hyperheuristic for the solution of complex global optimization problems. In a multimethod hyperheuristic, different metaheuristics cooperate to outperform the results obtained by any of them in isolation. The results obtained show that the cooperation of individual parallel searches modifies the systemic properties of the hyperheuristic, achieving significant performance improvements versus the sequential and the non-cooperative parallel solutions. Here we present and evaluate a hybrid parallel scheme of the multimethod, using both message-passing (MPI) and shared memory (OpenMP) models. The hybrid parallelization achieves a better trade-off between performance and computational resources, through a compromise between diversity (number of islands) and intensity (number of threads per island). For the performance evaluation, we considered the general problem of reverse engineering nonlinear dynamic models in systems biology, which yields very large mixed-integer dynamic optimization problems. In particular, three very challenging problems from the domain of dynamic modeling of cell signaling were used as case studies. In addition, experiments have been carried out in a local cluster, a large supercomputer and a public cloud, to show the suitability of the proposed solution in different execution platforms.

9.

A near-wall strategy for buoyancy-affected turbulent flows using stabilized FEM with applications to indoor air flow simulation

《Computer Methods in Applied Mechanics and Engineering》2005,194(36-38):3797-3816

We consider the numerical simulation of buoyancy-affected, incompressible turbulent flows using a stabilized finite-element method. We present an approach which combines two domain decomposition methods (DDM). Firstly, we apply a DDM with full overlap for near-wall modelling, which can be interpreted as an improved wall-function concept. Secondly, a non-overlapping DDM of iteration-by-subdomains-type for the parallel solution of the linearized problems is employed. For this scheme, we demonstrate both the accuracy for a benchmark problem and the applicability to realistic indoor-air flow problems.

10.

Adjoint design sensitivity analysis of dynamic crack propagation using peridynamic theory

Min-Yeong Moon Jae-Hyun Kim Youn Doh Ha Seonho Cho 《Structural and Multidisciplinary Optimization》2015,51(3):585-598

Based on the peridynamics of the reformulated continuum theory, an adjoint design sensitivity analysis (DSA) method is developed for the solution of dynamic crack propagation problems using the explicit scheme of time integration. Non-shape DSA problems are considered for the dynamic crack propagation including the successive branching of cracks. The adjoint variable method is generally suitable for path-independent problems, but it is employed here for bond-based peridynamics since its path is readily available. Since both original and adjoint systems possess time-reversal symmetry, the trajectories of systems are symmetric about the u-axis. We take advantage of the time-reversal symmetry for the efficient and concurrent computation of original and adjoint systems. Also, to improve the numerical efficiency of large scale problems, a parallel computation scheme is employed using a binary space decomposition method. The accuracy of the analytical design sensitivity is verified by comparing it with the finite difference sensitivity. The finite difference method is sensitive to the size of the design perturbation and could result in inaccurate design sensitivity for peridynamics problems that are highly nonlinear with respect to the design. It is demonstrated that the peridynamic adjoint sensitivity involving history-dependent variables can be accurate only if the path of the adjoint response analysis is identical to that of the original response.

11.

Design and initial performance of a high-level unstructured mesh framework on heterogeneous parallel systems

G.R. Mudalige M.B. Giles J. Thiyagalingam I.Z. Reguly C. Bertolli P.H.J. Kelly A.E. Trefethen 《Parallel Computing》2013

OP2 is a high-level domain specific library framework for the solution of unstructured mesh-based applications. It utilizes source-to-source translation and compilation so that a single application code written using the OP2 API can be transformed into multiple parallel implementations for execution on a range of back-end hardware platforms. In this paper we present the design and performance of OP2’s recent developments facilitating code generation and execution on distributed memory heterogeneous systems. OP2 targets the solution of numerical problems based on static unstructured meshes. We discuss the main design issues in parallelizing this class of applications. These include handling data dependencies in accessing indirectly referenced data and design considerations in generating code for execution on a cluster of multi-threaded CPUs and GPUs. Two representative CFD applications, written using the OP2 framework, are utilized to provide a contrasting benchmarking and performance analysis study on a number of heterogeneous systems including a large scale Cray XE6 system and a large GPU cluster. A range of performance metrics are benchmarked including runtime, scalability, achieved compute and bandwidth performance, runtime bottlenecks and systems energy consumption. We demonstrate that an application written once at a high-level using OP2 is easily portable across a wide range of contrasting platforms and is capable of achieving near-optimal performance without the intervention of the domain application programmer.

12.

Using multiple levels of parallelism to enhance the performance of domain decomposition solvers

L. Giraud A. Haidar S. Pralet 《Parallel Computing》2010,36(5-6):285-296

Large-scale scientific simulations are nowadays fully integrated in many scientific and industrial applications. Many of these simulations rely on models based on PDEs that lead to the solution of huge linear or nonlinear systems of equations involving millions of unknowns. In that context, the use of large high performance computers in conjunction with advanced fully parallel and scalable numerical techniques is mandatory to efficiently tackle these problems. In this paper, we consider a parallel linear solver based on a domain decomposition approach. Its implementation naturally exploits two levels of parallelism, which offers the flexibility to combine the numerical and the parallel implementation scalabilities. The combination of the two levels of parallelism enables an optimal usage of the computing resource while preserving attractive numerical performance. Consequently, such a numerical technique appears as a promising candidate for intensive simulations on massively parallel platforms. The robustness and parallel numerical performance of the solver are investigated on large challenging linear systems arising from the finite element discretization in structural mechanics applications.

13.

A parallel method for the numerical solution of integro-differential equation with positive memory

《Computer Methods in Applied Mechanics and Engineering》2003,192(41-42):4641-4658

An efficient parallel numerical method is proposed for an integro-differential equation with positive memory. Instead of solving the equation by classical time-marching methods, which require massive storage of solutions of previous time steps in order to advance to the next time step, the Fourier–Laplace transformation in time is applied to obtain a set of complex-valued elliptic problems parameterized by points on a contour in the complex plane. Because the elliptic problem corresponding to one contour point is independent of those corresponding to other contour points, all elliptic problems can be solved in parallel essentially without data communications. Then the time domain solution can be obtained by the Fourier–Laplace inversion formula. An error analysis and the numerical implementation of this parallel method are presented.

14.

Direct dynamic analysis of shells of revolution using high-precision finite elements

Herman Suryoutomo Phillip L. Gould Prodyot K. Basu 《Computers & Structures》1977,7(3):425-433

For the transient dynamic analysis of structural systems, the direct numerical integration of the equations of motion may be regarded as an alternative to the mode superposition method for linear problems and a necessity for nonlinear problems. When compared to a modal superposition solution, the direct integration approach is attractive in that the eigenvalue problem is avoided. Depending on the amount of information required from the dynamic analysis, e.g. frequencies, frequencies and mode shapes, and/or a complete time history, a direct integration scheme may prove to be more efficient than a modal superposition solution for some linear problems as well. The purpose of this study is to develop and demonstrate a direct integration algorithm which is compatible with an existing high-precision rotational shell finite element. Excellent comparative efficiency for static problems was achieved with this element by the incorporation of the exact geometry, the utilization of high-order interpolation polynomials and, yet, the retention of only a minimum number of nodal variables in the global formulation. Likewise, accurate and efficient results for the free vibration analysis of rotational shells were facilitated by the inclusion of a consistent mass matrix and the utilization of a rationally justified kinematic condensation procedure. The approach to the direct integration stage is strongly tempered by the established characteristics of this element which enable a given shell to be modeled accurately in the spatial domain with a comparatively coarse discretization. The equations of motion for a shell of revolution under conservative loading are derived from Hamilton's variational principle and specialized for the discretization of a rotational shell into curved shell elements. Degrees of freedom in excess of those required to establish minimum (C⁰) continuity at the nodal circles are eliminated through kinematic condensation.
Some guidance as to the proper order of the polynomial approximations for a dynamic analysis is provided by earlier free vibration studies. Whereas the condensation is exact for static problems, it is only approximate for dynamic response and it was found that the accuracy of the eigenvalues obtained for the reduced problem decreases with increasing order of the condensed functions. This tendency is countered by the desirability of using sufficiently high-order interpolations so as to permit accurate stress computations, both at the nodal circles and between nodes, since a coarse discretization is necessary to realize maximum efficiency. It was found that cubic polynomials were generally satisfactory from both standpoints, except in localized regions of high stress gradients where quintic polynomials were employed. The finite element discretizations for the direct integration studies were selected on this basis. For the high-precision finite element at hand, the efficiencies achieved in the space domain are demonstrated by the ability to achieve precise solutions with relatively coarse discretization patterns. The resulting comparatively large elements are not subject to accurate representation by diagonal mass matrices so that an implicit, consistent mass approach is followed. Efficiency in the time domain as well rests on the successful modeling of rotational shells subject to dynamic loading using coarse discretizations in space and large increments in time. Computational efficiency and accuracy are demonstrated for various problems documented in the literature, including a shallow spherical cap subject to a step pulse and a hyperboloidal shell under a simulated dynamic wind pressure.

15.

A Parallel Boundary Value Technique for Singularly Perturbed Two-Point Boundary Value Problems

Vigo-aguiar J. Natesan S. 《The Journal of supercomputing》2004,27(2):195-206

A class of singularly perturbed two-point boundary-value problems (BVPs) for second-order ordinary differential equations (DEs) is considered here. To obtain numerical solutions to these problems, an iterative non-overlapping domain decomposition method is suggested. The BVPs are independent in each subdomain and one can use parallel computers to solve these BVPs. One of the characteristics of the method is that the number of processors available is a free parameter of the method. Practical experiments on a Silicon Graphics Origin 200 with 4 MIPS R10000 processors have been performed, showing the reliability and performance of the proposed parallel schemes. Error estimates for the solution and numerical examples are provided.

16.

A class decomposition approach for GA-based classifiers

《Engineering Applications of Artificial Intelligence》2005,18(3):271-278

Genetic algorithms (GAs) have been used as a conventional method for classifiers to evolve solutions adaptively for classification problems. In this paper, a new approach using class decomposition is proposed to improve the performance of GA-based classifiers. A classification problem is fully partitioned into several class modules in the output domain and each module is responsible for solving a fraction of the original problem. These modules are trained in parallel and independently, and the results obtained are integrated and evolved further for a final solution. A scheme based on Fisher's linear discriminant (FLD) computation is used to estimate the difficulty of separating two classes. Based on the FLD information derived, different integration approaches are proposed and their performance is compared. The experimental results on a benchmark data set show that class decomposition can achieve a higher classification rate than the normal GA and that FLD-based integration improves the classification accuracy further.
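The idea of using Fisher's criterion to gauge how hard two classes are to separate can be sketched on a single feature. This is a simplified illustration, not the paper's scheme: the one-dimensional setting, the function name `fisher_ratio`, and the sample values are assumptions. The ratio divides the squared distance between class means by the summed within-class scatter, so a larger value suggests an easier separation.

```python
def fisher_ratio(class_a, class_b):
    """One-dimensional Fisher criterion: (mean gap)^2 / within-class scatter."""
    mean_a = sum(class_a) / len(class_a)
    mean_b = sum(class_b) / len(class_b)
    scatter_a = sum((v - mean_a) ** 2 for v in class_a)
    scatter_b = sum((v - mean_b) ** 2 for v in class_b)
    # Assumes at least one class has nonzero spread; otherwise this divides by zero.
    return (mean_a - mean_b) ** 2 / (scatter_a + scatter_b)


well_separated = fisher_ratio([1, 2, 3], [7, 8, 9])   # means far apart
overlapping = fisher_ratio([1, 2, 3], [2, 3, 4])      # means close together
```

Here `well_separated` evaluates to 9.0 and `overlapping` to 0.25, so the second pair of classes would be ranked as the harder one to separate.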

17.

On the Easy Use of Scientific Computing Services for Large Scale Linear Algebra and Parallel Decision Making with the P-Grade Portal

Hrachya Astsatryan Vladimir Sahakyan Yuri Shoukouryan Michel Daydé Aurelie Hurault Ronan Guivarch Harutyun Terzyan Levon Hovhannisyan 《Journal of Grid Computing》2013,11(2):239-248

Scientific research is becoming increasingly dependent on the large-scale analysis of data using distributed computing infrastructures (Grid, cloud, GPU, etc.). Scientific computing (Petitet et al. 1999) aims at constructing mathematical models and numerical solution techniques for solving problems arising in science and engineering. In this paper, we describe the services of an integrated portal based on the P-Grade (Parallel Grid Run-time and Application Development Environment) portal (http://www.p-grade.hu) that enables the solution of large-scale linear systems of equations using direct solvers, eases the use of parallel block iterative algorithms and provides an interface for parallel decision making algorithms. The ultimate goal is to develop a single sign-on integrated multi-service environment providing easy access to different kinds of mathematical calculations and algorithms to be performed on hybrid distributed computing infrastructures combining the benefits of large clusters, Grid or cloud, when needed.

18.

Non-linear systems in the frequency domain: Energy transfer filters

S. A. Billings Zi-Qiang Lang 《International journal of control》2013,86(14):1066-1081

The analysis of non-linear systems in the frequency domain is studied and a new class of filters, called energy transfer filters, is introduced. While conventional linear filter design procedures are based on the principle of attenuating unwanted effects, the new energy transfer filter design concept exploits non-linearity to allow energy to be moved to new frequency locations. The ability to design non-linear filters that can move energy to designed locations in the frequency domain introduces new degrees of freedom into filter design and offers new solution possibilities to many filtering problems.

19.

Parallel and Systolic Solution of Normalized Explicit Approximate Inverse Preconditioning

Gravvanis G. A. Giannoutakis K. M. Bekakos M. P. Efremides O. B. 《The Journal of supercomputing》2004,30(2):77-96

A new class of normalized approximate inverse matrix techniques, based on the concept of sparse normalized approximate factorization procedures, is introduced for solving sparse linear systems derived from the finite difference discretization of partial differential equations. Normalized explicit preconditioned conjugate gradient type methods in conjunction with normalized approximate inverse matrix techniques are presented for the efficient solution of sparse linear systems. Theoretical results on the rate of convergence of the normalized explicit preconditioned conjugate gradient scheme and estimates of the required computational work are presented. Application of the newly proposed methods to two-dimensional initial/boundary value problems is discussed and numerical results are given. The parallel and systolic implementation of the dominant computational part is also investigated.

20.

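Multi-domain cloud access control scheme based on multi-authority attribute-based encryption

杨小东 杨苗苗 刘婷婷 王彩芬 《计算机工程与科学》2018,40(7):1192-1198

To address collusion attacks and multi-domain data sharing in multi-authority attribute-based encryption schemes, a multi-domain cloud access control scheme based on multi-authority attribute-based encryption is proposed. The central authority does not take part in generating users' private keys, which effectively prevents joint attacks by users and authorities. Using a linear secret sharing scheme and proxy re-encryption, the cloud server re-encrypts uploaded data files, enabling data sharing among single-domain and multi-domain users. The analysis shows that the new scheme performs well in user private key generation and file encryption/decryption, and that it is adaptively secure under the q-parallel BDHE assumption.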