Similar Literature
 Found 20 similar documents (search time: 41 ms)
1.
Multiscale analysis has become a successful remedy for complicated problems in which nonlinear behavior is linked with microscopic damage mechanisms. For efficient parallel multiscale analyses, hierarchical grouping algorithms (e.g., the two-level 'coarse-grained' method) have been suggested and proved superior to simple parallelization. Here, we expanded the two-level algorithm into a multilayered grouping parallel algorithm suitable for large-scale, multiple-level multiscale analyses. With practical large-scale applications, we demonstrated the superior performance of multilayered grouping over the coarse-grained method. Notably, we show that the unique data transfer rates of a symmetric multiprocessor cluster system can lead to a seemingly 'super-linear' speedup and that there appears to exist an optimal number of subgroups for three-tiered multiscale analysis. Copyright © 2014 John Wiley & Sons, Ltd.
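The core of such a scheme is the assignment of processor ranks to nested groups. Below is a minimal Python sketch of one way to compute a rank's position in a multilayered hierarchy; the fan-out values and function name are illustrative, not taken from the paper, and in an MPI code each layer of the result would become a color for a communicator split (e.g., mpi4py's Comm.Split).

```python
from math import prod

def multilayer_group_path(rank, fanouts):
    """Group index of `rank` at each layer of the hierarchy.

    fanouts[i] is the number of subgroups each layer-i group is split into;
    the total rank count is assumed to be prod(fanouts).
    """
    span = prod(fanouts)              # ranks spanned by the whole hierarchy
    path = []
    for f in fanouts:
        span //= f                    # ranks spanned by one group at this layer
        path.append((rank // span) % f)
    return path

# 24 ranks in a three-tiered hierarchy: 2 top groups x 3 subgroups x 4 ranks each
for r in range(24):
    assert multilayer_group_path(r, [2, 3, 4]) == [r // 12, (r // 4) % 3, r % 4]
```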

2.
An integrated framework and computational technology is described that addresses the issues involved in fostering absolute scalability (A-scalability) over the entire transient duration of simulations of implicit non-linear structural dynamics of large-scale practical applications on a large number of parallel processors. Whereas the theoretical developments and parallel formulations were presented in Part 1, the implementation, validation, and parallel performance assessments and results are presented here in Part 2 of the paper. Relatively simple numerical examples involving large deformation and elastic and elastoplastic non-linear dynamic behaviour are first presented via the proposed framework to demonstrate the comparative accuracy of methods against available experimental results and/or results available in the literature. For practical geometrically complex meshes, the A-scalability of non-linear implicit dynamic computations is then illustrated by employing scalable optimal dissipative zero-order displacement and velocity overshoot behaviour time operators, which are a subset of the generalized framework, in conjunction with numerically scalable spatial domain decomposition methods and scalable graph partitioning techniques. The constant run times of the entire simulation under 'fixed-memory-use-per-processor' scaling of complex finite element mesh geometries are demonstrated for large-scale problems and large processor counts, on at least 1024 processors. Copyright © 2003 John Wiley & Sons, Ltd.

3.
Fast algorithms for the computation of N-body problems can be broadly classified into mesh-based interpolation methods and hierarchical or multiresolution methods. To this latter class belongs the well-known fast multipole method (FMM), which offers O(N) complexity. The FMM is a complex algorithm, and the programming difficulty associated with it has arguably diminished its impact, being a barrier for adoption. This paper presents an extensible parallel library for N-body interactions utilizing the FMM algorithm. A prominent feature of this library is its extensible design, with a view to unifying efforts involving many algorithms based on the same principles as the FMM and enabling easy development of scientific application codes. The paper also details an exhaustive model for the computation of tree-based N-body algorithms in parallel, including both work estimates and communication estimates. With this model, we are able to implement a method that provides automatic, a priori load balancing of the parallel execution, achieving optimal distribution of the computational work among processors and minimal inter-processor communication. Using a client application that performs the calculation of velocity induced by N vortex particles in two dimensions, ample verification and testing of the library were performed. Strong scaling results are presented with 10 million particles on up to 256 processors, including both speedup and parallel efficiency. The largest problem size run with the PetFMM library to date was 64 million particles on 64 processors. The library currently achieves over 85% parallel efficiency for 64 processes. The performance study, computational model, and application demonstrations presented in this paper are limited to 2D. However, the software architecture was designed to make an extension of this work to 3D straightforward, as the framework is templated over the dimension. The software library is open source under the PETSc license, which is even less restrictive than the BSD license; this guarantees the maximum impact on the scientific community and encourages peer-based collaboration for extensions and applications. Copyright © 2010 John Wiley & Sons, Ltd.
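For context, the quantity the FMM accelerates is the all-pairs vortex interaction the client application computes. A direct O(N^2) evaluation of the 2-D Biot-Savart sum is sketched below (numpy-based and illustrative; PetFMM's actual API is not shown):

```python
import numpy as np

def vortex_velocity_direct(x, y, gamma):
    """Velocity induced at each particle by N 2-D point vortices (Biot-Savart),
    via direct O(N^2) summation -- the reference computation an FMM accelerates."""
    dx = x[:, None] - x[None, :]            # x_i - x_j for all pairs
    dy = y[:, None] - y[None, :]
    r2 = dx**2 + dy**2
    np.fill_diagonal(r2, np.inf)            # exclude self-interaction
    coef = gamma[None, :] / (2.0 * np.pi * r2)
    u = np.sum(-coef * dy, axis=1)          # u_i = -sum_j G_j (y_i-y_j)/(2 pi r^2)
    v = np.sum(coef * dx, axis=1)           # v_i =  sum_j G_j (x_i-x_j)/(2 pi r^2)
    return u, v

rng = np.random.default_rng(0)
N = 1000
u, v = vortex_velocity_direct(rng.random(N), rng.random(N), rng.random(N))
```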

4.
This paper presents a computational homogenization scheme that is of particular interest for problems formulated in curvilinear coordinates. The main goal of this contribution is to generalize the computational homogenization scheme to a formulation of micro–macro transitions in curvilinear convective coordinates, where different physical spaces are considered at the homogenized macro-continuum and at the locally attached representative micro-structures. The deformation and the coordinate system of the micro-structure are assumed to be coupled with the local deformation and the local coordinate system at a corresponding point of the macro-continuum. For the consistent formulation of micro–macro transitions, the operations scale-up and scale-down are introduced, considering the rotated representation of tensor variables at the different physical reference frames of the micro- and macro-structure. The second goal of this paper is to use objective strain measures, such as the Green–Lagrange strain tensor, for the solution of boundary value problems on the micro- and macro-scale, by providing the required transformations of the work-conjugate stress, strain and tangent tensors into variables admissible for the considered micro–macro transitions and satisfying the averaging theorem. Copyright © 2007 John Wiley & Sons, Ltd.
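The two ingredients named here, an objective strain measure and the rotation of tensor variables between reference frames, are easy to state concretely. A minimal numpy sketch, with illustrative values not taken from the paper:

```python
import numpy as np

def green_lagrange(F):
    """Objective Green-Lagrange strain E = (F^T F - I)/2 of a deformation gradient."""
    return 0.5 * (F.T @ F - np.eye(F.shape[0]))

def rotate_tensor(T, Q):
    """Re-express a second-order tensor in a rotated frame: Q T Q^T."""
    return Q @ T @ Q.T

# Macro deformation gradient and a rotation between macro and micro frames
F = np.array([[1.10, 0.05, 0.00],
              [0.00, 0.98, 0.02],
              [0.00, 0.00, 1.01]])
t = 0.3
Q = np.array([[np.cos(t), -np.sin(t), 0.0],
              [np.sin(t),  np.cos(t), 0.0],
              [0.0,        0.0,       1.0]])

E_macro = green_lagrange(F)
E_micro = rotate_tensor(E_macro, Q)   # "scale-down": rotated strain handed to the RVE
```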

5.
In this paper we investigate the additional storage overhead needed for a parallel implementation of finite element applications. In particular, we compare the storage requirements for the factorization of the sparse matrices that would occur on a parallel processor vs. a uniprocessor. This variation in storage results from the factorization fill-in. We address the question of whether the storage overhead is so large for parallel implementations that it imposes severe limitations on the problem size in contrast to problems executed sequentially on a uniprocessor. The storage requirements for the parallel implementation are based upon a new ordering scheme, the combination mesh-based scheme. This scheme uses a domain decomposition method which attempts to balance the processors' loads and decrease interprocessor communication. The storage requirements for the sequential implementation are based upon the minimum degree algorithm. The difference between the two storage requirements corresponds to the storage overhead attributed to the parallel scheme. Experiments were conducted on regular and irregular, 2-D and 3-D problems. The meshes were decomposed into 2–256 subdomains, which can be executed on 2–256 processors, respectively. The total storage requirements or fill-in for most of the 2-D problems were less than a factor of two larger than for the sequential execution. In contrast, large 3-D problems showed no increase in storage or fill-in over the sequential execution; in fact, the fill-in was lower for the parallel execution than for the sequential one. Thus, we conclude that the storage overhead attributed to the use of parallel processors will not impose severe constraints on the problem size. Further, for large 3-D applications, the combination mesh-based algorithm does better than minimum degree at reducing fill-in.
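The fill-in comparison underlying this study can be reproduced in miniature with any sparse LU package that exposes ordering choices. A sketch using SciPy's SuperLU interface, where the MMD_AT_PLUS_A ordering stands in for a minimum-degree ordering (the paper's combination mesh-based scheme is not implemented here):

```python
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def laplacian_2d(n):
    """Sparse 5-point Laplacian on an n x n grid -- a stand-in FE-like matrix."""
    I = sp.identity(n, format="csr")
    T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csr")
    return (sp.kron(I, T) + sp.kron(T, I)).tocsc()

A = laplacian_2d(40)
for spec in ("NATURAL", "MMD_AT_PLUS_A", "COLAMD"):
    lu = splu(A, permc_spec=spec)
    fill = lu.L.nnz + lu.U.nnz - A.nnz     # extra nonzeros created by factorization
    print(f"{spec:14s} fill-in: {fill}")
```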

6.
Most of the recently proposed computational methods for solving partial differential equations on multiprocessor architectures stem from the 'divide and conquer' paradigm and involve some form of domain decomposition. For those methods which also require grids of points or patches of elements, it is often necessary to explicitly partition the underlying mesh, especially when working with local-memory parallel processors. In this paper, a family of cost-effective algorithms for the automatic partitioning of arbitrary two- and three-dimensional finite element and finite difference meshes is presented and discussed in view of a domain-decomposed solution procedure and parallel processing. The influence of the algorithmic aspects of a solution method (implicit/explicit computations) and the architectural specifics of a multiprocessor (SIMD/MIMD, startup/transmission time) on the design of a mesh partitioning algorithm is discussed. The impact of the partitioning strategy on load balancing, operation count, operator conditioning, rate of convergence and processor mapping is also addressed. Finally, the proposed mesh decomposition algorithms are demonstrated with realistic examples of finite element, finite volume, and finite difference meshes associated with the parallel solution of solid and fluid mechanics problems on the iPSC/2 and iPSC/860 multiprocessors.
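One of the simplest geometric partitioners in this family is recursive coordinate bisection. A compact sketch (assuming a power-of-two part count; names and data illustrative):

```python
import numpy as np

def recursive_bisection(coords, n_parts):
    """Partition points into n_parts (a power of two) by recursively splitting
    at the median along the currently longest coordinate axis."""
    part = np.zeros(len(coords), dtype=int)

    def split(idx, count):
        if count == 1:
            return
        extent = coords[idx].max(axis=0) - coords[idx].min(axis=0)
        order = idx[np.argsort(coords[idx, np.argmax(extent)])]
        half = len(order) // 2
        part[order[half:]] += count // 2   # upper half gets the higher part ids
        split(order[:half], count // 2)
        split(order[half:], count // 2)

    split(np.arange(len(coords)), n_parts)
    return part

pts = np.random.default_rng(1).random((1000, 2))
labels = recursive_bisection(pts, 8)       # 8 balanced, spatially compact parts
```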

7.
A two-level domain decomposition method is introduced for general shape optimization problems constrained by the incompressible Navier–Stokes equations. The optimization problem is first discretized with a finite element method on an unstructured moving mesh that is implicitly defined, without assuming that the computational domain is known, and then solved by one-shot Lagrange–Newton–Krylov–Schwarz algorithms. In this approach, the shape of the domain, its corresponding finite element mesh, the flow fields, and their corresponding Lagrange multipliers are all obtained computationally in a single solve of a nonlinear system of equations. Highly scalable parallel algorithms are absolutely necessary to solve such an expensive system. The one-level domain decomposition method works reasonably well when the number of processors is not large. Aiming at machines with large processor counts and robust nonlinear convergence, we introduce a two-level inexact Newton method with a hybrid two-level overlapping Schwarz preconditioner. As applications, we consider the shape optimization of a cannula problem and an artery bypass problem in 2D. Numerical experiments show that our algorithm performs well on a supercomputer with over 1000 processors for problems with millions of unknowns. Copyright © 2014 John Wiley & Sons, Ltd.
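The building block of such a preconditioner is the overlapping Schwarz method. The sketch below applies a one-level additive Schwarz preconditioner to a 1-D model problem inside SciPy's GMRES; the paper's hybrid two-level variant adds a coarse space on top of this, which is not modeled here:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, gmres

n = 200
A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csc")  # 1-D Poisson
b = np.ones(n)
Ad = A.toarray()                                   # dense copy for local solves

# Overlapping index blocks (4 subdomains, 5 nodes of overlap on each side)
n_sub, ovl = 4, 5
size = n // n_sub
blocks = [np.arange(max(0, i*size - ovl), min(n, (i+1)*size + ovl))
          for i in range(n_sub)]
local_inv = [np.linalg.inv(Ad[np.ix_(idx, idx)]) for idx in blocks]

def schwarz(r):
    """One-level additive Schwarz: sum of overlapping local subdomain solves."""
    z = np.zeros_like(r)
    for idx, Ainv in zip(blocks, local_inv):
        z[idx] += Ainv @ r[idx]
    return z

M = LinearOperator((n, n), matvec=schwarz)
x, info = gmres(A, b, M=M)
print(info, np.linalg.norm(A @ x - b))             # 0 and a small residual
```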

8.
Multi-scale problems are often solved by decomposing the problem domain into multiple subdomains, solving them independently using different levels of spatial and temporal refinement, and coupling the subdomain solutions back together to obtain the global solution. Most commonly, finite elements are used for spatial discretization, and finite difference time stepping is used for time integration. Given a finite element mesh for the global problem domain, the number of possible decompositions into subdomains and the possible choices for the associated time steps are exponentially large, and the computational costs of different decompositions can vary by orders of magnitude. The problem of finding an optimal decomposition and associated time discretization that minimizes computational cost while maintaining accuracy is nontrivial. Existing mesh partitioning tools, such as METIS, overlook the constraints posed by multi-scale methods and lead to suboptimal partitions with a high performance penalty. We present a multi-level mesh partitioning approach that exploits domain-specific knowledge of multi-scale methods to produce nearly optimal mesh partitions and associated time steps automatically. Results show that for multi-scale problems, our approach produces decompositions that outperform those produced by state-of-the-art partitioners like METIS and even those constructed manually by domain experts. Copyright © 2017 John Wiley & Sons, Ltd.
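The cost structure that makes partition choice so consequential can be seen in a back-of-the-envelope model: each subdomain pays roughly (number of elements) times (number of time steps it must take). A sketch with purely illustrative numbers:

```python
T = 1.0                                   # total simulated time
elems = {"fine_region": 2_000, "coarse_region": 48_000}
dt = {"fine_region": 1e-5, "coarse_region": 1e-3}  # stable steps per region

# One subdomain: everything must march at the smallest stable time step.
cost_uniform = sum(elems.values()) * (T / min(dt.values()))

# Two subdomains: each region integrates at its own admissible step.
cost_multiscale = sum(elems[r] * (T / dt[r]) for r in elems)

print(f"uniform step: {cost_uniform:.2e} element-steps")
print(f"multiscale:   {cost_multiscale:.2e} element-steps")   # ~20x cheaper here
```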

9.
This paper presents a two-scale approach for the mechanical and numerical modelling of materials with microstructure, such as concrete or fibre-reinforced concrete, in the non-linear regime. It addresses applications where the assumption of scale separation as the basis for classical homogenization methods does not hold. This occurs when the resolutions of the micro and macro scales do not differ ab initio, or when evolving fluctuations in the macro fields are of the order of the micro scale during the loading progress. Typical examples are localization phenomena. The objective of the present study is to develop an efficient solution method exploiting the physically existing multiscale character of the problem. The proposed method belongs to the superposition-based methods with local enrichment of the large-scale solution ū by a small-scale part u′. The main focus of the present formulation is to allow for locality of the small-scale solution within the large-scale elements to achieve an efficient solution strategy. At the same time, the small-scale information exchange over the large-scale element boundaries is facilitated while maintaining the accuracy of a refined complete solution. Thus, the emphasis lies on finding appropriate locality constraints for u′. To illustrate the method, the formulation is applied to a damage-mechanics-based material model for concrete-like materials. Copyright © 2006 John Wiley & Sons, Ltd.

10.
We present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
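The FETI structure, subdomain solves coupled through Lagrange multipliers on the interface, with the rigid-body modes of floating subdomains handled by a small auxiliary problem, can be shown on a two-subdomain 1-D bar. The dense sketch below solves the dual saddle-point system directly; a production FETI solver would instead apply a projected conjugate gradient iteration to it, and all values here are illustrative:

```python
import numpy as np

def bar_stiffness(n_el, k=1.0):
    """Assembled stiffness of a 1-D bar of n_el unit spring-like elements."""
    K = np.zeros((n_el + 1, n_el + 1))
    for e in range(n_el):
        K[e:e + 2, e:e + 2] += k * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return K

n1, n2 = 4, 4
K1 = bar_stiffness(n1)[1:, 1:]        # left node fixed -> nonsingular
K2 = bar_stiffness(n2)                # floating subdomain (singular)
f1 = np.zeros(n1)
f2 = np.zeros(n2 + 1); f2[-1] = 1.0   # unit load at the free end

# Signed jump operators: B1 u1 + B2 u2 = u1[interface] - u2[0] = 0
B1 = np.zeros((1, n1));     B1[0, -1] = 1.0
B2 = np.zeros((1, n2 + 1)); B2[0, 0] = -1.0

K1inv = np.linalg.inv(K1)
K2pinv = np.linalg.pinv(K2)           # generalized inverse for the floating part
R = np.ones((n2 + 1, 1))              # rigid-body (translation) mode of K2

# Dual interface problem:  [F  -G; G^T  0] [lambda; alpha] = [d; e]
F = B1 @ K1inv @ B1.T + B2 @ K2pinv @ B2.T
G = B2 @ R
d = B1 @ (K1inv @ f1) + B2 @ (K2pinv @ f2)
e = R.T @ f2                          # self-equilibration of the floating part
S = np.block([[F, -G], [G.T, np.zeros((1, 1))]])
lam, alpha = np.linalg.solve(S, np.concatenate([d, e.ravel()]))

u1 = K1inv @ (f1 - B1.T.ravel() * lam)
u2 = K2pinv @ (f2 - B2.T.ravel() * lam) + alpha * R.ravel()
assert np.isclose(u1[-1], u2[0])      # interface continuity recovered
```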

11.
12.
This paper presents the implementation of advanced domain decomposition techniques for the parallel solution of large-scale shape sensitivity analysis problems. The methods presented in this study are based on the FETI method proposed by Farhat and Roux, which is a dual domain decomposition implementation. Two variants of the basic FETI method have been implemented: (i) FETI-1, where the rigid-body modes of the floating subdomains are computed explicitly, and (ii) FETI-2, where the local problem at each subdomain is solved by the PCG method and the rigid-body modes are computed explicitly. A two-level iterative method is proposed, particularly tailored to re-analysis type problems, in which the dual domain decomposition method is incorporated in the preconditioning step of a subdomain global PCG implementation. The superiority of this two-level iterative solver is demonstrated with a number of numerical tests in serial as well as parallel computing environments. Copyright © 1999 John Wiley & Sons, Ltd.

13.
A parallel implementation of the contact algorithm discussed in Part I of this paper has been developed for a non-linear dynamic explicit finite element program to analyse the response of three-dimensional shell structures. The parallel contact algorithm takes advantage of the fact that, in general, only some parts of the structure will actually be in contact at any given time. Special interprocessor communication routines, and a method which enables individual processors to dynamically build local contact domains during execution, have been developed. The performance of the parallel contact algorithm has been studied by executing the program on 128 processors of a distributed-memory multiprocessor computer.
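The premise that only some parts of the structure touch at any instant is usually exploited through a broad-phase filter. A sketch of bounding-box candidate detection of the kind a processor could run to build its local contact domain (data and names illustrative, not from the paper):

```python
import numpy as np

def contact_candidates(boxes_a, boxes_b):
    """Broad-phase filter: pairs of axis-aligned bounding boxes that overlap and
    therefore need a detailed (narrow-phase) contact check.
    boxes_*: arrays of shape (n, 2, 3) holding (min corner, max corner)."""
    pairs = []
    for i, (lo_a, hi_a) in enumerate(boxes_a):
        for j, (lo_b, hi_b) in enumerate(boxes_b):
            if np.all(lo_a <= hi_b) and np.all(lo_b <= hi_a):
                pairs.append((i, j))
    return pairs

# A processor would run this each step on the surface patches it currently owns,
# keeping a local contact domain only where boxes actually overlap.
rng = np.random.default_rng(2)
lo = rng.random((20, 3))
boxes = np.stack([lo, lo + 0.2], axis=1)
print(len(contact_candidates(boxes, boxes)))
```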

14.
15.
Simulations of crack growth based on the cohesive surface methodology typically involve ill-conditioned systems of equations and require substantial processing time. This paper shows how these systems of equations can be solved efficiently by adopting the domain decomposition approach, in which the finite element mesh is partitioned into multiple blocks. The system of equations is then reduced to a much smaller system that is solved with an iterative algorithm in combination with a powerful two-level preconditioner. Although the solution algorithm is more efficient than a direct solution algorithm on a single-processor computer, it becomes particularly attractive when used on a parallel computer. This is demonstrated for a large-scale simulation of crack growth in a polymer using a Cray T3E with 64 processors. Copyright © 2001 John Wiley & Sons, Ltd.

16.
An explicit-dynamics spatially discontinuous Galerkin (DG) formulation for non-linear solid dynamics is proposed and implemented for parallel computation. DG methods have particular appeal in problems involving complex material response, e.g. non-local behavior and failure, as, even in the presence of discontinuities, they provide a rigorous means of ensuring both consistency and stability. In the proposed method, these are guaranteed: the former by the use of average numerical fluxes and the latter by the introduction of appropriate quadratic terms in the weak formulation. The semi-discrete system of ordinary differential equations is integrated in time using a conventional second-order central-difference explicit scheme. A stability criterion for the time integration algorithm, accounting for the influence of the DG discretization on stability, is derived for the equivalent linearized system. This approach naturally lends itself to efficient parallel implementation. The resulting DG computational framework is implemented in three dimensions via specialized interface elements. The versatility, robustness and scalability of the overall computational approach are all demonstrated in problems involving stress-wave propagation and large plastic deformations. Copyright © 2007 John Wiley & Sons, Ltd.
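For reference, a conventional second-order central-difference explicit scheme of the kind mentioned here looks as follows for a 1-D linear bar; the step size obeys the usual element CFL limit, without the DG-specific stability correction the paper derives (all values illustrative):

```python
import numpy as np

n, h, E, rho = 100, 0.01, 1.0, 1.0
c = (E / rho) ** 0.5                        # wave speed
dt = 0.9 * h / c                            # safety factor on the element CFL limit

# Fixed-fixed 1-D bar: stiffness K, lumped (diagonal) mass M
K = (E / h) * (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
M = rho * h * np.ones(n)

u = np.exp(-((np.arange(n) - n / 2) ** 2) / 20.0)   # initial displacement pulse
v = np.zeros(n)
a = -(K @ u) / M
for _ in range(500):                        # central difference / velocity-Verlet
    v += 0.5 * dt * a
    u += dt * v
    a = -(K @ u) / M
    v += 0.5 * dt * a
```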

17.
This paper investigates a computational strategy for studying the interactions between multiple through-the-width delaminations and global or local buckling in composite laminates, taking into account possible contact between the delaminated surfaces. To achieve an accurate prediction of the quasi-static response, a very refined discretization of the structure is required, leading to the resolution of very large and highly nonlinear numerical problems. In this paper, a nonlinear finite element formulation along with a parallel iterative scheme based on a multiscale domain decomposition is used for the computation of three-dimensional mesoscale models. Previous works by the authors already dealt with the simulation of multiscale delamination assuming small perturbations. This paper presents the formulation used to include geometric nonlinearities in this existing multiscale framework and discusses the adaptations that need to be made to the iterative process to ensure rapid convergence and scalability of the method in the presence of buckling and delamination. These various adaptations are illustrated by simulations involving large numbers of DOFs. Copyright © 2012 John Wiley & Sons, Ltd.

18.
A multi-scale cohesive numerical framework is proposed to simulate the failure of heterogeneous adhesively bonded systems. This multi-scale scheme is based on Hill's variational principle of energy equivalence between the higher and lower level scales. It provides an easy way to obtain accurate homogenized macroscopic properties while capturing the physics of failure processes at the micro-scale in sufficient detail. We use an isotropic rate-dependent damage model to mimic the failure response of the constituents of heterogeneous adhesives. The finite element method is used to solve the equilibrium equation at each scale. A nested iterative scheme inspired by the return mapping algorithm used in computational inelasticity is implemented. We propose a computationally attractive technique to couple the macro- and micro-scales for rate-dependent constitutive laws. We introduce an adhesive patch test to study the numerical performance, including spatial and temporal convergence, of the multi-scale scheme. We compare the solution of the multi-scale cohesive scheme with a direct numerical simulation. Finally, we solve mode I and mode II fracture problems to demonstrate failure at the macro-scale. Copyright © 2010 John Wiley & Sons, Ltd.
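As a concrete stand-in for the micro-scale constitutive update, the sketch below integrates a simple rate-dependent isotropic damage law with backward Euler; the parameters and the form of the law are illustrative, not the paper's exact model:

```python
import numpy as np

def damage_update(eps, D_old, dt, E=1.0, eps0=0.01, eps_f=0.1, tau=0.05):
    """One backward-Euler step of a rate-dependent isotropic damage law:
    dD/dt = (g(eps) - D)/tau, with irreversibility enforced."""
    g = np.clip((eps - eps0) / (eps_f - eps0), 0.0, 1.0)  # quasi-static target
    D_new = (D_old + (dt / tau) * g) / (1.0 + dt / tau)
    D_new = max(D_new, D_old)                             # no healing
    sigma = (1.0 - D_new) * E * eps                       # degraded stress
    return D_new, sigma

# Ramped strain history; a shorter dt (faster loading) leaves less time for
# damage to develop -- the hallmark of rate dependence.
D = 0.0
for eps in np.linspace(0.0, 0.08, 200):
    D, sig = damage_update(eps, D, dt=1e-3)
print(D, sig)
```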

19.
This paper presents a generalized finite element method (GFEM) based on the solution of interdependent global (structural) and local (crack)-scale problems. The local problems focus on resolving fine-scale features of the solution in the vicinity of three-dimensional cracks, while the global problem addresses the macro-scale structural behavior. The local solutions are embedded into the solution space for the global problem using the partition of unity method. The local problems are accurately solved using an hp-GFEM, and thus the proposed method does not rely on analytical solutions. The proposed methodology enables accurate modeling of three-dimensional cracks on meshes with elements that are orders of magnitude larger than the process zone along the crack fronts. The boundary conditions for the local problems are provided by the coarse global mesh solution and can be of Dirichlet, Neumann or Cauchy type. The effect of the type of local boundary condition on the performance of the proposed GFEM is analyzed. Several three-dimensional fracture mechanics problems aimed at investigating the accuracy of the method and its computational performance, both in terms of problem size and CPU time, are presented. Copyright © 2009 John Wiley & Sons, Ltd.
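The global-local exchange described here, with a coarse global solve supplying boundary data to a refined local problem, can be illustrated on a 1-D Poisson model (Dirichlet transfer only; the crack-scale enrichment and partition-of-unity embedding are not modeled, and all names are illustrative):

```python
import numpy as np

def poisson_dirichlet(n, h, f, u_left, u_right):
    """Solve -u'' = f on n interior points with spacing h and Dirichlet ends."""
    A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    b = f * np.ones(n)
    b[0] += u_left / h**2
    b[-1] += u_right / h**2
    return np.linalg.solve(A, b)

# Global problem: coarse grid on [0, 1] with homogeneous Dirichlet ends
H = 1.0 / 20
u_global = poisson_dirichlet(19, H, f=1.0, u_left=0.0, u_right=0.0)

# Local problem: fine grid on the patch [0.4, 0.6]; its boundary values are
# read off the coarse global solution (interior indices 7 and 11 correspond
# to x = 0.4 and x = 0.6 on the coarse grid)
hl = 0.2 / 40
u_local = poisson_dirichlet(39, hl, f=1.0,
                            u_left=u_global[7], u_right=u_global[11])
```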

20.
Adaptive mesh refinement and coarsening schemes are proposed for the efficient computational simulation of dynamic cohesive fracture. The adaptive mesh refinement consists of a sequence of edge-split operators, whereas the adaptive mesh coarsening is based on a sequence of vertex-removal (or edge-collapse) operators. Nodal perturbation and edge-swap operators are also employed around the crack tip region to improve the crack geometry representation, and cohesive surface elements are adaptively inserted whenever and wherever they are needed, by means of an extrinsic cohesive zone model approach. These adaptive mesh modification events are maintained in conjunction with a topological data structure (TopS). The so-called PPR potential-based cohesive model (J. Mech. Phys. Solids 2009; 57:891–908) is utilized for the constitutive relationship of the cohesive zone model. The examples investigated include mode I fracture, mixed-mode fracture and crack branching problems. The computational results using mesh adaptivity (refinement and coarsening) are consistent with the results using uniform mesh refinement. The present approach significantly reduces computational cost while exhibiting a multiscale effect that captures both the global macro-crack and local micro-cracks. Copyright © 2012 John Wiley & Sons, Ltd.
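An edge-split operator of the kind used for the refinement step can be written compactly for a triangle mesh; the sketch below (mesh data illustrative, not from the paper) replaces each triangle containing the split edge by two children meeting at the edge midpoint:

```python
import numpy as np

def edge_split(vertices, triangles, edge):
    """Split `edge` = (a, b) at its midpoint: every triangle containing the
    edge is replaced by two children, preserving orientation."""
    a, b = edge
    vertices = np.vstack([vertices, 0.5 * (vertices[a] + vertices[b])])
    mid = len(vertices) - 1
    out = []
    for tri in triangles:
        if a in tri and b in tri:
            c = next(v for v in tri if v not in (a, b))   # opposite vertex
            if tri[(tri.index(a) + 1) % 3] == b:          # edge runs a -> b
                out += [[a, mid, c], [mid, b, c]]
            else:                                         # edge runs b -> a
                out += [[b, mid, c], [mid, a, c]]
        else:
            out.append(list(tri))
    return vertices, out

verts = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0], [0.5, -1.0]])
tris = [[0, 1, 2], [0, 3, 1]]                  # two triangles sharing edge (0, 1)
verts, tris = edge_split(verts, tris, (0, 1))  # -> four triangles, one new node
```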
