Found 20 similar articles (search time: 0 ms)
1.
Level‐set topology optimization with many linear buckling constraints using an efficient and robust eigensolver
Peter D. Dunning Evgueni Ovtchinnikov Jennifer Scott H. Alicia Kim 《International journal for numerical methods in engineering》2016,107(12):1029-1053
Linear buckling constraints are important in structural topology optimization for obtaining designs that can support the required loads without failure. During the optimization process, the critical buckling eigenmode can change; this poses a challenge to gradient‐based optimization and can require the computation of a large number of linear buckling eigenmodes. This is potentially both computationally difficult to achieve and prohibitively expensive. In this paper, we motivate the need for a large number of linear buckling modes and show how several features of the block Jacobi conjugate gradient (BJCG) eigenvalue method, including optimal shift estimates, the reuse of eigenvectors, adaptive eigenvector tolerances and multiple shifts, can be used to efficiently and robustly compute a large number of buckling eigenmodes. This paper also introduces linear buckling constraints for level‐set topology optimization. In our approach, the velocity function is defined as a weighted sum of the shape sensitivities for the objective and constraint functions. The weights are found by solving an optimization sub‐problem to reduce the mass while maintaining feasibility of the buckling constraints. The effectiveness of this approach in combination with the BJCG method is demonstrated using a 3D optimization problem. Copyright © 2016 John Wiley & Sons, Ltd.
2.
SUNG YI M. FOUAD AHMAD HARRY H. HILTON 《International journal for numerical methods in engineering》1997,40(10):1857-1875
Recently, much attention has been paid to high-performance computing and the development of parallel computational strategies and numerical algorithms for large-scale problems. In the present study, a finite element procedure for the dynamic analysis of anisotropic viscoelastic composite shell structures using degenerated 3-D elements has been studied on vector, coarse-grained and massively parallel machines. CRAY hardware performance monitors such as the Flowtrace and Perftrace tools are used to obtain performance data for subroutine program modules and specified code segments. The performance of the conjugate gradient method, the Cray sparse matrix solver and the Feable solver is evaluated. SIMD and MIMD parallel implementations of the finite element algorithm for dynamic simulation of viscoelastic composite structures on the CM-5 are also presented. The performance studies have been conducted in order to evaluate the efficiency of the numerical algorithm on this architecture versus vector processing CRAY systems. Parametric studies on the CM-5 as well as the CRAY system and benchmarks for various problem sizes are shown. The second study evaluates how effectively the finite element procedures for viscoelastic composite structures can be solved in the Single Instruction Multiple Data (SIMD) parallel environment. CM-FORTRAN is used. A conjugate gradient method is employed for the solution of the systems. In the third study, we propose to implement the finite element algorithm in a scalable distributed parallel environment using a generic message passing library such as PVM. The code is portable to a range of current and future parallel machines. We also introduce a domain decomposition scheme to reduce the communication time. The parallel scalability of the dynamic viscoelastic finite element algorithm in data parallel and scalable distributed parallel environments is also discussed. © 1997 by John Wiley & Sons, Ltd.
3.
Two-dimensional simulations of the evolution of dendrite microstructure during isothermal and non-isothermal solidification of a Ni-0.41Cu binary alloy are carried out using the phase-field method. The governing evolution equations for the phase-field variable, the solute mole fraction and the temperature are formulated and numerically solved using an explicit finite difference scheme. To make the computations tractable, parallel computing is employed. The results obtained show that under lower cooling rates, the solidification process is controlled by partitioning of the solute between the solid and the liquid at the solid/liquid interface. At high cooling rates, on the other hand, solute trapping takes place and solidification is controlled by the heat extraction rate. An increase in the cooling rate is also found to have a pronounced effect on the dendrite microstructure, causing it to change from poorly developed dendrites consisting of only primary stalks, via fully developed dendrites containing secondary and tertiary arms, to diamond-shaped grains with cellular surfaces. These findings are in excellent agreement with experimental observations.
4.
M. Pari W. Swart M.B. van Gijzen M.A.N. Hendriks J.G. Rots 《International journal for numerical methods in engineering》2020,121(10):2128-2146
Sequentially linear analysis (SLA), an event-by-event procedure for finite element (FE) simulation of quasi-brittle materials, is based on sequentially identifying a critical integration point in the FE model, to reduce its strength and stiffness, and the corresponding critical load multiplier (λcrit), to scale the linear analysis results. In this article, two strategies are proposed to efficiently reuse previous stiffness matrix factorisations and their corresponding solutions in subsequent linear analyses, since the global system of linear equations representing the FE model changes only locally. The first is based on a direct solution method in combination with the Woodbury matrix identity, to compute the inverse of a low-rank corrected stiffness matrix relatively cheaply. The second is a variation of the traditional incomplete LU preconditioned conjugate gradient method, wherein the preconditioner is the complete factorisation of a previous analysis step's stiffness matrix. For both approaches, optimal points at which the factorisation is recomputed are determined such that the total analysis time is minimised. Comparison and validation against a traditional parallel direct sparse solver, on two-dimensional (2D) and three-dimensional (3D) benchmark studies, illustrate the improved performance of the Woodbury-based direct solver over its counterparts, especially for large 3D problems.
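The Woodbury-based reuse described in this abstract can be illustrated for the simplest case, a rank-one change (the Sherman-Morrison formula, which is the rank-one special case of the Woodbury identity). This is only a minimal sketch, not the authors' implementation: the function name `rank_one_update_solve`, the callback `solve_a` standing in for a stored factorisation, and the 3×3 diagonal test matrix are all assumptions made for illustration.

```python
from typing import Callable, List

Vector = List[float]


def dot(x: Vector, y: Vector) -> float:
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


def rank_one_update_solve(
    solve_a: Callable[[Vector], Vector], u: Vector, v: Vector, b: Vector
) -> Vector:
    """Solve (A + u v^T) x = b using only solves with the original A.

    Sherman-Morrison: x = A^{-1} b - A^{-1} u (v^T A^{-1} b) / (1 + v^T A^{-1} u).
    `solve_a` is a hypothetical stand-in for back-substitution with an
    existing factorisation of A.
    """
    y = solve_a(b)  # A^{-1} b
    z = solve_a(u)  # A^{-1} u
    denom = 1.0 + dot(v, z)
    if abs(denom) < 1e-14:
        raise ZeroDivisionError("updated matrix is numerically singular")
    scale = dot(v, y) / denom
    return [yi - scale * zi for yi, zi in zip(y, z)]


# Illustrative data: A = diag(2, 4, 5); the "damage event" lowers A[0][0] to 1.
diag_a = [2.0, 4.0, 5.0]


def solve_diag(rhs: Vector) -> Vector:
    """Solve with the unmodified diagonal matrix A."""
    return [r / d for r, d in zip(rhs, diag_a)]


x = rank_one_update_solve(
    solve_diag, u=[-1.0, 0.0, 0.0], v=[1.0, 0.0, 0.0], b=[1.0, 4.0, 5.0]
)
print(x)
```

Each local stiffness change then costs two solves with the old factorisation rather than a new factorisation, which is the trade-off the abstract balances when choosing when to refactorise.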
5.
The hydrodynamic interactions in a suspension of small particles in a viscous fluid are computed by a boundary integral velocity representation featuring a completed double-layer potential (the completed double-layer boundary integral equation method, or CDL-BIEM). A multipole expansion is used to represent interactions between distant particles, leading to a considerable improvement in computational speed. The resulting large linear system of equations provides an ideal setting for asynchronous iterative solvers (block Gauss-Seidel) on a message-passing MIMD parallel computer (Intel iPSC/860 Hypercube) using a supervisor-workers load-balancing strategy. Our benchmark results as a function of problem size and number of processors suggest that our algorithm will scale successfully to the massively parallel computers of the future.
6.
Jun Sun Pan Michaleris Anshul Gupta Padma Raghavan 《International journal for numerical methods in engineering》2005,63(6):833-858
As parallel and distributed computing gradually becomes the computing standard for large scale problems, the domain decomposition method (DD) has received growing attention since it provides a natural basis for splitting a large problem into many small problems, which can be submitted to individual computing nodes and processed in a parallel fashion. This approach not only provides a method to solve large scale problems that are not solvable on a single computer by using direct sparse solvers but also gives a flexible solution to deal with large scale problems with localized non‐linearities. When some parts of the structure are modified, only the corresponding subdomains and the interface equation that connects all the subdomains need to be recomputed. In this paper, the dual–primal finite element tearing and interconnecting method (FETI‐DP) is carefully investigated, and a reduced back‐substitution (RBS) algorithm is proposed to accelerate the time‐consuming preconditioned conjugate gradient (PCG) iterations involved in the interface problems. Linear–non‐linear analysis (LNA) is also adopted for large scale problems with localized non‐linearities based on subdomain linear–non‐linear identification criteria. This combined approach is named the FETI‐DP‐RBS‐LNA algorithm and is demonstrated on the mechanical analysis of a welding problem. Serial CPU costs of this algorithm are measured at each solution stage and compared with those from the IBM Watson direct sparse solver and the FETI‐DP method. The results demonstrate the effectiveness of the proposed computational approach for simulating welding problems, which are representative of a large class of three‐dimensional large scale problems with localized non‐linearities. Copyright © 2005 John Wiley & Sons, Ltd.
7.
Felipe A. Cruz Matthew G. Knepley L. A. Barba 《International journal for numerical methods in engineering》2011,85(4):403-428
Fast algorithms for the computation of N‐body problems can be broadly classified into mesh‐based interpolation methods, and hierarchical or multiresolution methods. To this latter class belongs the well‐known fast multipole method (FMM), which offers O(N) complexity. The FMM is a complex algorithm, and the programming difficulty associated with it has arguably diminished its impact, being a barrier for adoption. This paper presents an extensible parallel library for N‐body interactions utilizing the FMM algorithm. A prominent feature of this library is that it is designed to be extensible, with a view to unifying efforts involving many algorithms based on the same principles as the FMM and enabling easy development of scientific application codes. The paper also details an exhaustive model for the computation of tree‐based N‐body algorithms in parallel, including both work estimates and communications estimates. With this model, we are able to implement a method to provide automatic, a priori load balancing of the parallel execution, achieving optimal distribution of the computational work among processors and minimal inter‐processor communications. Using a client application that performs the calculation of velocity induced by N vortex particles in two dimensions, ample verification and testing of the library was performed. Strong scaling results are presented with 10 million particles on up to 256 processors, including both speedup and parallel efficiency. The largest problem size that has been run with the PetFMM library at this point was 64 million particles on 64 processors. The library is currently able to achieve over 85% parallel efficiency for 64 processes. The performance study, computational model, and application demonstrations presented in this paper are limited to 2D. However, the software architecture was designed to make an extension of this work to 3D straightforward, as the framework is templated over the dimension. The software library is open source under the PETSc license, even less restrictive than the BSD license; this guarantees the maximum impact to the scientific community and encourages peer‐based collaboration for the extensions and applications. Copyright © 2010 John Wiley & Sons, Ltd.
8.
Substructured formulations of nonlinear structure problems – influence of the interface condition
Camille Negrello Pierre Gosselet Christian Rey Julien Pebrel 《International journal for numerical methods in engineering》2016,107(13):1083-1105
We investigate the use of non‐overlapping domain decomposition (DD) methods for nonlinear structure problems. The classic techniques would combine a global Newton solver with a linear DD solver for the tangent systems. We propose a framework where we can swap Newton and DD so that we solve independent nonlinear problems for each substructure and linear condensed interface problems. The objective is to decrease the number of communications between subdomains and to improve parallelism. Depending on the interface condition, we derive several formulations that are not equivalent, contrary to the linear case. Primal, dual and mixed variants are described and assessed on a simple plasticity problem. Copyright © 2015 John Wiley & Sons, Ltd.
9.
Julien Cortial Charbel Farhat 《International journal for numerical methods in engineering》2009,77(4):451-470
The parallel implicit time‐integration algorithm (PITA) is among a very limited number of time‐integrators that have been successfully applied to the time‐parallel solution of linear second‐order hyperbolic problems such as those encountered in structural dynamics. Time‐parallelism can be of paramount importance to fast computations, for example, when space‐parallelism is unfeasible as in problems with a relatively small number of degrees of freedom in general, and reduced‐order model applications in particular, or when reaching the fastest possible CPU time is desired and requires the exploitation of both space‐ and time‐parallelisms. This paper extends the previously developed PITA to the non‐linear case. It also demonstrates its application to the reduction of the time‐to‐solution on a Linux cluster of sample non‐linear structural dynamics problems. Copyright © 2008 John Wiley & Sons, Ltd.
10.
X. Antoine A. Bendali M. Darbas 《International journal for numerical methods in engineering》2004,61(8):1310-1331
Since the advent of the fast multipole method, large‐scale electromagnetic scattering problems based on the electric field integral equation (EFIE) formulation are generally solved by a Krylov iterative solver. It is well known that the dense complex non‐hermitian linear system associated with the EFIE becomes ill‐conditioned, especially in the high‐frequency regime. As a consequence, the convergence rate of Krylov subspace iterative solvers slows down. In this work, a new analytic preconditioner based on the combination of a finite element method with a local absorbing boundary condition is proposed to improve the convergence of the iterative solver for an open boundary. Some numerical tests characterize the behaviour of the new preconditioner. Moreover, comparisons are performed with the analytic preconditioner based on the Calderón relations for integral equations for several kinds of scatterers. Copyright © 2004 John Wiley & Sons, Ltd.
11.
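Parallel algorithm for the dynamic response of structures under arbitrary loading. Cited by: 8 (self-citations: 0; citations by others: 8)
This paper introduces precise time-step integration, and a parallel algorithm based on it, for computing the dynamic response of linear structures under arbitrary loading. Through a change of variables, the dynamic response equations of the discretized structure can be converted into a system of first-order linear ordinary differential equations, whose solution is the sum of the homogeneous solution, which represents the influence of the initial values, and an integral that reflects the action of the load. The first term is computed with the matrix exponential function; the second is computed in this paper by numerical integration. The algorithm is well suited to parallel computation and has been implemented on a TRANSPUTER parallel machine.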
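The reformulation summarized in the abstract above can be written out explicitly. This is a sketch under assumptions: the abstract does not state its change of variables, so the state vector $v = [u;\ \dot{u}]$ used below is one common choice, and the symbols $M$, $C$, $K$, $f$, $H$ and $r$ are introduced here for illustration only.

```latex
% Assumed notation (not given in the abstract): M, C, K are the mass, damping
% and stiffness matrices, u(t) the displacement vector, f(t) the applied load.
M\ddot{u} + C\dot{u} + K u = f(t)

% With the state vector v = [u; \dot{u}] this becomes a first-order system:
\dot{v} = H v + r(t), \qquad
H = \begin{bmatrix} 0 & I \\ -M^{-1}K & -M^{-1}C \end{bmatrix}, \qquad
r(t) = \begin{bmatrix} 0 \\ M^{-1} f(t) \end{bmatrix}

% Homogeneous part (effect of the initial values v_0) plus the load integral:
v(t) = e^{Ht} v_0 + \int_0^{t} e^{H(t-s)}\, r(s)\, \mathrm{d}s
```

In this form the first term needs only the matrix exponential, while the integral can be split into quadrature points that are evaluated independently, which is consistent with the abstract's remark that the method suits parallel computation.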
12.
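For shared-memory parallel computing environments and PC-network parallel computing environments, this paper presents a parallel refined Davidson method for computing a few extreme eigenpairs of large sparse symmetric matrices, and analyzes the inherent parallelism of the method. Each processor operates on a row block of the matrix and a row block of the matrix formed by the orthonormal basis of the projection subspace; combined with a restarting strategy, the method computes approximations to several eigenpairs of the matrix. The method is used to compute the natural frequencies of a particular wing model, and numerical experiments were carried out in a PC-network parallel computing environment and on the IBM-P650 shared-memory parallel computing environment.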
13.
14.
R. A. BIAŁECKI M. MERKEL H. MEWS G. KUHN 《International journal for numerical methods in engineering》1996,39(24):4215-4242
An algorithm for solving large BEM equation sets arising when using the substructuring option is described. The solver is based on a frontal solution philosophy. This feature, along with a special ordering scheme for the unknowns, minimizes the effect of the fill-in. The condensation of unknowns is carried out in two loops running over all substructures. Within the first loop all unknowns associated with nodes at the external surface of the body are eliminated. The remaining interfacial unknowns are condensed within the second loop. The software has a built-in mechanism for recognizing whether a given step of the condensation of unknowns can be carried out in-core or out-of-core. This limits the usage of slow direct access files to the necessary minimum. The implemented multiple right-hand side option makes the solver suitable for non-linear applications. Examples of such applications in non-linear heat transfer problems and plasticity are discussed. A parallelized version of the solver uses the parallel virtual machine (PVM) software and is portable to many computer systems. The parallelization is based on a concept following the block solver philosophy. The parallel solver can be run on computer clusters consisting of processors of different power. A special load equivalencing algorithm has been developed to assign appropriate tasks to computers of different power. The solver has been run in an industrial environment to solve problems of about 100 thousand unknowns. Numerical examples are included.
15.
K. G. Manoj S. K. Bhattacharyya 《International journal for numerical methods in engineering》1997,40(17):3279-3295
A block equation solver for the solution of large, sparse, banded unsymmetric systems of linear equations is presented in this paper. The method employs the Crout variation of the Gauss elimination technique for the solution. The solver ensures the efficient use of the available memory by doing block factorization and storage. It uses a skyline storage scheme, which has found widespread use in banded symmetric solvers and which avoids unnecessary operations on zero elements above the skyline. A FORTRAN code with ample comments is provided. © 1997 John Wiley & Sons, Ltd.
16.
Akira Imakura 《East Asian journal on applied mathematics》2014,4(3):267-282
Subspace projection methods based on the Krylov subspace using powers of a matrix $A$ have often been standard for solving large matrix computations in many areas of application. Recently, projection methods based on the extended Krylov subspace using powers of $A$ and $A^{-1}$ have attracted attention, particularly for functions of a matrix times a vector and matrix equations. In this article, we propose an efficient algorithm for constructing an orthonormal basis for the extended Krylov subspace. Numerical experiments indicate that this algorithm has less computational cost and approximately the same accuracy as the traditional algorithm.
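For reference, the two subspaces contrasted in this abstract can be written as follows. The starting vector $b$, the step count $m$ and the indexing are assumptions that follow one common convention; the article itself may index the extended subspace differently.

```latex
% Standard Krylov subspace: powers of A only (b and m are assumed notation).
\mathcal{K}_m(A, b) = \operatorname{span}\{\, b,\ Ab,\ \dots,\ A^{m-1} b \,\}

% Extended Krylov subspace: powers of both A and A^{-1}.
\mathcal{EK}_m(A, b) = \mathcal{K}_m(A, b) + \mathcal{K}_m(A^{-1}, A^{-1} b)
  = \operatorname{span}\{\, A^{-m} b,\ \dots,\ A^{-1} b,\ b,\ Ab,\ \dots,\ A^{m-1} b \,\}
```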
17.
Pawel Kudela 《International journal for numerical methods in engineering》2016,106(6):413-429
The proposed spectral element method implementation is based on sparse matrix storage of local shape function derivatives calculated at Gauss–Lobatto–Legendre points. The algorithm utilizes two basic operations: multiplication of a sparse matrix by a vector and element‐by‐element multiplication of vectors. Compute‐intensive operations are performed for a part of the equation of motion derived at the degree of freedom level of 3D isoparametric spectral elements. The assembly is performed at the force vector in such a way that atomic operations are minimized. This is achieved by a new mesh coloring technique. The proposed parallel implementation of the spectral element method on GPU is applied for the first time to Lamb wave simulations. It has been found that computation on a multicore GPU is up to 14 times faster than on a single CPU. Copyright © 2015 John Wiley & Sons, Ltd.
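The idea behind mesh coloring for assembly can be illustrated with a generic greedy scheme: elements that share a node receive different colors, so all elements of one color can add into the global force vector concurrently without atomic operations. This is a plain greedy coloring chosen for illustration, not the new technique proposed in the paper; the function name `greedy_element_coloring` and the small 1D mesh are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def greedy_element_coloring(elements: Sequence[Sequence[int]]) -> List[int]:
    """Give each element the smallest color unused by elements sharing a node.

    `elements[i]` lists the global node indices of element i. Elements with
    the same color share no node, so their contributions to a global vector
    can be scattered in parallel without write conflicts.
    """
    colors_at_node: Dict[int, Set[int]] = {}
    colors: List[int] = []
    for nodes in elements:
        # Collect colors already taken by neighbours through any shared node.
        used: Set[int] = set()
        for node in nodes:
            used |= colors_at_node.get(node, set())
        color = 0
        while color in used:
            color += 1
        colors.append(color)
        for node in nodes:
            colors_at_node.setdefault(node, set()).add(color)
    return colors


# Illustrative 1D chain of four two-node elements.
chain = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(greedy_element_coloring(chain))  # prints [0, 1, 0, 1]
```

Assembly would then loop over colors sequentially and over the elements within each color in parallel.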
18.
Daniel J. Rixen Charbel Farhat Radek Tezaur Jan Mandel 《International journal for numerical methods in engineering》1999,46(4):501-533
In this paper, we prove that the Algebraic A‐FETI method corresponds to one particular instance of the original one‐level FETI method. We also report on performance comparisons on an Origin 2000 between the one‐ and two‐level FETI methods and an optimized sparse solver, for two industrial applications: the stress analysis of a thin shell structure, and that of a three‐dimensional structure modelled by solid elements. These comparisons suggest that for topologically two‐dimensional problems, sparse solvers are effective when the number of processors is relatively small. They also suggest that for three‐dimensional applications, scalable domain decomposition methods such as FETI deliver a superior performance on both sequential and parallel hardware configurations. Copyright © 1999 John Wiley & Sons, Ltd.
19.
In Ho Cho Keith A. Porter 《International journal for numerical methods in engineering》2014,100(12):914-932
Multiscale analysis technique became a successful remedy to complicated problems in which nonlinear behavior is linked with microscopic damage mechanisms. For efficient parallel multiscale analyses, hierarchical grouping algorithms (e.g., the two‐level ‘coarse‐grained’ method) have been suggested and proved superior over a simple parallelization. Here, we expanded the two‐level algorithm to give rise to a multilayered grouping parallel algorithm suitable for large‐scale multiple‐level multiscale analyses. With practical large‐scale applications, we demonstrated the superior performance of multilayered grouping over the coarse‐grained method. Notably, we show that the unique data transfer rates of the symmetric multiprocessor cluster system can lead to the seemingly ‘super‐linear speedup’ and that there appears to exist the optimal number of subgroups of three‐tiered multiscale analysis. Copyright © 2014 John Wiley & Sons, Ltd.
20.
Parallel schemes for combined boundary element and finite element analysis are improved in this paper. The conjugate gradient method (CG method) is introduced for updating the unknowns on the combination boundary in place of the Schwarz method previously used, which makes it possible to determine automatically a parameter required in the update iteration. Further, the condensation method is employed for higher solution efficiency by reducing the number of degrees of freedom in the equations for both the finite element and boundary element domains. Comparison of the present algorithm with the previous one in some numerical examples shows marked improvement in computational efficiency.
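The point about the automatically determined parameter can be seen in a textbook conjugate gradient iteration, where the step length `alpha` is computed from the current residual and search direction rather than chosen by the user. The sketch below is a generic CG solver for a small symmetric positive definite system, not the paper's coupled BEM/FEM scheme; the function names and the 2×2 test system are assumptions for illustration.

```python
from typing import List, Optional

Vector = List[float]
Matrix = List[List[float]]


def dot(x: Vector, y: Vector) -> float:
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


def matvec(a: Matrix, x: Vector) -> Vector:
    """Dense matrix-vector product."""
    return [dot(row, x) for row in a]


def conjugate_gradient(
    a: Matrix, b: Vector, tol: float = 1e-12, max_iter: Optional[int] = None
) -> Vector:
    """Solve a x = b for symmetric positive definite `a`, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = list(b)  # residual b - a x for the zero initial guess
    p = list(r)  # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter if max_iter is not None else 10 * n):
        if rs_old ** 0.5 < tol:
            break
        ap = matvec(a, p)
        alpha = rs_old / dot(p, ap)  # step length, no user-chosen parameter
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old  # keeps successive directions a-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Illustrative system; the exact solution is (1/11, 7/11).
solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(solution)
```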