Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
This study considers the scaling of three algebraic multigrid aggregation schemes for a finite element discretization of a drift–diffusion system, specifically the drift–diffusion model for semiconductor devices. The approach is more general and can be applied to other systems of partial differential equations. After discretization on unstructured meshes, a fully coupled multigrid preconditioned Newton–Krylov solution method is employed. The choice of aggregation scheme for generating coarser levels has a significant impact on the performance and scalability of the multigrid preconditioner. For the test cases considered, the uncoupled aggregation scheme, which aggregates/combines the immediate neighbors, followed by repartitioning and data redistribution for the coarser level matrices on a subset of the Message Passing Interface (MPI) processes, outperformed the two other approaches, including the baseline aggressive coarsening scheme. Scaling results are presented up to 147,456 cores on an IBM Blue Gene/P platform. A comparison of the scaling of a multigrid V‐cycle and W‐cycle is provided. Results for 65,536 cores demonstrate that a 3.5× reduction in time between the uncoupled aggregation and baseline aggressive coarsening schemes can be obtained by significantly reducing the iteration count due to the increased number of multigrid levels and the generation of better quality aggregates. Copyright © 2012 John Wiley & Sons, Ltd.

2.
This study investigates algebraic multilevel domain decomposition preconditioners of the Schwarz type for solving linear systems associated with Newton–Krylov methods. The key component of the preconditioner is a coarse approximation based on algebraic multigrid ideas to approximate the global behaviour of the linear system. The algebraic multilevel preconditioner is based on an aggressive coarsening graph partitioning of the non‐zero block structure of the Jacobian matrix. The scalability of the preconditioner is presented as well as comparisons with a two‐level Schwarz preconditioner using a geometric coarse grid operator. These comparisons are obtained on large‐scale distributed‐memory parallel machines for systems arising from incompressible flow and transport using a stabilized finite element formulation. The results demonstrate the influence of the smoothers and coarse level solvers for a set of 3D example problems. For preconditioners with more than one level, careful attention needs to be given to the balance of robustness and convergence rate for the smoothers and the cost of applying these methods. For properly chosen parameters, the two‐ and three‐level preconditioners are demonstrated to be scalable to 1024 processors. Copyright © 2006 John Wiley & Sons, Ltd.

3.
Large‐scale systems of nonlinear equations appear in many applications. In various applications, the solution of the nonlinear equations should also lie in a certain interval. A typical application is a discretized system of reaction–diffusion equations. It is well known that concentrations of chemical species should be positive; otherwise, the solution is not physical and, in general, blow‐up occurs. Recently, a projected Newton method has been developed, which can be used to solve this type of problem. A drawback is that the projected Newton method is not globally convergent. This motivates us to develop a new feasible projected Newton–Krylov algorithm for solving a constrained system of nonlinear equations. Combined with a projected gradient direction, our feasible projected Newton–Krylov algorithm circumvents the non‐descent drawback of search directions which appear in the classical projected Newton methods. Global and local superlinear convergence of our approach is established under some standard assumptions. Numerical experiments are used to illustrate that the new projected Newton method is globally convergent and is a significant complement to the Newton–Krylov algorithms known in the literature. © 2016 The Authors. International Journal for Numerical Methods in Engineering Published by John Wiley & Sons Ltd.

4.
Several engineering applications give rise quite naturally to linearized FE systems of equations possessing a multi‐level structure. An example is provided by geomechanical models of layered and faulted geological formations. For such problems the use of a multi‐level incomplete factorization (MIF) as a preconditioner for Krylov subspace methods can prove a robust and efficient solution accelerator, allowing for a fine tuning of the fill‐in degree with a significant improvement in both the solver performance and the memory consumption. The present paper develops two novel MIF variants for the solution of multi‐level symmetric positive definite systems. Two correction algorithms are proposed with the aim of preserving the positive definiteness of the preconditioner, thus avoiding possible breakdowns of the preconditioned conjugate gradient solver. The MIF variants are tested in the solution of both a single system and a long‐term quasi‐static simulation dealing with a multi‐level geomechanical application. The numerical results show that MIF typically outperforms by up to a factor of 3 a more traditional algebraic preconditioner such as an incomplete Cholesky factorization with partial fill‐in. The advantage is emphasized in a long‐term simulation where the fine fill‐in tuning allowed for by MIF yields a significant improvement for the computer memory requirement as well. Copyright © 2009 John Wiley & Sons, Ltd.

5.
The flow‐condition‐based interpolation (FCBI) finite element approach is studied in the solution of advection–diffusion problems. Two FCBI procedures are developed and tested with the original FCBI method: in the first scheme, a general solution of the advection–diffusion equation is embedded into the interpolation, and in the second scheme, the link‐cutting bubbles approach is used in the interpolation. In both procedures, as in the original FCBI method, no artificial parameters are included to reach stability for high Péclet number flows. The procedures have been implemented for two‐dimensional analysis and the results of some test problems are presented. These results indicate good stability and accuracy characteristics and the potential of the FCBI solution approach. Copyright © 2005 John Wiley & Sons, Ltd.

6.
In this paper, we develop a block preconditioner for Jacobian‐free global–local multiscale methods, in which the explicit computation of the Jacobian may be circumvented at the macroscale by using a Newton–Krylov process. Effective preconditioning is necessary for the Krylov subspace iterations (e.g. GMRES) to enhance computational efficiency. This is, however, challenging since no explicit information regarding the Jacobian matrix is available. The block preconditioning technique developed in this paper circumvents this problem by effectively deflating the spectrum of the Jacobian matrix at the current Newton step using information about only the Krylov subspaces corresponding to the Jacobian matrices in the previous Newton steps and their representations on those subspaces. This approach is optimal and results in exponential convergence of the GMRES iterations within each Newton step, thus minimizing expensive microscale computations without requiring explicit Jacobian formation in any step. In terms of both computational cost and storage requirements, the action of a single block of the preconditioner per GMRES step scales linearly as the number of degrees of freedom of the macroscale problem as well as the dimension of the invariant subspace of the preconditioned Jacobian matrix. Copyright © 2011 John Wiley & Sons, Ltd.
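
The Jacobian‐free idea in abstract 6 rests on the fact that a Krylov solver such as GMRES needs only Jacobian–vector products, which can be approximated by a finite difference of the residual. The following is a minimal sketch of that product alone, not the block preconditioner of the paper; the names `residual` and `jacobian_vector_product`, the step size `eps`, and the small 2×2 test system are hypothetical choices made for illustration.

```python
def jacobian_vector_product(residual, u, v, eps=1e-7):
    """Approximate J(u) @ v by a forward difference of the residual.

    No Jacobian matrix is formed; only two residual evaluations are used.
    """
    u_shifted = [ui + eps * vi for ui, vi in zip(u, v)]
    r_shifted = residual(u_shifted)
    r_base = residual(u)
    return [(a - b) / eps for a, b in zip(r_shifted, r_base)]


def residual(u):
    """Hypothetical 2x2 nonlinear residual, used only for illustration."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 2 - 5.0]


# The exact Jacobian at u = (1, 2) is [[2, 1], [1, 4]],
# so the exact product with v = (1, -1) is (1, -3).
approx = jacobian_vector_product(residual, [1.0, 2.0], [1.0, -1.0])
```

In a full Newton–Krylov solver this product would be handed to the Krylov method in place of an assembled matrix; the choice of `eps` trades truncation error against round‐off and is problem dependent.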

7.
In this work, we consider the local discontinuous Galerkin (LDG) method applied to second‐order elliptic problems arising in the modeling of single‐phase flows in porous media. It has been recently proven that the spectral condition number of the stiffness matrix exhibits an asymptotic behavior of $\mathcal{O}(h^{-2})$ on structured and unstructured meshes, where $h$ is the mesh size. Thus, efficient preconditioners are mandatory. We present a semi‐algebraic multilevel preconditioner for the LDG method using local Lagrange‐type interpolatory basis functions. We show, numerically, that its performance does not degrade, or at least the number of iterations increases very slowly, as the number of unknowns increases. The preconditioner is tested on problems with high jumps in the coefficients, which is the typical scenario of problems arising in porous media. Copyright © 2007 John Wiley & Sons, Ltd.

8.
We present a scheme for solving two‐dimensional semilinear reaction–diffusion equations using an expanded mixed finite element method. To linearize the mixed‐method equations, we use a two‐grid algorithm based on the Newton iteration method. The solution of a non‐linear system on the fine space is reduced to the solution of two small (one linear and one non‐linear) systems on the coarse space and a linear system on the fine space. It is shown that the coarse grid can be much coarser than the fine grid and achieve asymptotically optimal approximation as long as the mesh sizes satisfy $H = O(h^{1/3})$. As a result, solving such a large class of non‐linear equations will not be much more difficult than solving one single linearized equation. Copyright © 2003 John Wiley & Sons, Ltd.
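
To see what the mesh condition in abstract 8 permits, a rough worked example follows. It assumes a uniform mesh of the unit square and takes the constant hidden in the O‐notation to be 1; both are illustrative assumptions, not values from the paper.

```latex
h = 10^{-3} \;\Rightarrow\; H \approx h^{1/3} = 10^{-1},
\qquad
\frac{\text{fine cells}}{\text{coarse cells}}
  \approx \frac{h^{-2}}{H^{-2}}
  = \frac{10^{6}}{10^{2}}
  = 10^{4}.
```

Under these assumptions the non‐linear solve would take place on a grid with roughly four orders of magnitude fewer cells than the fine grid.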

9.
We consider the performance of sparse linear solvers for problems that arise from thermo‐mechanical applications. Such problems have been solved using sparse direct schemes that enable robust solution at the expense of memory requirements that grow non‐linearly with the dimension of the coefficient matrix. In this paper, we consider a class of preconditioned iterative solvers as a limited‐memory alternative to direct solution schemes. However, such preconditioned iterative solvers typically exhibit complex trade‐offs between reliability and performance. We therefore characterize such trade‐offs for systems from thermo‐mechanical problems by considering several preconditioning schemes including multilevel methods and those based on sparse approximate inversion and incomplete matrix factorization. We provide an analysis of computational costs and memory requirements for model thermo‐mechanical problems, indicating that certain incomplete factorization schemes can achieve good performance. We also provide empirical evaluations that corroborate our analysis and indicate the relative effectiveness of different solution schemes. Our results indicate that our drop‐threshold incomplete Cholesky preconditioning is more robust, efficient and flexible than other popular preconditioning schemes. In addition, we propose preconditioner reuse to amortize preconditioner construction cost over a sequence of linear systems that arise from non‐linear solutions in a plastic regime. Copyright © 2007 John Wiley & Sons, Ltd.

10.
This work is focused on the Newton‐Krylov technique for computing the steady cyclic states of evolution problems in nonlinear mechanics with space‐time periodicity conditions. This kind of problem arises, for instance, in the modeling of a rolling tire with a periodic tread pattern, where the cyclic state satisfies a “rolling” periodicity condition, including shifts both in time and space. The Newton‐Krylov method is a combination of a Newton nonlinear solver with a Krylov linear solver, which seeks the initial state that yields the space‐time periodic solution. The convergence of the Krylov iterations is proved to hold in the presence of an adequate preconditioner. After preconditioning, the Newton‐Krylov method can also be considered as an observer‐controller method, correcting the transient solution of the initial value problem after each period. Using information stored while computing the residual, the Krylov solver computation time becomes negligible with respect to the residual computation time. The method has been analyzed and tested on academic applications and compared with the standard evolution (fixed point) method. Finally, it has been implemented into the Michelin industrial code and applied to a full 3D rolling tire model.

11.
We develop a parallel fully implicit domain decomposition algorithm for solving optimization problems constrained by time‐dependent nonlinear partial differential equations. In particular, we study the boundary control of unsteady incompressible Navier–Stokes equations. After an implicit discretization in time, a fully coupled sparse nonlinear optimization problem needs to be solved at each time step. The class of full space Lagrange–Newton–Krylov–Schwarz algorithms is used to solve the sequence of optimization problems. Among optimization algorithms, the fully implicit full space approach is considered to be the easiest to formulate and the hardest to solve. We show that Lagrange–Newton–Krylov–Schwarz, with a one‐level restricted additive Schwarz preconditioner, is an efficient class of methods for solving these hard problems. To demonstrate the scalability and robustness of the algorithm, we consider several problems with a wide range of Reynolds numbers and time step sizes, and we present numerical results for large‐scale calculations involving several million unknowns obtained on machines with more than 1000 processors. Copyright © 2012 John Wiley & Sons, Ltd.

12.
The recent advances in microarchitectural bone imaging disclose the possibility to assess both the apparent density and the trabecular microstructure of intact bones in a single measurement. Coupling these imaging possibilities with microstructural finite element (µFE) analysis offers a powerful tool to improve bone stiffness and strength assessment for individual fracture risk prediction. Many elements are needed to accurately represent the intricate microarchitectural structure of bone; hence, the resulting µFE models possess a very large number of degrees of freedom. In order to be solved quickly and reliably on state‐of‐the‐art parallel computers, the µFE analyses require advanced solution techniques. In this paper, we investigate the solution of the resulting systems of linear equations by the conjugate gradient algorithm, preconditioned by aggregation‐based multigrid methods. We introduce a variant of the preconditioner that does not need assembling the system matrix but uses element‐by‐element techniques to build the multilevel hierarchy. The preconditioner exploits the voxel approach that is common in bone structure analysis, and it has modest memory requirements while being robust and scalable. Using the proposed methods, we have solved in 12 min a model of trabecular bone composed of 247 734 272 elements, yielding a matrix with 1 178 736 360 rows, using 1024 CRAY XT3 processors. The ability to solve, for the first time, large biomedical problems with over 1 billion degrees of freedom on a routine basis will help us improve our understanding of the influence of densitometric, morphological, and loading factors in the etiology of osteoporotic fractures such as commonly experienced at the hip, spine, and wrist. Copyright © 2007 John Wiley & Sons, Ltd.

13.
A two‐level nonoverlapping Schwarz algorithm is developed for the Stokes problem. The main feature of the algorithm is that a mixed problem with both velocity and pressure unknowns is solved with a balancing domain decomposition by constraints (BDDC)‐type preconditioner, which consists of solving local Stokes problems and one global coarse problem related to only primal velocity unknowns. Our preconditioner allows the use of a smaller set of primal velocity unknowns than other BDDC preconditioners without much concern for certain flux conditions on the subdomain boundaries and the inf–sup stability of the coarse problem. In the two‐dimensional case, velocity unknowns at subdomain corners are selected as the primal unknowns. In addition to them, averages of each velocity component across common faces are employed as the primal unknowns for the three‐dimensional case. By using its close connection to the dual–primal finite element tearing and interconnecting (FETI‐DP) algorithm [SIAM J Sci Comput 2010; 32: 3301–3322; SIAM J Numer Anal 2010; 47: 4142–4162], it is shown that the resulting matrix of our algorithm has the same eigenvalues as that of the FETI‐DP algorithm except zero and one. The maximum eigenvalue is determined by H/h, the number of elements across each subdomain, and the minimum eigenvalue is bounded below by a constant, which does not depend on any mesh parameters. Convergence of the method is analyzed and numerical results are included. Copyright © 2011 John Wiley & Sons, Ltd.

14.
We construct finite volume schemes of very high order of accuracy in space and time for solving the nonlinear Richards equation (RE). The general scheme is based on a three‐stage predictor–corrector procedure. First, a high‐order weighted essentially non‐oscillatory (WENO) reconstruction procedure is applied to the cell averages at the current time level to guarantee monotonicity in the presence of steep gradients. Second, the temporal evolution of the WENO reconstruction polynomials is computed in a predictor stage by using a global weak form of the governing equations. A global space–time DG FEM is used to obtain a scheme without the parabolic time‐step restriction caused by the presence of the diffusion term in the RE. The resulting nonlinear algebraic system is solved by a Newton–Krylov method, where the generalized minimal residual (GMRES) algorithm of Saad and Schultz is used to solve the linear subsystems. Finally, as a third step, the cell averages of the finite volume method are updated using a one‐step scheme, on the basis of the solution calculated previously in the space–time predictor stage. Our scheme is validated against analytical, experimental, and other numerical reference solutions in four test cases. A numerical convergence study allows us to show that the proposed novel scheme is high‐order accurate in space and time. Copyright © 2011 John Wiley & Sons, Ltd.

15.
We consider the approximate solution of self-adjoint elliptic problems in three space dimensions by piecewise linear finite elements with respect to a highly non-uniform tetrahedral mesh which is generated adaptively. The arising linear systems are solved iteratively by the conjugate gradient method provided with a multilevel preconditioner. Here, the accuracy of the iterative solution is coupled with the discretization error. As the performance of hierarchical bases preconditioners deteriorates in three space dimensions, the BPX preconditioner is used, taking special care of an efficient implementation. Reliable a posteriori estimates for the discretization error are derived from a local comparison with the approximation resulting from piecewise quadratic elements. To illustrate the theoretical results, we consider a familiar model problem involving reentrant corners and a real-life problem arising from hyperthermia, a recent clinical method for cancer therapy.

16.
We consider the efficient numerical solution of the three‐dimensional wave equation with Neumann boundary conditions via time‐domain boundary integral equations. A space‐time Galerkin method with $C^\infty$‐smooth, compactly supported basis functions in time and piecewise polynomial basis functions in space is employed. We discuss the structure of the system matrix and its efficient parallel assembly. Different preconditioning strategies for the solution of the arising systems with block Hessenberg matrices are proposed and investigated numerically. Furthermore, a C++ implementation parallelized by OpenMP and MPI in shared and distributed memory, respectively, is presented. The code is part of the boundary element library BEM4I. Results of numerical experiments including convergence and scalability tests up to a thousand cores on a cluster are provided. The presented implementation shows good parallel scalability of the system matrix assembly. Moreover, the proposed algebraic preconditioner in combination with the FGMRES solver leads to a significant reduction of the computational time. Copyright © 2015 John Wiley & Sons, Ltd.

17.
We study theoretically the propagation and distribution of electron spin density in semiconductors within the drift–diffusion model in an external electric field. From the solution of the spin drift–diffusion equation, we derive the expressions for spin currents in the down-stream (DS) and up-stream (US) directions. We find that drift and diffusion currents contribute to the spin current and there is an electric field, called the drift–diffusion crossover field, where the drift and diffusion mechanisms contribute equally to the spin current in the DS direction, and that the spin current in the US direction vanishes when the electric field is very large. We calculate the drift–diffusion crossover field and show that the intrinsic spin diffusion length in a semiconductor can be determined directly from it if the temperature, electron density and both the temperature and electron density, respectively, are known for nondegenerate, highly degenerate and degenerate systems. The results will be useful in obtaining transport properties of the electron’s spin in semiconductors, the essential information for spintronic technology.

18.
High-dimensional two-sided space fractional diffusion equations with variable diffusion coefficients are discussed. The problems can be solved by an implicit finite difference scheme that is proven to be uniquely solvable, unconditionally stable and first-order convergent in the infinity norm. A nonsingular multilevel circulant preconditioner is proposed to accelerate the convergence rate of the Krylov subspace linear system solver efficiently. The preconditioned matrix for fast convergence is a sum of the identity matrix, a matrix with small norm, and a matrix with low rank under certain conditions. Moreover, the preconditioner is practical, with an O(N log N) operation cost and O(N) memory requirement. Illustrative numerical examples are also presented.
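
The O(N log N) cost quoted in abstract 18 is typical of circulant preconditioners, because a circulant matrix is diagonalized by the discrete Fourier transform. The sketch below shows this for a single‐level circulant matrix only, not the multilevel preconditioner of the paper; the function names, the first column `c` and the test vector are hypothetical, and the length is assumed to be a power of two.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    sign = 1 if inverse else -1
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(a):
    """Inverse FFT, including the 1/n normalization."""
    n = len(a)
    return [z / n for z in fft(a, inverse=True)]


def circulant_solve(c, b):
    """Solve C x = b where C[i][j] = c[(i - j) % n], using only FFTs.

    The eigenvalues of C are fft(c); they must all be nonzero.
    """
    eigenvalues = fft(c)
    rhs_hat = fft(b)
    return [z.real for z in ifft([r / lam for r, lam in zip(rhs_hat, eigenvalues)])]


# Hypothetical example: first column of a small circulant matrix.
c = [4.0, -1.0, 0.0, -1.0]
x_true = [1.0, 2.0, 3.0, 4.0]
n = len(c)
# Dense product b = C @ x_true, formed here only to check the FFT-based solve.
b = [sum(c[(i - j) % n] * x_true[j] for j in range(n)) for i in range(n)]
x = circulant_solve(c, b)
```

Inside a Krylov iteration, `circulant_solve` would play the role of applying the inverse of the preconditioner to a residual vector; only the first column `c` needs to be stored, which is consistent with the O(N) memory figure.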

19.
Despite the rapid increase of efficiency, perovskite solar cells (PSCs) still face some challenges, one of which is the current–voltage hysteresis. Herein, it is reported that yttrium‐doped tin dioxide (Y‐SnO2) electron selective layer (ESL) synthesized by an in situ hydrothermal growth process at 95 °C can significantly reduce the hysteresis and improve the performance of PSCs. Comparison studies reveal two main effects of Y doping of SnO2 ESLs: (1) it promotes the formation of well‐aligned and more homogeneous distribution of SnO2 nanosheet arrays (NSAs), which allows better perovskite infiltration, better contacts of perovskite with SnO2 nanosheets, and improves electron transfer from perovskite to ESL; (2) it enlarges the band gap and upshifts the band energy levels, resulting in better energy level alignment with perovskite and reduced charge recombination at NSA/perovskite interfaces. As a result, PSCs using Y‐SnO2 NSA ESLs exhibit much less hysteresis and better performance compared with the cells using pristine SnO2 NSA ESLs. The champion cell using Y‐SnO2 NSA ESL achieves a photovoltaic conversion efficiency of 17.29% (16.97%) when measured under reverse (forward) voltage scanning and a steady‐state efficiency of 16.25%. The results suggest that low‐temperature hydrothermal‐synthesized Y‐SnO2 NSA is a promising ESL for fabricating efficient and hysteresis‐less PSC.

20.
A two‐level domain decomposition method is introduced for general shape optimization problems constrained by the incompressible Navier–Stokes equations. The optimization problem is first discretized with a finite element method on an unstructured moving mesh that is implicitly defined without assuming that the computational domain is known and then solved by some one‐shot Lagrange–Newton–Krylov–Schwarz algorithms. In this approach, the shape of the domain, its corresponding finite element mesh, the flow fields and their corresponding Lagrange multipliers are all obtained computationally in a single solve of a nonlinear system of equations. Highly scalable parallel algorithms are absolutely necessary to solve such an expensive system. The one‐level domain decomposition method works reasonably well when the number of processors is not large. Aiming for machines with a large number of processors and robust nonlinear convergence, we introduce a two‐level inexact Newton method with a hybrid two‐level overlapping Schwarz preconditioner. As applications, we consider the shape optimization of a cannula problem and an artery bypass problem in 2D. Numerical experiments show that our algorithm performs well on a supercomputer with over 1000 processors for problems with millions of unknowns. Copyright © 2014 John Wiley & Sons, Ltd.
