Found 20 similar documents.
1.
This paper describes a massively parallel code for a state-of-the-art thermal lattice–Boltzmann method. Our code has been carefully optimized for performance on one GPU and for good scaling behavior on a large number of GPUs. Versions of this code have already been used for large-scale studies of convective turbulence. GPUs are becoming increasingly popular in HPC applications, as they are able to deliver higher performance than traditional processors. Writing efficient programs for large clusters is not an easy task, as codes must adapt to increasingly parallel architectures and the overheads of node-to-node communications must be properly handled. We describe the structure of our code, discussing several key design choices that were guided by theoretical models of performance and experimental benchmarks. We present an extensive set of performance measurements and identify the corresponding main bottlenecks; finally, we compare the results of our GPU code with those measured on other currently available high-performance processors. The outcome is a production-grade code able to deliver a sustained performance of several tens of Tflops, as well as a design and optimization methodology that can be used for the development of other high-performance applications for computational physics.
2.
《Mathematics and computers in simulation》2007,73(2-6):113-116
A two-dimensional finite-difference lattice Boltzmann model for liquid–vapor systems is introduced and analyzed. Two different numerical schemes are used and compared in recovering equilibrium density and velocity profiles for a planar interface. We show that flux limiter techniques can be conveniently adopted to minimize spurious numerical effects and improve the numerical accuracy of the model.
3.
In this paper, a lattice Boltzmann model for the Korteweg–de Vries (KdV) equation with higher-order accuracy of truncation error is presented by using the higher-order moment method. In contrast to previous lattice Boltzmann models, our method offers considerable flexibility in selecting the equilibrium distribution function. The higher-order moment method is based on a so-called series of lattice Boltzmann equations obtained by using a multi-scale technique and the Chapman–Enskog expansion. We can also control the stability of the scheme by modulating some special moments to design the dispersion term and the dissipation term. The numerical example shows that the higher-order moment method can be used to raise the accuracy of the truncation error of the lattice Boltzmann scheme.
4.
Suman Sinha 《Computer Physics Communications》2012,183(12):2616-2621
We present a study of the performance of the Wang–Landau algorithm in a lattice model of liquid crystals, which is a continuous lattice spin model. We propose a novel spin update scheme for continuous lattice spin models. The proposed scheme reduces the autocorrelation time of the simulation and results in faster convergence.
5.
Kireeva A. E. Sabelfeld K. K. Gribov E. N. Maltseva N. V. 《The Journal of supercomputing》2019,75(12):7790-7798
The paper presents a three-dimensional cellular automaton model of electrochemical oxidation of the carbon. The sample of the electro-conductive carbon black...
6.
Performance modeling and analysis of heterogeneous lattice Boltzmann simulations on CPU–GPU clusters
Computational fluid dynamics simulations are in general very compute intensive. Only parallel simulations on modern supercomputers can satisfy the computational demands of complex simulation tasks. Faced with these demands, GPUs offer high performance, as they provide high floating-point performance and high memory-to-processor-chip bandwidth. To successfully utilize GPU clusters for the daily business of a large community, usable software frameworks must be established on these clusters. The development of such software frameworks is only feasible with maintainable software designs that consider performance as a design objective right from the start. For this work we extend the software design concepts to achieve more efficient and highly scalable multi-GPU parallelization within our software framework waLBerla for multi-physics simulations centered around the lattice Boltzmann method. Our software designs now also support a pure-MPI and a hybrid parallelization approach capable of heterogeneous simulations using CPUs and GPUs in parallel. For the first time, weak and strong scaling performance results obtained on the Tsubame 2.0 cluster for more than 1000 GPUs are presented using waLBerla. With the help of a new communication model, the parallel efficiency of our implementation is investigated and analyzed in a detailed and structured performance analysis. The suitability of the waLBerla framework for production runs on large GPU clusters is demonstrated. As one possible application, we show results of strong scaling experiments for flows through a porous medium.
7.
In general, explicit numerical schemes are only conditionally stable. A particularity of lattice Boltzmann multiple-relaxation-time (MRT) schemes is the presence of free (“kinetic”) relaxation parameters. They do not appear in the transport coefficients of the modelled second-order (macroscopic) equations but they have an impact on the effective accuracy and stability of the algorithm. The simplest uniform choice (the well known BGK/SRT model) is often inadequate, and therefore a compromise in the complexity of the model is sought. For this purpose, the von Neumann stability analysis is performed for the d1Q3 two-relaxation-time (TRT) advection–diffusion model. The extended optimal (EOTRT) model, which relates the two collision times such that the most stable scheme is set by a suitable choice of the equilibrium parameters, equal for any Peclet number, is then developed. This extends the very recently derived optimal subclass (OTRT) to larger combinations of “physical” and “kinetic” collision rates. Next, we provide the necessary and/or sufficient stability limits on the EOTRT subclass for a wide range of velocity sets, with and without numerical diffusion, and delineate the interesting choices of free equilibrium weights for the d2Q9 and d3Q15 models. The BGK/SRT model is without advanced advection properties; we prove (for minimal stencil schemes d1Q3, d2Q5 and d3Q7) that the non-negativity of the equilibrium distribution is necessary for its stability in the advection-dominated limit. Beyond the EOTRT and BGK/SRT subclasses of the TRT model, blind choices of the “ghost” collision number may result in quite unstable schemes, even for positive equilibrium. However, we find that the d1Q3 stability curves govern the advection properties of the multi-dimensional models and a fuller picture of the TRT stability properties begins to emerge.
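To make the BGK/SRT baseline named in the abstract above concrete, here is a minimal sketch of a d1Q3 advection–diffusion update with a single relaxation time. It is not the authors' TRT or EOTRT scheme. The weights (2/3, 1/6, 1/6), the linear equilibrium, periodic boundaries, and the names `equilibrium` and `step` are all assumptions chosen for illustration.

```python
# Minimal d1Q3 BGK/SRT advection-diffusion sketch (illustrative assumptions only).
C = (0, 1, -1)             # lattice velocities: rest, right, left
W = (2 / 3, 1 / 6, 1 / 6)  # assumed standard d1Q3 weights
CS2 = 1 / 3                # squared lattice sound speed for these weights


def equilibrium(rho, u):
    """Linear equilibrium: zeroth moment is rho, first moment is rho * u."""
    return [w * rho * (1.0 + c * u / CS2) for w, c in zip(W, C)]


def step(f, u, tau):
    """One collide-and-stream update on a periodic 1D lattice.

    f is a list of three lists (one per velocity); u is a constant
    advection velocity; tau is the single BGK relaxation time, giving a
    diffusion coefficient D = CS2 * (tau - 0.5) in lattice units.
    """
    n = len(f[0])
    new = [[0.0] * n for _ in C]
    for x in range(n):
        rho = f[0][x] + f[1][x] + f[2][x]
        feq = equilibrium(rho, u)
        for i, c in enumerate(C):
            post = f[i][x] - (f[i][x] - feq[i]) / tau  # BGK collision
            new[i][(x + c) % n] = post                 # periodic streaming
    return new
```

With these assumed weights the moving populations of the equilibrium stay non-negative only for |u| ≤ 1/3, which is one simple way to see the kind of non-negativity constraint the abstract refers to.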
8.
《International Journal of Computer Mathematics》2012,89(12):1678-1688
Departing from a method to approximate the solutions of a two-dimensional generalization of the well-known Fisher's equation from population dynamics, we extend this computational technique to calculate the solutions of a FitzHugh–Nagumo model and derive conditions under which its positive and bounded analytic solutions are estimated consistently by positive and bounded numerical approximations. The constraints are relatively flexible, and they are provided exclusively in terms of the model coefficients and the computational parameters. The proofs are established with the help of the theory of M-matrices, using the facts that such matrices are non-singular, and that the entries of their inverses are positive numbers. Some numerical experiments are performed in order to show that our method is capable of preserving the positivity and the boundedness of the numerical solutions. The simulations evince a good agreement between the numerical estimations and the corresponding exact solutions derived in this work.
9.
This paper is concerned with the problems of dissipative stability analysis and control of the two-dimensional (2-D) Fornasini–Marchesini local state-space (FM LSS) model. Based on the characteristics of the system model, a novel definition of 2-D FM LSS (Q, S, R)-α-dissipativity is given first, and then a sufficient condition in terms of a linear matrix inequality (LMI) is proposed to guarantee the asymptotic stability and 2-D (Q, S, R)-α-dissipativity of the systems. As special cases, 2-D passivity performance and 2-D H∞ performance are also discussed. Furthermore, using this dissipative stability condition and the projection lemma, the 2-D (Q, S, R)-α-dissipative state-feedback control problem is solved as well. Finally, a numerical example is given to illustrate the effectiveness of the proposed method.
10.
The disrupting effect of quantum memory on the dynamics of a spatial quantum formulation of the iterated prisoner’s dilemma game with variable entangling is studied. The game is played within a cellular automata framework, i.e., with local and synchronous interactions. The main findings of this work refer to the shrinking effect of memory on the disruption induced by noise.
11.
The immersed boundary method is a practical and effective method for fluid–structure interaction problems and has been applied to a variety of problems. Most of the time-stepping schemes used in the method are explicit and therefore suffer from limited stability and a restricted time step. We propose a lattice Boltzmann based implicit immersed boundary method in which the immersed boundary force is computed at the unknown configuration of the structure at each time step. The fully nonlinear algebraic system resulting from the discretization is solved by an inexact Newton–Krylov method in a Jacobian-free manner. The test problem of a flexible filament in a flowing viscous fluid is considered. Numerical results show that the proposed implicit immersed boundary method is much more stable with larger time steps and significantly outperforms the explicit version in terms of computational cost.
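The "Jacobian-free" ingredient mentioned in the abstract above can be illustrated with a short sketch: a Krylov solver only needs products of the Jacobian with a vector, and these can be approximated by a forward difference of the residual. The function name `jacobian_vector_product`, the step-size rule, and the toy residual in the usage note are assumptions for illustration, not the authors' implementation.

```python
import math


def jacobian_vector_product(residual, u, v, eps=1e-7):
    """Approximate J(u) @ v as (F(u + h*v) - F(u)) / h without forming J.

    residual: callable mapping a list of floats to a list of floats (F).
    u: current Newton iterate; v: direction supplied by the Krylov solver.
    The step h is scaled by the sizes of u and v (an assumed heuristic).
    """
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0] * len(u)  # J @ 0 is exactly zero
    norm_u = math.sqrt(sum(x * x for x in u))
    h = eps * (1.0 + norm_u) / norm_v
    f0 = residual(u)
    f1 = residual([ui + h * vi for ui, vi in zip(u, v)])
    return [(a - b) / h for a, b in zip(f1, f0)]
```

For a hypothetical residual F(u) = (u0², u0·u1), the exact Jacobian at u = (1, 2) is [[2, 0], [2, 1]], so the product with v = (1, 1) should be close to (2, 3).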
12.
In this paper, the dynamical behaviors of a two-dimensional simplified Hodgkin–Huxley (H–H) model exposed to external electric fields are investigated through qualitative analysis and numerical simulation. A necessary and sufficient condition is proposed for the existence of the Hopf bifurcation. Saddle-node bifurcations and canards of the simplified model with coefficients of different linear forms are also discussed. Finally, the bifurcation curves with coefficients of different linear forms are shown. The numerical results demonstrate that some linear forms can retain the bifurcation characteristics of the original model, which is of great use in simplifying the H–H model for real-world applications.
13.
Multicomponent phase transition kinetics in cereal foam—Part I: developing a lattice Boltzmann model
Foam thermo-physics is a significant point of interest in current research across a broad range of applications, ranging from materials science, geology, biotechnology, and chemical and ceramic processing to food science. The latter involves the challenge of consistent quality in combination with high-temperature processing. Thermal treatment strongly influences foam structure and stability, and also drives chemical reactions or physical processes such as phase transitions. From a process engineering point of view, such reactions can be used for process optimization. In cereal foam, heat transfer is suggested to depend, besides heat conduction in the lamella, on evaporation–condensation processes inside the foam bubbles. Because the physical processes within complex foam micro-structures occur at the meso-scale, the lattice Boltzmann method is well suited to numerical investigations at the considered length scale. Thus, the objective of this study is the development of a lattice Boltzmann model covering heat and mass diffusion in combination with phase transition processes.
14.
Ramón Alonso-Sanz 《Quantum Information Processing》2017,16(6):161
The disrupting effect of quantum noise on the dynamics of a spatial quantum formulation of the iterated prisoner’s dilemma game with variable entangling is studied in this work. The game is played in the cellular automata manner, i.e., with local and synchronous interaction. It is concluded in this article that quantum noise induces in fair games the need for higher entanglement in order to make possible the emergence of the strategy pair (Q, Q), which produces the same payoff of mutual cooperation. In unfair quantum versus classic player games, quantum noise delays the prevalence of the quantum player.
15.
The Journal of Supercomputing - Accurate cellular traffic prediction becomes more and more critical for efficient network resource management in the Internet of Things (IoT). However, high-accuracy...
16.
This paper proposes two viable computing strategies for distributed parallel systems: domain division with sub-domain overlapping and asynchronous communication. We have implemented a parallel computing procedure for simulating the Ti thin-film growth process of a system with 1000 × 1000 atoms by means of the Monte Carlo (MC) method. This approach greatly reduces the computation time for simulation of large-scale thin film growth under realistic deposition rates. The multi-lattice MC model of deposition comprises two basic events: deposition and surface diffusion. Since diffusion constitutes more than 90% of the total simulation time of the whole deposition process at high temperature, we concentrated on implementing a new parallel diffusion simulation that reduces communication time during simulation. Asynchronous communication and domain overlapping techniques are used to reduce the waiting time and communication time among parallel processors. The parallel algorithms we propose can simulate the thin
17.
A unified framework to derive discrete time-marching schemes for the coupling of immersed solid and elastic objects to the lattice Boltzmann method is presented. Based on operator splitting for the discrete Boltzmann equation, second-order time-accurate schemes for the immersed boundary method, viscous force coupling and external boundary force are derived. Furthermore, a modified formulation of the external boundary force is introduced that leads to a more accurate no-slip boundary condition. The derivation also reveals that the coupling methods can be cast into a unified form, and that the immersed boundary method can be interpreted as the limit of force coupling for vanishing particle mass. In practice, the ratio between fluid and particle mass determines the strength of the force transfer in the coupling. The integration schemes formally improve the accuracy of first-order algorithms that are commonly employed when coupling immersed objects to a lattice Boltzmann fluid. It is anticipated that they will also lead to superior long-time stability in simulations of complex fluids with multiple scales.
18.
《Calphad》2021
Dendritic growth is one of the most important phenomena during the solidification of alloys. However, solute redistribution at the front of the solid–liquid interface may result in a nonuniform distribution of concentration between dendrite branches. This often causes microscopic segregation and undermines the properties of materials. In order to control the solidification microstructure of Al–Li alloy, we first need to understand in depth the morphological and concentration evolution during dendrite growth. Here, the KKS (S.G. Kim, W.T. Kim, T. Suzuki) phase-field model coupled with CALPHAD data is employed. The dependences of the dendrite morphologies and growth kinetics on undercooling or initial solute concentration are qualitatively analyzed. The dendrite growth rate increases slowly when the undercooling ΔT is less than approximately 25 °C, and steeply when ΔT > 40 °C, corresponding to the transition from diffusional dendrite growth to rapid solidification. Accordingly, the obtained morphologies change from dendrite to seaweed crystal. An increase in supersaturation influences dendrite growth similarly in terms of growth rate and morphology. Moreover, through simulation of columnar dendrite growth, we find that the microscopic segregation becomes more severe with decreasing undercooling or increasing supersaturation. These results demonstrate the capability of phase-field simulation coupled with CALPHAD in modelling microstructure evolution during the solidification of alloys.
19.
Numerical simulations have been performed on the pressure-driven rarefied flow through channels with a sudden contraction–expansion of 2:1:2 using the isothermal two- and three-dimensional lattice Boltzmann method (LBM). In the LBM, a Bosanquet-type effective viscosity and a modified second-order slip boundary condition are used to account for the rarefaction effect on gas viscosity and to cover the slip and transition flow regimes, that is, a wider range of Knudsen number. First, the in-house LBM code is verified by comparing the computed pressure distribution and flow pattern with experimental ones measured by others. The verified code is then used to study the effects of the outlet Knudsen number Kn_o, driving pressure ratio P_i/P_o, and Reynolds number Re, varied in the ranges 0.001–1.0, 1.15–5.0, and 0.02–120, respectively, on the pressure distributions and flow patterns, as well as to document the differences between continuum and rarefied flows. Results are discussed in terms of the distributions of local pressure, Knudsen number, centerline velocity, and Mach number. The variations of flow patterns and vortex length with Kn_o and Re are also documented. Moreover, a critical Knudsen number Kn_oc = 0.1 is identified, below and above which the behaviors of the nonlinear pressure profile and velocity distribution and the variations of vortex length with Re upstream and downstream of the constriction are different from those of continuum flows.
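The abstract above does not state the exact form of its Bosanquet-type effective viscosity. As a hedged illustration, a commonly quoted form is ν_eff = ν₀ / (1 + a·Kn). The sketch below assumes that form, assumes a rarefaction coefficient a = 2 as a default, and assumes the usual lattice-unit relation ν = c_s²(τ − 1/2) with c_s² = 1/3. The function names are hypothetical.

```python
def effective_viscosity(nu0, kn, a=2.0):
    """Bosanquet-type correction (assumed form): nu_eff = nu0 / (1 + a * Kn).

    nu0: viscosity in the continuum limit; kn: local Knudsen number;
    a: rarefaction coefficient (a = 2 is an assumed default, not taken
    from the abstract).
    """
    if kn < 0.0:
        raise ValueError("Knudsen number must be non-negative")
    return nu0 / (1.0 + a * kn)


def relaxation_time(nu, cs2=1.0 / 3.0):
    """Lattice-unit BGK relaxation time from viscosity: nu = cs2 * (tau - 0.5)."""
    return nu / cs2 + 0.5
```

In this form the correction vanishes as Kn approaches zero, so the continuum viscosity is recovered, and the effective viscosity decreases monotonically as the flow becomes more rarefied.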
20.
This paper presents a two-staged parallel mechanism composed of a rigid platform connected in series with a compliant platform, and concentrates on its configuration and interrelation. The analysis starts by deriving the operator of a 3UPU configuration with a central strut. Configuration and displacement formulas of the compliant platform are then presented, leading to analytic equations relating the actuated angles of the operator to the position parameters of the end-effector. A numerical evaluation of the workspace of the two-staged parallel mechanism then follows.