1.

High-performance parallel computing for incompressible flow simulations

O. Byrde W. Couzy M. O. Deville M. L. Sawley 《Computational Mechanics》1999,23(2):98-107

High-performance parallel computer systems have been employed to compute a variety of three-dimensional incompressible fluid flows. Three different numerical methods have been used for the discretization of the Navier-Stokes equations, with domain decomposition techniques employed for the parallel resolution of the discretized equations. These parallel flow solvers have been applied to the numerical simulation of different flows, ranging from basic flow studies to industrial applications. The results of these studies show that high-performance parallel computing has evolved from its initial investigative phase into a mature technology that can be employed for large-scale numerical flow simulation.

2.

Temporal fringe pattern analysis with parallel computing

Ng TW Ang KT Argentini G 《Applied optics》2005,44(33):7125-7129

Temporal fringe pattern analysis is invaluable in transient phenomena studies but necessitates long processing times. Here we describe a parallel computing strategy based on the single-program multiple-data model and hyperthreading processor technology to reduce the execution time. In a two-node cluster workstation configuration we found that execution time was reduced by a factor of 1.6 when four virtual processors were used. To allow even lower execution times with an increasing number of processors, the time allocated for data transfer, data read, and waiting should be minimized. Parallel computing is found here to present a feasible approach to reduce execution times in temporal fringe pattern analysis.
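
As a rough reading of the figure reported above, a speedup of 1.6 on four virtual processors can be converted into a parallel efficiency and an experimentally determined serial fraction (the Karp-Flatt metric). The sketch below is not taken from the paper: the function names are illustrative, and treating hyperthreaded virtual processors as full processors is a simplifying assumption.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of the ideal linear speedup that was actually achieved."""
    return speedup / processors


def karp_flatt_serial_fraction(speedup: float, processors: int) -> float:
    """Experimentally determined serial fraction e = (1/S - 1/p) / (1 - 1/p)."""
    if processors < 2:
        raise ValueError("the metric needs at least two processors")
    return (1.0 / speedup - 1.0 / processors) / (1.0 - 1.0 / processors)


if __name__ == "__main__":
    # Figures quoted in the abstract: 1.6x speedup on four virtual processors.
    speedup, processors = 1.6, 4
    print(f"efficiency      = {parallel_efficiency(speedup, processors):.2f}")
    print(f"serial fraction = {karp_flatt_serial_fraction(speedup, processors):.2f}")
```

Under these assumptions the efficiency comes out at about 0.4 and the serial fraction at about 0.5, which is in line with the abstract's remark that data transfer, data read, and waiting time need to be reduced before more processors pay off.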

3.

Solution of train-tunnel entry flow using parallel computing

B. S. Holmes J. Dias S. M. Rifai J. C. Buell Z. Johan T. Sassa T. Sato 《Computational Mechanics》1999,23(2):124-129

A solution to the problem of predicting the airflow over a train entering a tunnel is presented using parallel processing and a novel moving boundary condition scheme. The moving boundary condition approach avoids some of the topological problems of traditional approaches to this problem such as ALE techniques and contact surfaces. The method is demonstrated using both incompressible and compressible flow solvers based on the GLS finite element formulation. Flow solutions are compared with experiment for a simple geometry and the method is demonstrated on an actual train geometry.

4.

Scalable optical hypercube-based interconnection network for massively parallel computing

Louri A Sung H 《Applied optics》1994,33(32):7588-7598

Two important parameters of a network for massively parallel computers are scalability and modularity. Scalability has two aspects: size and time (or generation). Size scalability refers to the property that the size of the network can be increased with nominal effect on the existing configuration. Also, the increase in size is expected to result in a linear increase in performance. Time scalability implies that the communication capabilities of a network should be large enough to support the evolution of processing elements through generations. A modular network enables the construction of a large network out of many smaller ones. The lack of these two important parameters has limited the use of certain types of interconnection networks in the area of massively parallel computers. We present a new modular optical interconnection network, called an optical multimesh hypercube (OMMH), which is both size and time scalable. The OMMH combines positive features of both the hypercube (small diameter, high connectivity, symmetry, simple routing, and fault tolerance) and the torus (constant node degree and size scalability) networks. Also presented is a three-dimensional optical implementation of the OMMH network. A basic building block of the OMMH network is a hypercube module that is constructed with free-space optics to provide compact and high-density localized hypercube connections. The OMMH network is then constructed by the connection of such basic building blocks with multiwavelength optical fibers to realize torus connections. The proposed implementation methodology is intended to exploit the advantages of both space-invariant free-space and multiwavelength fiber-based optical interconnect technologies. The analysis of the proposed implementation shows that such a network is optically feasible in terms of the physical size and the optical power budget.
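
The contrast drawn above between the two parent topologies can be illustrated with textbook definitions. The sketch below is not the OMMH addressing scheme from the paper; it only shows a plain binary hypercube, where node degree equals the dimension and grows with network size, next to a plain 2-D torus, where every node keeps four links however large the grid becomes. All function names are illustrative.

```python
def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Neighbors of a node in a dim-dimensional binary hypercube: flip one address bit."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hypercube_distance(a: int, b: int) -> int:
    """Minimum hop count between two hypercube nodes (Hamming distance of addresses)."""
    return bin(a ^ b).count("1")


def torus_neighbors(node: tuple[int, int], rows: int, cols: int) -> list[tuple[int, int]]:
    """Four wraparound neighbors of a node in a rows x cols torus (up, down, left, right)."""
    r, c = node
    return [
        ((r - 1) % rows, c),
        ((r + 1) % rows, c),
        (r, (c - 1) % cols),
        (r, (c + 1) % cols),
    ]


if __name__ == "__main__":
    # Degree of a hypercube node grows with the dimension ...
    for dim in (3, 4, 5):
        print(dim, len(hypercube_neighbors(0, dim)))
    # ... while a torus node always has four links, whatever the grid size.
    for size in (4, 8, 16):
        print(size, len(torus_neighbors((0, 0), size, size)))
```

The hypercube's diameter equals its dimension, which is where the "small diameter" property comes from; the torus trades a larger diameter for a constant node degree, so it can be enlarged without rewiring existing nodes.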

5.

Bisection for parallel computing using Ritz and Fiedler vectors

A. Kaveh H. A. Rahimi Bondarabady 《Acta Mechanica》2004,167(3-4):131-144

Summary. In this article, an efficient algorithm is developed for the decomposition of large-scale finite element models. A weighted incidence graph with N nodes is used to transform the connectivity properties of finite element meshes into those of graphs. A graph G₀ constructed in this manner is then reduced to a graph G_n of desired size by a sequence of contractions G₀ → G₁ → G₂ → … → G_n. For G₀, two pseudoperipheral nodes s₀ and t₀ are selected and two shortest route trees are expanded from these nodes. For each starting node, a vector is constructed with N entries, each entry being the shortest distance of a node n_i of G₀ from the corresponding starting node. Hence two vectors v₁ and v₂ are formed as Ritz vectors for G₀. A similar process is repeated for G_i (i = 1, 2, …, n), and the sizes of the vectors obtained are then extended to N. A Ritz matrix consisting of 2(n+1) normalized Ritz vectors each having N entries is constructed. This matrix is then used in the formation of an eigenvalue problem. The first eigenvector is calculated, and an approximate Fiedler vector is constructed for the bisection of G₀. The performance of the method is illustrated by some practical examples.
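
The final step described above, bisecting a graph by its Fiedler vector, can be sketched on a small example. This is not the authors' Ritz-vector approximation: the sketch computes the Fiedler vector of the graph Laplacian directly with a shifted power iteration, which is only practical for small graphs. The function names, the iteration count, and the 6-node path graph are illustrative assumptions.

```python
def fiedler_vector(adjacency, iterations=2000):
    """Approximate the Fiedler vector of L = D - A by shifted power iteration.

    Assumes a connected, undirected graph given as a dense adjacency matrix.
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    shift = 2.0 * max(degree)  # Gershgorin bound on the largest Laplacian eigenvalue
    # Start with a vector orthogonal to the constant vector (zero mean).
    x = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iterations):
        # y = (shift * I - L) x, written out as shift*x - D*x + A*x
        y = [
            (shift - degree[i]) * x[i]
            + sum(adjacency[i][j] * x[j] for j in range(n))
            for i in range(n)
        ]
        mean = sum(y) / n
        y = [v - mean for v in y]  # remove the constant-eigenvector component
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


def bisect(adjacency):
    """Split the nodes into two equal halves by sorting on the Fiedler vector."""
    v = fiedler_vector(adjacency)
    order = sorted(range(len(v)), key=lambda i: v[i])
    half = len(order) // 2
    return sorted(order[:half]), sorted(order[half:])


def path_graph(n):
    """Adjacency matrix of a path 0 - 1 - ... - (n-1)."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        a[i][i + 1] = a[i + 1][i] = 1.0
    return a


if __name__ == "__main__":
    print(bisect(path_graph(6)))
```

For the path graph the two halves are the left and right ends of the path, so only one edge is cut. For large finite element meshes a dense iteration like this is too slow, which is the motivation for approximating the Fiedler vector from a small Ritz basis as the abstract describes.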

6.

Reconfigurable intelligent optical backplane for parallel computing and communications

Szymanski TH Hinton HS 《Applied optics》1996,35(8):1253-1268

7.

Modeling and simulation of auction-based shop-floor control using parallel computing

Dharmaraj Veeramani Kung-jeng Wang Jose Rojas 《IIE Transactions》1998,30(9):773-783

The high level of complexity and cost involved in the development, testing, and implementation of software for traditional, hierarchical shop-floor control of automated manufacturing systems has motivated considerable research in recent years on the distributed shop-floor control paradigm. In this paper, we describe a methodology for modeling and simulation of an auction-based shop-floor control scheme in a parallel and distributed computing environment using the Parallel Virtual Machine software library. Compared to traditional discrete-event simulation, this approach provides a more accurate means for modeling and evaluation of the shop-floor behavior under distributed control, and enables rapid prototyping of the actual control software. We discuss the challenges and highlight research opportunities associated with modeling and simulation of distributed shop-floor control systems using parallel and distributed computing.

8.

Progress in global parallel computing research: a bibliometric approach

Zhongqiu Liu Yaolin Liu Yangjie Guo Hua Wang 《Scientometrics》2013,95(3):967-983

This study adopts a bibliometric approach to analyze the progress in global parallel computing research from the related literature in the Science Citation Index Expanded database from 1958 to 2011. By investigating the characteristics of annual publication outputs, we find that parallel computing has recently attracted increasing attention again after its first rapid development in the 1990s, and that research in this field is entering a new phase. The distribution of publications indicates that the seven major industrial countries (G7), with the USA ranking top, are the most productive and influential countries in this domain. Author keywords were analyzed by comparison, and we conclude that the study focus of parallel computing has shifted from hardware to software, with parallel application and programming based on MPI, GPUs and multicores being the research tendencies; grid computing and cloud computing dominate the distributed computing area due to their heterogeneous and scalable structures; and, furthermore, the processors of parallel machines are heading for a diverse development. The citing-cited matrix brings to light the intense interactions among the disciplines of computer science, engineering, mathematics and physics. The mutual interactions between the four disciplines have increased gradually and reflect the subject characteristics in influence content.

9.

Parallel computing analysis of the dynamic response of a combined road-rail tunnel

张伟伟 金先龙 曹露芬 王建炜 王新 《振动与冲击》 (Journal of Vibration and Shock) 2012,31(8):164-169,175

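Tunnels with complex structures and multiple uses are being widely adopted. To analyze the response characteristics of a tunnel structure under multiple loads, a three-dimensional finite element model of train-tunnel-soil dynamic coupling was built in LS-DYNA, and a load-balanced parallel computing technique was used to solve this large-scale nonlinear finite element model. The results show that applying the static stress field by the dynamic relaxation method gives good results; that when the static stress field is applied, the tunnel-soil contact calculation is more accurate and the distribution of the stress response is different; that the effect of train and road vehicle loads is far greater than that of track irregularity, with a vertical displacement of the tunnel lining of about 1.15 mm under the quasi-static load of the road lanes and a peak vertical displacement of about 0.8 mm caused by the dynamic train load; and that the parallel computing method based on contact load balancing improved computational efficiency by about 15%, while the choice of the number of CPUs needs to take into account both the size of the model and its spatial topology.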

10.

Full Eulerian deformable solid-fluid interaction scheme based on building-cube method for large-scale parallel computing

Koji Nishiguchi Rahul Bale Shigenobu Okazawa Makoto Tsubokura 《International journal for numerical methods in engineering》2019,117(2):221-248

We propose a full Eulerian incompressible solid-fluid interaction scheme capable of achieving high parallel efficiency and easily generating meshes for complex solid geometries. While good scalability of a full Eulerian solid-fluid interaction formulation has been reported by Sugiyama et al., their analysis was carried out using a uniform Cartesian mesh and an artificial compressibility method. Typically, it is more challenging to achieve good scalability for hierarchical Cartesian meshes and a fully incompressible formulation. In addition, the conventional full Eulerian methods require a large computational cost to resolve complex solid geometries due to the usage of uniform Cartesian meshes. In an attempt to overcome the aforementioned issues, we employ the building-cube method, where the computational domain is divided into cubic regions called cubes. Each cube is divided at equal intervals, the same number of cubes is assigned to each core, and the spatial loop processing is executed for each cube. The numerical method is verified by computing five numerical examples. In the weak scaling test, the parallel efficiency at 32768 cores with 32 cores as a reference is 93.6%. In the strong scaling test, the parallel efficiency at 32768 cores with 128 cores as a reference is 70.2%.
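
To make the two efficiency figures above concrete, the sketch below uses the usual definitions of strong and weak scaling efficiency relative to a reference run. The abstract does not state which formulas the authors used, so these definitions and the function names are assumptions; the only inputs taken from the abstract are the core counts and the reported percentages.

```python
def strong_scaling_efficiency(t_ref: float, n_ref: int, t_n: float, n: int) -> float:
    """Fixed total problem size: the ideal runtime shrinks by n_ref / n."""
    return (t_ref * n_ref) / (t_n * n)


def weak_scaling_efficiency(t_ref: float, t_n: float) -> float:
    """Fixed problem size per core: the ideal runtime stays constant."""
    return t_ref / t_n


def implied_speedup(efficiency: float, n_ref: int, n: int) -> float:
    """Speedup over the reference run implied by a strong-scaling efficiency."""
    return efficiency * n / n_ref


if __name__ == "__main__":
    # Strong scaling: 70.2% at 32768 cores relative to 128 cores.
    ideal = 32768 / 128
    actual = implied_speedup(0.702, 128, 32768)
    print(f"ideal speedup {ideal:.0f}x, implied speedup {actual:.1f}x")
    # Weak scaling: 93.6% means the runtime grew by a factor of 1 / 0.936.
    print(f"weak-scaling runtime growth {1 / 0.936:.3f}x")
```

Read this way, the strong scaling result corresponds to roughly a 180-fold speedup out of an ideal 256-fold, and the weak scaling result to a runtime increase of about 7% when going from 32 to 32768 cores.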

11.

Incompressible Navier-Stokes solver using extrapolation method suitable for massively parallel computing

K. Shimano C. Arakawa 《Computational Mechanics》1999,23(2):172-181

The authors propose a combination of the coupled method and the extrapolation method as a numerical technique suitable for calculation of an incompressible flow on a massively parallel computer. In the coupled method, the momentum equations and the continuity equation are directly coupled, and velocity components and pressure values are simultaneously updated. It is very simple and efficiently parallelized. The extrapolation method is an accelerative technique predicting a converged solution from a sequence of intermediate solutions generated by an iterative procedure. When it is implemented on a parallel computer, it is expected to retain its good accelerative property even for fine granularity, in contrast to the multigrid method. In this paper three existing versions of the extrapolation method, ROLE, MPE and ROGE, are reviewed, and LWE, a new version developed by the authors, is presented. Then, ROLE and LWE are applied to numerical analysis of Poisson's equation on a Fujitsu AP1000 and the results are shown. A mathematical proof is presented that the extrapolation method, which is based on linear theory, is applicable to an iterative procedure solving nonlinear equations. Then the code consisting of the coupled method and the extrapolation method is implemented on a Fujitsu AP1000 to solve two simple 2-D steady flows. The accelerative property of the extrapolation method is discussed, and the suitability of the code for massively parallel computing is demonstrated.

12.

Adjoint design sensitivity analysis of molecular dynamics in parallel computing environment

Hong-Lae Jang Jae-Hyun Kim Youmie Park Seonho Cho 《International Journal of Mechanics and Materials in Design》2014,10(4):379-394

An adjoint design sensitivity analysis method is developed for molecular dynamics using a parallel computing scheme of spatial decomposition in both response and design sensitivity analyses to enhance the computational efficiency. Molecular dynamics is a path-dependent transient dynamic problem with many design variables of high nonlinearity. The adjoint variable method is not appropriate for path-dependent problems but is employed in this paper since the path is readily available from the response analysis. The required adjoint system is derived as a terminal value problem. To compute the interaction forces between atoms in different spatial boxes, only atomic positions in the neighboring boxes are required, which minimizes the amount of data communication. Through some numerical examples, the high nonlinearity of the selected design variables is discussed. Also, the accuracy of the derived adjoint design sensitivity is verified by comparison with finite difference sensitivity, and the efficiency of the parallel adjoint variable method is demonstrated.

13.

Evaluation of parallel performance of large scale computing using workstation network

T. Horie H. Kuramae 《Computational Mechanics》1996,17(4):234-241

Since computing environments with many workstations on networks have recently become available, these workstations can be regarded as a virtual parallel computer, or a workstation cluster, to perform finite element analyses. The parallel performance of finite element algorithms, such as the Gaussian elimination method, the conjugate gradient method and the domain decomposition method, depends strongly on the parallel parameters in the cluster system. Methods to evaluate the parallel computational time and to optimize the parallel parameters are presented. The parallel computing systems are developed based on these techniques and applied to a large scale problem with 350,000 degrees of freedom using twenty-one workstations.

14.

Performances and limits of a parallel oscillator for electrochemical quartz crystal microbalances

Ehahoun H Gabrielli C Keddam M Perrot H Rousseau P 《Analytical chemistry》2002,74(5):1119-1127

This paper describes a driving circuit for an electrochemical quartz crystal microbalance (EQCM) adapted to a wide range of applications. The oscillator is a Miller-type parallel oscillator using an operational transconductance amplifier (OTA). A theoretical study of the oscillating circuit led to the analytical expression of the microbalance frequency as well as to an overestimation of the error on the mass measurement. The reliability of the EQCM was then experimentally verified through electrochemical copper deposition and dissolution. The limit of operation of the EQCM was also investigated, both analytically and experimentally. This work shows that parallel oscillators using few electronic components allow a very reliable EQCM to be obtained for mass measurements on metallic films, even if they are highly damped.

15.

Fast hydrological model calibration based on the heterogeneous parallel computing accelerated shuffled complex evolution method

Guangyuan Kan Xiaoyan He Liuqian Ding Jiren Li Yang Hong Depeng Zuo 《Engineering Optimization》2018,50(1):106-119

Hydrological model calibration has been a hot issue for decades. The shuffled complex evolution method developed at the University of Arizona (SCE-UA) has been proved to be an effective and robust optimization approach. However, its computational efficiency deteriorates significantly when the amount of hydrometeorological data increases. In recent years, the rise of heterogeneous parallel computing has brought hope for the acceleration of hydrological model calibration. This study proposed a parallel SCE-UA method and applied it to the calibration of a watershed rainfall–runoff model, the Xinanjiang model. The parallel method was implemented on heterogeneous computing systems using OpenMP and CUDA. Performance testing and sensitivity analysis were carried out to verify its correctness and efficiency. Comparison results indicated that heterogeneous parallel computing-accelerated SCE-UA converged much more quickly than the original serial version and possessed satisfactory accuracy and stability for the task of fast hydrological model calibration.

16.

Variational inference for a polytomous-attribute saturated diagnostic classification model with parallel computing

Oka Motonori Saso Shun Okada Kensuke 《Behaviormetrika》2023,50(1):63-92

As a statistical tool to assist formative assessments in educational settings, diagnostic classification models (DCMs) have been increasingly used to provide diagnostic...

17.

Finite elements in space and time for parallel computing of viscoelastic deformation

M. Buch A. Idesman R. Niekamp E. Stein 《Computational Mechanics》1999,24(5):386-395

The efficient parallel computation of time dependent problems, e.g. parabolic problems of viscoelastic material deformation, is subject to the “bottleneck” of the serial approach in time. The usual method of lines, also called semidiscretization, leads to an iterative calculation in time, i.e. a sequential solution of the spatial problems for all time steps. As a result, only one spatial problem can be solved in parallel at any given time step. For an efficient parallelization, it is necessary to compute the whole problem in a distributed way. Furthermore, both h- and p-adaptive approximation should be possible in time and space. For these purposes, in addition to the spatial FE-discretization, a continuous finite element discretization in time is used. Thus, one obtains a total algebraic equation system in space and time, whose solution has to be parallelized efficiently, and h- and p-adaptivity in time and space within the frame of the overall Galerkin process has to be realized. The present paper treats symmetric and non-symmetric formulations of two different viscoelastic three-parameter models. The new numerical approach is applied first to the Malvern Model (generalized Maxwell Model). The numerical examples for the new non-symmetric formulation and the traditional semidiscretization show the advantage (with respect to convergence to the problem solution) of the new finite element approach with simultaneous discretizations in time and space. However, the algebraic systems are ill-conditioned, so parallel iterative solvers with various preconditioners are not efficient. The symmetric formulation for the Malvern Model can be obtained for the one-dimensional case only. A numerical example showed the good iterative solvability of the symmetric formulation. Therefore, in order to obtain a symmetric formulation in the 3D case, the generalized Kelvin–Voigt Model was chosen as an alternative.
It should be mentioned that the numerical examples show both the effectiveness of parallel computation and the efficiency of h- and p-adaptation (p-adaptation yields a higher rate of convergence than h-adaptation). Received 19 April 1998

18.

A new parallel sparse direct solver: Presentation and numerical experiments in large‐scale structural mechanics parallel computing

I. Guèye S. El Arem F. Feyel F.‐X. Roux G. Cailletaud 《International journal for numerical methods in engineering》2011,88(4):370-384

The main purpose of this work is to present a new parallel direct solver: the Dissection solver. It is based on LU factorization of the sparse matrix of the linear system and automatically detects and properly handles the zero‐energy modes, which are important when dealing with DDM. A performance evaluation and comparisons with other direct solvers (MUMPS, DSCPACK) are also given for both sequential and parallel computations. Results of numerical experiments with a two‐level parallelization of large‐scale structural analysis problems are also presented: FETI is used for the global problem parallelization and Dissection for the local multithreading. In this framework, the largest problem we have solved is an elastic solid composed of 400 subdomains running on 400 computation nodes (3200 cores) and containing about 165 million dof. The computation of a single iteration consumes less than 20 min of CPU time. Several comparisons to MUMPS are given for the numerical computation of large‐scale linear systems on a massively parallel cluster: the performance and weaknesses of this new solver are highlighted. Copyright © 2011 John Wiley & Sons, Ltd.

19.

Method of computing interference-measurement errors when the Fabry-Perot interferometer mirrors are not parallel

A. A. Pomeranskii Yu. F. Tomashevskii 《Measurement Techniques》1981,24(5):360-363

20.

Influence of radiation on the limits of heterogeneous combustion of a particle in two parallel reactions on its surface

V. V. Kalinchak S. G. Orlovskaya A. I. Kalinchak 《Journal of Engineering Physics and Thermophysics》1995,68(3):400-407

The influence of radiation on critical parameters of heterogeneous ignition and extinction of a carbon particle in air is analyzed with allowance for two heterogeneous reactions.

Notation: Q_chem, surface power of heat release through chemical reactions, W/m²; Q_h, overall density of heat flux by molecular convection Q_m.c. and radiation Q_i, W/m²; d, particle diameter, m; t, time, sec; T₁, T₂, T_w, particle, gas, and reaction chamber wall temperature, respectively, K; particle and gas density (subscripts 1 and 2), kg/m³; c₁, c₂, specific heat of particle and gas, J/(kg·K); n_ox, relative mass concentration of oxidant in the gaseous medium; q_i, thermal effect of the first (i=1, C+O₂=CO₂) and the second (i=2, 2C+O₂=2CO) chemical reactions, J/kg; stoichiometric coefficient (subscript i); E, activation energy, J/mole; k_0i, preexponential factor, m/sec; R, universal gas constant, J/(mole·K); Nu, Nusselt number; thermal conductivity coefficient of gas (subscript 2), W/(m·K); D₂, diffusion coefficient of gas, m²/sec; density, thermal conductivity, and diffusion coefficient (D₀) of gas at T₀; emissivity coefficient; Stefan-Boltzmann constant, W/(m²·K⁴); heat- and mass-transfer coefficients, W/(m²·K), m/sec. Indexes: 1, particle; 2, gas; ign, ignition; ext, extinction; w, wall; st, steady; cr, critical; in, initial; c, combustion; m, maximum; lim, limiting.

I. I. Mechnikov Odessa State University. Translated from Inzhenerno-Fizicheskii Zhurnal, Vol. 68, No. 3, pp. 466–473, 1995.