Parallelization of a multiblock flow code: an engineering implementation
Affiliation:1. NASA Langley Research Center, Hampton, VA 23681-0001, USA;2. Advanced Scientific Computing Ltd., Waterloo, Ontario, Canada
Abstract:Current trends in computer hardware are dictating a gradual shift toward the use of clusters of relatively inexpensive but powerful workstations, or massively parallel processing (MPP) machines, for scientific computing. However, most computational fluid dynamics (CFD) codes in use today were developed for large, shared-memory machines and are not readily portable to the distributed computing environment. One major hurdle in porting CFD codes to distributed computing platforms is the difficulty encountered in partitioning the problem so that the computation-to-communication ratio for each compute node (process) is maximized and the idle time during which one node waits for other nodes to transfer data is minimized. In the present work, pertinent issues involved in the parallelization of a widely used multiblock Navier–Stokes code TLNS3D are discussed. An engineering approach is used here to parallelize this code so that minimal deviation from the original (nonparallel) code is incurred. A natural partitioning along grid blocks is adopted in which one or more blocks are distributed to each of the available nodes. An automatic, static load-balancing strategy is employed for equitable distribution of computational work to specified nodes. Both parallel virtual machine (PVM) and message passing interface (MPI) protocols are incorporated for data communication to allow maximum portability to a wide range of computer configurations. Results are presented that are comparable with a priori estimates of performance for distributed computing and that are competitive in terms of central processing unit (CPU) time and wall time usage with large, shared-memory supercomputers.
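The automatic, static load-balancing strategy mentioned in the abstract can be illustrated with a common greedy heuristic: sort the grid blocks by size and assign each to the currently least-loaded node. This is a minimal sketch for illustration only; the function name, inputs, and the greedy heuristic itself are assumptions, not the authors' actual TLNS3D implementation.

```python
def balance_blocks(block_sizes, num_nodes):
    """Sketch of static load balancing for a multiblock grid.

    Assigns each block (identified by index) to the node with the
    smallest accumulated workload, processing the largest blocks
    first so the final imbalance stays small. Illustrative only;
    the paper's exact strategy may differ.
    """
    loads = [0] * num_nodes   # total cells currently assigned to each node
    assignment = {}           # block index -> node index

    # Largest-first order: placing big blocks early lets the small
    # blocks fill in the remaining imbalance.
    for blk in sorted(range(len(block_sizes)),
                      key=lambda b: block_sizes[b], reverse=True):
        node = min(range(num_nodes), key=lambda n: loads[n])
        assignment[blk] = node
        loads[node] += block_sizes[blk]

    return assignment, loads
```

For example, four blocks of 100, 80, 60, and 40 cells distributed over two nodes end up perfectly balanced at 140 cells per node; in general the heuristic only approximates an even split, which is one reason the paper compares measured performance against a priori estimates.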