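Parallel Computing Environments and Research on Numerical Parallel Algorithms   Total citations: 3, self-citations: 0, citations by others: 3
This paper introduces some recent advances in parallel computing environments and discusses the technical roadmap for current research on numerical parallel algorithms in China, with particular emphasis on combining high-level basic research with the solution of grand challenge problems in scientific and engineering computing.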

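The research and innovations of this work are summarized as follows. (1) Parallel computing models are the basis for studying the scalability of parallel computing. Building on an in-depth analysis of existing parallel computing models, this work classifies the commonly used models and points out their scope of applicability, strengths, and weaknesses. (2) The relationships between scalability and execution time, and between scalability and single-processor performance, are analyzed in depth. The results show that a one-sided emphasis on execution time or single-processor performance may adversely affect scalability. The influence of task and data allocation strategies on the scalability of parallel systems is analyzed both theoretically and experimentally. (3) A near-optimal scalability model is proposed for the first time from the perspective of cost-effectiveness. It can not only describe the scalability of a parallel system but can also, based on the performance metrics of a small-scale system, …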

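Parallel processing functions were developed and parallel computing was applied on an IBM 4381-P03 computer and its CPUs, achieving a relatively high speedup.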

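This work studies how to improve the parallel efficiency of numerical simulations of aero-engine containment. Because these simulations involve very large mesh counts, strong nonlinearity, and complex contact algorithms, parallel efficiency has remained low and has become an important factor limiting engineering application. To improve it, a method is proposed that performs explicit dynamics analysis with a self-contact algorithm under the shared-memory parallel (SMP) mode. Comparison on a practical example shows that, relative to the traditional surface-to-surface contact algorithm, the self-contact algorithm effectively improves the parallel efficiency of aero-engine containment simulation.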

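Implementation of Distributed Parallel Computing for the CG Method   Total citations: 1, self-citations: 0, citations by others: 1
This paper describes the basic method, structure, and algorithm of distributed parallel computing for the CG method under Windows, and presents a concrete implementation with source code written in Visual C++.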
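
For orientation, the serial conjugate gradient (CG) iteration that such an implementation would distribute can be sketched as follows. This is a minimal Python sketch of the textbook method for a symmetric positive definite system, not the paper's Visual C++ code; the function names `dot`, `matvec`, and `conjugate_gradient` are illustrative. In a distributed version, the matrix-vector product and the dot products are the usual candidates for splitting across machines, but how the paper does this is not stated in the abstract.

```python
def dot(u, v):
    """Inner product of two vectors stored as Python lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (textbook CG)."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x, with x = 0 initially
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break            # residual norm is small enough
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                      # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                           # direction update factor
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

Calling `conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` solves a small illustrative 2x2 system; the matrix and right-hand side are made up for the example.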

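Xu Ning (徐宁), Yang Geng (杨庚). 《计算机仿真》 (Computer Simulation), 2003, 20(9): 53-55, 93, 94
This paper uses the finite element method to numerically simulate the dynamic model. The model equations are derived first; then, according to the numerical method to be used, new variable substitutions and nondimensionalization parameters are proposed; next, the boundary conditions of general devices are discussed; finally, a submicron GaAs MESFET is simulated numerically. The numerical results show that, under certain conditions, the electron flow exhibits transonic characteristics.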

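On a LAN cluster system built with MPICH, a parallel program was used to numerically simulate the three-dimensional flow field around a supersonic wrap-around fin-body configuration, yielding flow-field information for the wrap-around-fin projectile. By analyzing and comparing results for meshes of different sizes on different numbers of cluster nodes, the variation of speedup and parallel efficiency with node count was obtained. Large meshes were found to perform better in both speedup and parallel efficiency and to be better suited to parallel computing on cluster systems. The effectiveness and advantages of this cluster system for numerical simulation were also verified, providing technical support for large-scale scientific and engineering computing.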
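
Speedup and parallel efficiency are conventionally defined as S(p) = T(1)/T(p) and E(p) = S(p)/p, where T(p) is the wall-clock time on p nodes. The following minimal Python sketch computes both from measured timings; the function name and the dictionary layout are assumptions made for illustration, and no timings from the paper are reproduced here.

```python
def speedup_and_efficiency(t_serial, timings):
    """Return {nodes: (speedup, efficiency)} from wall-clock timings.

    t_serial -- run time on a single node, T(1)
    timings  -- dict mapping node count p to run time T(p)
    """
    result = {}
    for nodes, t_parallel in sorted(timings.items()):
        speedup = t_serial / t_parallel      # S(p) = T(1) / T(p)
        efficiency = speedup / nodes         # E(p) = S(p) / p
        result[nodes] = (speedup, efficiency)
    return result
```

With hypothetical timings such as `speedup_and_efficiency(100.0, {2: 50.0, 4: 40.0})`, the efficiency drops as nodes are added once communication cost starts to dominate, which is the kind of trend the abstract reports comparing across mesh sizes.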

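Parallel processing simulation supports the modeling and analysis of parallel systems, the simulated execution of parallel algorithms, and the performance evaluation of parallel environments. Using task-dependent simulation clocks and overlapping time slices, this paper establishes a parallel multitasking model that supports both fully parallel and user-concurrent modes. Through simulation experiments with different scheduling algorithms and interconnection structures, it focuses on the effect of task scheduling on system performance and on the relationship between interconnection network technology and communication overhead. The simulation environment also provides concurrency-degree curves and task execution traces from simulated runs, so that users can analyze and debug parallel programs.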

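To address the sharp growth in computation as current crowd behavior models scale up, a large-scale crowd behavior model was built using the parallel discrete event method, and its parallel computation was implemented with the YH-SUPE simulation engine. The design of the simulation objects and of the information exchange between them is described in detail, and the model was tested with different numbers of nodes and simulation entities. Experimental results show that introducing parallel computing into crowd behavior modeling significantly increases the number of individuals that can be simulated and more effectively supports real-time execution of the crowd model.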

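Whole-Aircraft Numerical Simulation and Parallel Computing Based on Fluent   Total citations: 3, self-citations: 0, citations by others: 3
The commercial CFD software Fluent was used for numerical simulation and parallel computation of the three-dimensional flow field around an aircraft in subsonic flight. The flow field near the aircraft was obtained, and the software was run in parallel on a high-performance parallel computer. By analyzing and comparing results for meshes of different sizes on clusters with different numbers of nodes, the effectiveness of this commercial software on parallel platforms was verified, providing a technical reference for large-scale scientific and engineering computing.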

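Parallel computing is an inevitable trend in the development of CFD. Starting from the characteristics of hypersonic flow, this paper studies parallel CFD methods on multi-block structured grids, focusing on the exchange of flow-field data between blocks and on boundary treatment, so as to ensure continuity of the flow field. The hypersonic flow field is solved with the finite volume method, using the Osher-Chakravarthy TVD scheme for spatial discretization and MPI message passing for data exchange. Test cases run on a self-built PC cluster verify the feasibility and correctness of the algorithm.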

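DRAM(h): A Parallel Computation Model for High-Performance Numerical Computing   Total citations: 11, self-citations: 0, citations by others: 11
Zhang Yunquan (张云泉). 《计算机学报》 (Chinese Journal of Computers), 2003, 26(12): 1660-1670
A new parallel computation model based on the memory hierarchy, DRAM(h), is proposed. Under this model, different implementation forms of two classical parallel numerical algorithms are analyzed: four forms of parallel lower triangular system solve (PTRS) and six forms of parallel LU factorization without column pivoting (PLU). The model analysis shows that different implementation forms of the same algorithm, with nearly identical time and space complexity, can have completely different memory complexity under this model. The author verified the model analysis experimentally on three parallel platforms: the Hitachi SR2201 MPP, the Dawning 3000 (曙光3000) superserver, and the 128-node Linux cluster of the State Key Laboratory of Scientific and Engineering Computing (LSEC), Chinese Academy of Sciences. The results show that, in the vast majority of cases, the model analysis agrees well with the experimental results on the different platforms. The few analysis results that deviate can also be explained once the assumptions of the model analysis are revised according to the memory hierarchy of the platform. This demonstrates the model's effectiveness in distinguishing the memory access patterns of different algorithm implementations. Studying the feasibility and methods of adding instruction-level and thread-level parallelism to the computation model is left as future work.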
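
To make concrete how two forms of the same algorithm can share an operation count yet differ in memory access pattern, here is a minimal serial Python sketch of two textbook variants of the lower triangular solve: a row-oriented (inner-product) form and a column-oriented (axpy) form. These are illustrative assumptions only; the abstract does not say which four PTRS forms the paper analyzes, and the function names are hypothetical.

```python
def solve_lower_row(L, b):
    """Row-oriented forward substitution: step i reads row i of L."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        partial = sum(L[i][j] * x[j] for j in range(i))  # inner product along row i
        x[i] = (b[i] - partial) / L[i][i]
    return x


def solve_lower_column(L, b):
    """Column-oriented forward substitution: step j reads column j of L."""
    n = len(b)
    x = [float(v) for v in b]
    for j in range(n):
        x[j] /= L[j][j]                   # x[j] is final at this point
        for i in range(j + 1, n):
            x[i] -= L[i][j] * x[j]        # update the remaining right-hand side
    return x
```

Both functions perform the same multiplications and subtractions, but the first sweeps `L` by rows and the second by columns, so under a row-major or column-major storage layout they touch memory with different strides.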

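Mo Zeyao (莫则尧). 《计算机学报》 (Chinese Journal of Computers), 2004, 27(10): 1311-1319
Complex physical phenomena usually consist of several tightly coupled complex physical processes, and their numerical simulation is usually carried out by tightly coupling several parallel application codes, each suited to a different physical process. How to design the coupling algorithms between these processes, so that data transfer between codes is efficient, each code and the overall simulation run efficiently, and each code can still be developed independently, is a topic worth studying. Based on the multiphysics parallel numerical simulation of radiation hydrodynamics and neutron transport, which is widely used in high-temperature, high-pressure multiphysics research, this paper proposes two coupling algorithms on unstructured meshes: a fully loose coupling algorithm and a two-level tight coupling algorithm. The former focuses on the efficient execution and independent development of each code; the latter, building on the former, also balances data-transfer efficiency against overall simulation efficiency. On hundreds of processors of two parallel machines, communication complexity analysis and numerical experiments show that both algorithms are effective and can be extended to multiphysics parallel simulations that couple radiation or neutron transport with other hydrodynamics codes. In particular, the two-level tight coupling algorithm is efficient and scalable and achieves near-optimal parallel performance.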

The development of intelligent transportation systems (ITS) and the resulting need for the solution of a variety of dynamic traffic network models and management problems require faster-than-real-time computation of shortest path problems in dynamic networks. Recently, a sequential algorithm was developed to compute shortest paths in discrete time dynamic networks from all nodes and all departure times to one destination node. The algorithm is known as algorithm DOT and has an optimal worst-case running-time complexity. This implies that no algorithm with a better worst-case computational complexity can be discovered. Consequently, in order to derive faster algorithms to solve all-to-one shortest path problems in dynamic networks, one would need to explore avenues other than the design of sequential solution algorithms only. The use of commercially available high-performance computing platforms to develop parallel implementations of sequential algorithms is an example of such an avenue. This paper reports on the design, implementation, and computational testing of parallel dynamic shortest path algorithms. We develop two shared-memory and two message-passing dynamic shortest path algorithm implementations, which are derived from algorithm DOT using the following parallelization strategies: decomposition by destination and decomposition by transportation network topology. The algorithms are coded using two types of parallel computing environments: a message-passing environment based on the parallel virtual machine (PVM) library and a multi-threading environment based on the SUN Microsystems Multi-Threads (MT) library. We also develop a time-based parallel version of algorithm DOT for the case of minimum time paths in FIFO networks, and a theoretical parallelization of algorithm DOT on an 'ideal' theoretical parallel machine.
Performances of the implementations are analyzed and evaluated using large transportation networks, and two types of parallel computing platforms: a distributed network of Unix workstations and a SUN shared-memory machine containing eight processors. Satisfactory speed-ups in the running time of sequential algorithms are achieved, in particular for shared-memory machines. Numerical results indicate that shared-memory computers constitute the most appropriate type of parallel computing platforms for the computation of dynamic shortest paths for real-time ITS applications.

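Using computational fluid dynamics, multi-zone parallel computation of hypersonic flow fields was studied. A parallel CFD code was written in Fortran on top of the MPI message-passing library; the NS equations are solved with the AUSMPW+ scheme and the LU-SGS method. The flow field is partitioned into multiple zones, and each subzone is assigned to a node for computation. At every iteration step, adjacent subzones exchange boundary data. The computations show that the code and method are feasible and can be extended to large-scale parallel computing and engineering applications.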
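
The pattern of partitioning a field into zones and exchanging boundary data at every iteration step can be emulated serially. The Python sketch below is a deliberately simplified stand-in: it uses a 1D explicit diffusion update instead of the NS equations, plain lists instead of MPI messages, and one ghost cell per zone side. All function names and the choice of update are assumptions for illustration, not the paper's Fortran code.

```python
def step_single(u, alpha):
    """Reference update on the undivided field; end values stay fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def make_zones(u, n_zones):
    """Split the interior cells into zones, each padded with one ghost cell per side."""
    size = (len(u) - 2) // n_zones
    zones = []
    for z in range(n_zones):
        lo = 1 + z * size
        hi = len(u) - 1 if z == n_zones - 1 else lo + size
        zones.append(u[lo - 1:hi + 1])    # ghost, owned cells, ghost
    return zones


def exchange_boundaries(zones):
    """Copy each zone's edge cell into its neighbour's ghost cell."""
    for left, right in zip(zones, zones[1:]):
        left[-1] = right[1]     # right ghost <- neighbour's first owned cell
        right[0] = left[-2]     # left ghost  <- neighbour's last owned cell


def run_multizone(u, n_zones, alpha, n_steps):
    """Advance all zones, exchanging boundary data before every step."""
    zones = make_zones(u, n_zones)
    for _ in range(n_steps):
        exchange_boundaries(zones)
        zones = [step_single(zone, alpha) for zone in zones]
    owned = [value for zone in zones for value in zone[1:-1]]
    return [u[0]] + owned + [u[-1]]
```

Because the ghost cells are refreshed before each step, the multi-zone run reproduces the single-zone result; in an MPI code the two assignments in `exchange_boundaries` would become send and receive operations between the nodes that own adjacent zones.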

Message Passing (MP) and Distributed Shared Memory (DSM) are the two most common approaches to distributed parallel computing. MP is difficult to use, whereas DSM is not scalable. Performance scalability and ease of programming can be achieved at the same time by using navigational programming (NavP). This approach combines the advantages of MP and DSM, and it balances convenience and flexibility. Similar to MP, NavP suggests to its programmers the principle of pivot-computes and hence is efficient and scalable. Like DSM, NavP supports incremental parallelization and shared variable programming and is therefore easy to use. The implementation and performance analysis of real-world algorithms, namely parallel Jacobi iteration and parallel Cholesky factorization, presented in this paper support the claim that the NavP approach is better suited for general-purpose parallel distributed programming than either MP or DSM.
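
As background for the Jacobi example named in the abstract, a minimal serial Python sketch of Jacobi iteration for a linear system is given below. It is not NavP code, and the abstract does not say which Jacobi variant the paper parallelizes; the function name, tolerance, and stopping rule are assumptions. The point it illustrates is that every component of the new iterate depends only on the previous iterate, so the components can be computed independently.

```python
def jacobi(A, b, tol=1e-10, max_iter=10000):
    """Jacobi iteration for A x = b; assumes A has a nonzero diagonal
    and is such that the iteration converges (e.g. strictly diagonally dominant)."""
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        # Each x_new[i] uses only the old iterate x, so the n updates
        # are independent and could be assigned to different workers.
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        if max(abs(new - old) for new, old in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x
```

A call such as `jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` exercises it on a small diagonally dominant system chosen for illustration.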

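This paper proposes a scheme for distributed parallel computing of power systems based on grid computing. It mainly covers the design of the computing pool (ComputingPool), resource management and dynamic allocation, and the design and implementation of graph partitioning and a sparse numerical computation library. The paper first introduces the concept of applying grid computing to distributed parallel computing of power systems and then analyzes the above functional modules as implemented on the GlobusR grid computing development platform. Finally, the test platform and test results are briefly described.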

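This paper describes the application of high-performance parallel computing to CFD numerical simulation. High-performance parallel CFD enlarges the problem size that can be solved and speeds up the solution, and it is the inevitable path to efficient CFD computation. Through the concept of the numerical wind tunnel, the paper analyzes the application prospects of high-performance CFD computing and its demands on high-performance computing. Using a parallel test case for the forebody of a waverider vehicle, the large-scale parallel efficiency and speedup of CFD on 8 to 256 CPUs are analyzed, and parallel CFD is applied to the numerical computation of a reentry capsule with high-temperature thermochemical nonequilibrium.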

Particle tracking methods are a versatile computational technique central to the simulation of a wide range of scientific applications. In this paper, we present a new parallel particle tracking framework for scientific computing applications. The framework includes the in-element particle tracking method, which is based on the assumption that particle trajectories are computed from problem data localized to individual elements, as well as the dynamic partitioning of particle-mesh computational systems. The ultimate goal of this research is to develop a parallel in-element particle tracking framework capable of interfacing with ordinary differential equation (ODE) solvers of different orders of accuracy. The parallel efficiency of such particle-mesh systems depends on the partitioning of both the mesh elements and the particles; this distribution can change dramatically because of movement of the particles and adaptive refinement of the mesh. To address this problem we introduce a combined load function that is a function of both the particle and mesh element distributions. We present experimental results that detail the performance of this parallel load balancing approach for a three-dimensional particle-mesh test problem on an unstructured, adaptive mesh, and demonstrate the ability to interface with different ODE solvers.
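
The abstract does not give the form of the combined load function. Purely as an illustration of the idea, the Python sketch below assumes a weighted linear combination of the element count and the particle count per partition, and measures imbalance as the ratio of the largest load to the mean load. The linear form, the weights, and the function names are all hypothetical.

```python
def combined_load(n_elements, n_particles, w_element=1.0, w_particle=1.0):
    """Hypothetical combined load: weighted sum of mesh elements and particles."""
    return w_element * n_elements + w_particle * n_particles


def load_imbalance(partitions, w_element=1.0, w_particle=1.0):
    """Ratio of the maximum to the mean combined load; 1.0 means perfectly balanced.

    partitions -- list of (n_elements, n_particles) pairs, one per processor
    """
    loads = [combined_load(e, p, w_element, w_particle) for e, p in partitions]
    return max(loads) / (sum(loads) / len(loads))
```

For example, two partitions with equal element counts but all particles on one side, such as `[(100, 0), (100, 400)]`, look balanced if only elements are counted and unbalanced once particles carry a nonzero weight, which is the situation a repartitioning step would respond to.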

In this article we present a new parallel programming environment, called distributed object-oriented virtual computing environment (DOVE), for clustered computers based on a distributed object model. In DOVE, a parallel program is built as a collection of concurrent objects, each of which has its own computing power and which interact with one another by remote method invocation. The parallelism is encapsulated within distributed objects, which can be handled the same way as local objects. The main goal of DOVE is to provide users with an easy-to-use, transparent parallel programming environment while supporting efficient parallelism encapsulated and distributed among objects. For the experiment and evaluation of DOVE, two parallel application programs have been developed on both DOVE and PVM.

