共查询到20条相似文献,搜索用时 31 毫秒
1.
This article focuses on the effect of both process topology and load balancing on various programming models for SMP clusters
and iterative algorithms. More specifically, we consider nested loop algorithms with constant flow dependencies, that can
be parallelized on SMP clusters with the aid of the tiling transformation. We investigate three parallel programming models,
namely a popular message passing monolithic parallel implementation, as well as two hybrid ones, that employ both message
passing and multi-threading. We conclude that the selection of an appropriate mapping topology for the mesh of processes has
a significant effect on the overall performance, and provide an algorithm for the specification of such an efficient topology
according to the iteration space and data dependencies of the algorithm. We also propose static load balancing techniques
for the computation distribution between threads, that diminish the disadvantage of the master thread assuming all inter-process
communication due to limitations often imposed by the message passing library. Both improvements are implemented as compile-time
optimizations and are further experimentally evaluated. An overall comparison of the above parallel programming styles on
SMP clusters based on micro-kernel experimental evaluation is further provided, as well. 相似文献
2.
3.
4.
5.
6.
网格生成是计算流体力学中非常重要的一环,大规模数值模拟过程中对网格精度要求的提高会导致网格生成所耗的时间增加。文中基于OpenFoam开源软件中的网格生成算法,主要研究多面体网格的并行生成,并提出OpenMP和MPI混合并行的多面体网格生成方法。通过理论分析得到,使用混合并行方法生成相同质量的网格时,混合并行方法生成网格的时间消耗随着线程数量和网格单元数量的增加而减少。3组使用不同求解器的数值模拟实验结果表明,该混合并行方法不但可以保证生成网格的质量——可以正常进行数值计算模拟且模拟结果与原方法相比几乎没有差别,而且生成同样质量与数量网格的耗时最多可以缩短至未使用OpenMP并行方法之耗时的1/4以内。 相似文献
7.
基于SMP集群系统的并行编程模式研究与分析 总被引:4,自引:1,他引:4
并行计算技术是计算机技术发展的重要方向之一,SMP与集群是当前主流的并行体系结构。当前并行程序设计方法主要采用基于消息传递模型的MPI和基于共享存储模型的OpenMP,两种编程模式各有特点和适用范围。对SMP集群以及MPI和OpenMP的特点进行了分析,介绍了在SMP集群系统中利用MPI和OpenMP混合编程的可行性方法。 相似文献
8.
SMP集群系统上矩阵特征问题并行求解器的有效算法 总被引:2,自引:0,他引:2
对称矩阵三对角化和三对角对称矩阵的特征值求解是稠密对称矩阵特征问题并行求解器的关键步.针对SMP集群系统的多级体系结构,基于Householder变换的矩阵三对角化和三对角矩阵特征值问题的分而治之算法,给出了它们的MPI+OpenMP混合并行算法.算法研究集中在SMP集群系统环境下的负载平衡、通信开销和性能评价.混合并行算法的设计结合了粗粒度线程并行模式和任务共享的动态调用方法,改善了MPI算法中的负载平衡问题、降低了通信开销.在深腾6800上的实验表明,基于混合并行算法的求解器比纯MPI版本的求解器具有更好的性能和可扩展性. 相似文献
9.
普通Kriging方法是进行空间降水插值的一种有效方法。然而一方面由于海量数据插值计算量大,另一方面该算法的时间复杂度大,为减少空间降水插值的计算时间,采用OpenMP和MPI混合并行技术,实现Kriging并行算法。在Windows操作系统上搭建并行计算环境,实验数据表明,该并行算法能有效地节省计算时间。 相似文献
10.
Parallel loop self‐scheduling on parallel and distributed systems has been a critical problem and it is becoming more difficult to deal with in the emerging heterogeneous cluster computing environments. In the past, some self‐scheduling schemes have been proposed as applicable to heterogeneous cluster computing environments. In recent years, multicore computers have been widely included in cluster systems. However, previous researches into parallel loop self‐scheduling did not consider certain aspects of multicore computers; for example, it is more appropriate for shared‐memory multiprocessors to adopt Open Multi‐Processing (OpenMP) for parallel programming. In this paper, we propose a performance‐based approach using hybrid OpenMP and MPI parallel programming, which partition loop iterations according to the performance weighting of multicore nodes in a cluster. Because iterations assigned to one MPI process are processed in parallel by OpenMP threads run by the processor cores in the same computational node, the number of loop iterations allocated to one computational node at each scheduling step depends on the number of processor cores in that node. Experimental results show that the proposed approach performs better than previous schemes. Copyright © 2010 John Wiley & Sons, Ltd. 相似文献
11.
随着多核计算机的出现,并行计算技术的发展进入了一个新的阶段,如何将并行技术引入空间数据处理系统成为了当前研究的热点问题。本文给出了一种基于分布式/共享内存结构的并行空间数据处理系统设计方案,用于解决空间数据量增大和下行速度大幅度提高所带来的处理速度慢,数据积压等问题。 相似文献
12.
本文分析了非结构网格多群粒子输运Sn方程求解的并行性,拟合多核机群系统的特点,设计了MPI/OpenMP混合程序,针对空间网格点采用区域分解划分,计算结点间基于消息传递MPI编程,每个MPI计算进程在计算过程中碰到关于能群的计算,就生成多个OpenMP线程,计算结点内针对能群进行多线程并行计算。数值测试结果表明,非结构网格上的粒子输运问题的混合并行计算能较好地匹配多核机群系统的硬件结构,具有良好的可扩展性,可以扩展到1024个CPU核。 相似文献
13.
模拟器是计算机体系结构研究的重要工具.近年来并行计算机体系结构的发展给计算机模拟带来了巨大的挑战.一方面,随着体系结构朝着多核以及众核处理器发展,模拟的目标系统规模随着模拟核数以摩尔定律的速度增加而不断增大;另一方面,串行模拟的速度因为模拟器运行所在宿主机主频提速减缓而停滞不前.上述两方面的原因使得传统的串行模拟方式无法满足对新兴体系结构模拟规模和速度的需求.以众核处理器和众核集群这两种体系结构为例,并行模拟技术在并行计算机体系结构模拟中是必要而且可行的.对于众核处理器的模拟,使用并行离散事件模拟对其进行加速,在模拟精度不变的前提下,提高模拟速度10.9倍.对于众核集群的模拟,模拟的目标系统总规模达到1024核,并且支持MPI/Pthreads混合编程的运行环境. 相似文献
14.
15.
Granularity Analysis for Exploiting Adaptive Parallelism of Declarative Programs on Multiprocessors 下载免费PDF全文
1IntroductionAutomaticparallelexecutionofdeclarativelanguageprograms(e.g.functionprogramsandlogicprograms)isattractive,asitmakestheuseofparallelcomputersveryeasy,andtheprogrammerneednotbeconcernedwiththespecificsoftheunderlyingparallelarchitecture.However,ifseveralprocessorsareexecutingconcurrently,exploitingadaptiveparallelismishardduetonon-determinismoftaskgranularityanddatadependenciesamongtasks.TheearlysolutionproposedbyConeryandKibler[2]usesanorderingalgorithmtodeterminedependenciesatrun… 相似文献
16.
为应对传统遗传算法在处理大规模组合优化问题面临的进化速度缓慢,难以达到实时要求的严峻挑战,提出了一种在多核PC集群系统上实现“粗粒度一主从式”混合并行遗传算法的模型:通过把“粗粒度一主从式”并行遗传算法映射到多核PC集群上,结合消息传递和共享存储两种并行编程模型,在节点间使用消息传递模型(MPI),对应的遗传算法为粗粒度并行遗传算法,在节点内使用共享存储模型(OpcnMP),对应的遗传算法为主从式并行遗传算法,用MPI和OpenMP混合编程的方式以进程和线程两级并行在多核集群上实现具体的混合并行遗传算法。理论分析和实验结果表明,提出的实现模型有较好的性能,可大大改进传统遗传算法的缺陷。为利用并行遗传算法在普通多核PC集群上处理大规模组合优化问题提出了一种有效、可行的解决方案。 相似文献
17.
并行计算技术是计算机技术发展的重要方向之一,SMP与集群是当前主流的并行体系结构。当前并行程序设计方法主要采用基于消息传递模型的MPI和基于共享存储模型的OpenMP,两种编程模式各有特点和适用范围。对SMP集群以及MPI和OpenMP的特点进行了分析,介绍了在SMP集群系统中利用MPI和OpenMP混合编程的可行性方法。 相似文献
18.
Hwang Gwan-Hwan Chen Cheng-Wei Lee Jenq Kuen Dz-Ching Ju Roy 《The Journal of supercomputing》2003,25(1):17-41
In this paper, we propose a new automatic data alignment model called segmented alignment. The conventional data alignment model, such as that used in High-Performance Fortran (HPF), aligns arrays with the whole index domain. The principle of our proposed segmented alignment is to allow alignment relations within delimited index domains. We first provide motivating examples to illustrate how code fragments of HPF with EOSHIFT or CSHIFT operations, or produced by synthesis operations can benefit from our enhanced alignment scheme. Second, we show that this new model can be implemented in HPF-like languages by adding WHEN and IN constructs to them. In addition, we show that the new proposed schemes for WHEN and IN constructs can be emulated using standard HPF syntax. Finally, we address issues related to automatic data alignment for the new proposed model, and present an algorithm to automatically align programs using our segmented alignment scheme. Since the optimal algorithm to do this is NP-hard, a practical heuristic is also given. Our experiments were performed on a DEC Alpha Farm with HPF environments. Our experiments confirm our theory that our proposed alignment scheme can significantly enhance not only the performance of HPF code fragments with EOSHIFT or CSHIFT operations, but also that of codes produced by synthesis operations. 相似文献
19.
基于机群系统的数据存储分布是并行数据库领域的一个重要问题。已有的研究工作多集中在基于单个关系的存储分布,不能有效支持复杂多连接查询处理。文章提出了多个关系整体分布方法,给出分布属性选择和处理机分配算法。实验结果表明,算法具有良好的性能,有助于提高并行多连接查询效率。 相似文献
20.
多核处理器机群Memory层次化并行计算模型研究 总被引:7,自引:0,他引:7
多核处理器机群点对点通信同时具有memory纵向层次化特征和横向层次化的新特征.纵向层次化特征揭示了对不同大小和步长的消息进行点对点通信时消息通信中间件对其性能的影响;横向层次化的新特征由intra-CMPi、nter-CMP和inter-node消息通信性能的显著差异引起,目前缺少有效的分析模型.文中提出一种新的memory层次化并行计算模型,对多核处理器机群memory横向、纵向层次化特征进行了统一的抽象.在对多核处理器机群点对点通信和集合通信的开销进行模型分析和实际测试中,新模型的精确性优于现有的未引入memory横向层次化特征的模型. 相似文献