Similar literature
 19 similar documents found (search time: 109 ms)
1.
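This paper uses a phase-parallel model to analyze the structure and resource-requirement characteristics of the parallel NAS benchmarks, obtaining parameter values for their requirements on memory, I/O, and communication latency and bandwidth. These quantified parameters offer a useful reference for the design and evaluation of various kinds of parallel computers.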

2.
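The numerical verification method for geometric theorem proving replaces symbolic computation with numerical computation to improve efficiency, but in practice its efficiency on complex propositions remains a problem. This paper attempts to improve the algorithm's efficiency with parallel computing. It analyzes task partitioning, communication organization, and task scheduling under the MPI programming model, implements a parallel numerical verification algorithm on MPICH2, and tests the algorithm's parallel performance metrics, with good results.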

3.
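Query Optimization Techniques for Shared-Nothing Parallel Database Systems   Cited by: 15 (self-citations: 0, citations by others: 15)
Query optimization is a core technology of parallel database systems. This paper introduces the distinctive two-phase optimization strategy of PBASE/2, a shared-nothing parallel database system developed by the authors. To shrink the huge search space of parallel query optimization, PBASE/2 divides parallel query optimization into two phases: sequential optimization and parallelization. In the sequential optimization phase, the communication cost after parallelization is estimated in advance and added to the cost model of sequential optimization. The dynamic-programming search algorithm is also modified and extended, ensuring that the minimum-cost plan obtained in the sequential optimization phase …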

4.
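The emergence and development of massively parallel computers (MPCs) urgently call for new parallel algorithm design theory and techniques to guide the design of more practical parallel algorithms. This paper first briefly introduces the LogP and Barrier-LogP parallel computation models proposed for MPCs. Then, using the Barrier-LogP model, it discusses general methods and techniques for designing more practical parallel algorithms from three perspectives: communication balance, data distribution, and overlapping communication with computation.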
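As a rough illustration of the kind of cost reasoning the LogP model supports, the sketch below computes the standard LogP estimates for one short message and for a pipelined burst of messages. It covers plain LogP only; the Barrier-LogP extension is not detailed in this abstract and is not modeled here. The class and function names and the parameter values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogP:
    """Plain LogP parameters (all times in the same unit, e.g. cycles)."""
    L: float  # network latency for one short message
    o: float  # processor overhead to send or to receive one message
    g: float  # minimum gap between consecutive sends on one processor
    P: int    # number of processors (unused below, kept for completeness)


def single_message_time(m: LogP) -> float:
    """Send overhead + network latency + receive overhead."""
    return m.o + m.L + m.o


def pipelined_messages_time(m: LogP, k: int) -> float:
    """Time until the k-th of k back-to-back short messages is received.

    Consecutive sends start max(g, o) apart; the last one then needs
    o (send) + L (network) + o (receive).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return (k - 1) * max(m.g, m.o) + single_message_time(m)


if __name__ == "__main__":
    model = LogP(L=6.0, o=2.0, g=4.0, P=8)  # illustrative values only
    print(single_message_time(model))
    print(pipelined_messages_time(model, 3))
```

Under these formulas, computation that fits inside the gap `g` between sends costs nothing extra, which is one simple reading of "overlapping communication with computation".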

5.
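SMP Cluster: How to Exploit Two-Level Parallelism   Cited by: 3 (self-citations: 1, citations by others: 3)
Starting from the underlying Linux operating system, this paper examines two different mechanisms for implementing parallelism within an SMP system: the thread model (and the OpenMP model), representing the shared-memory model, and the MPI model, representing the message-passing model. It then analyzes how to combine inter-node and intra-node parallelism and concludes that, weighing efficiency and ease of use together, programs on a Linux SMP cluster should be written directly in MPI, using an MPI implementation that communicates through shared memory.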

6.
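To study the mechanical properties of composite solid propellant materials with meso-mechanical methods, a representative mesoscale cell model of the propellant must be built. To address the low computational efficiency common to current algorithms, and based on the performance characteristics of generating particle packing models with a molecular-dynamics approach, this paper analyzes load balancing and message communication, proposes three criteria for the parallel model, designs a domain-decomposition parallel strategy, and implements the parallel algorithm with two levels of parallelism: shared-memory and distributed-memory. Examples run on an IBM BladeCenter cluster platform show that the algorithm alleviates load imbalance and reduces communication overhead. The experimental data confirm the algorithm's efficiency and show that it improves cell generation efficiency.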

7.
滕腾  李龙澍 《计算机技术与发展》2007,17(10):105-108,112
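The performance of the general coarse-grained parallel genetic algorithm (CGGA) is affected by many factors and is often unsatisfactory. With reducing communication cost as the main goal, and inspired by the species pyramid model, this paper designs a self-adjusting heap structure constrained by two thresholds and improves its heap adjustment operations, aiming to substantially reduce inter-population communication cost, speed up convergence, and raise the algorithm's efficiency. Analysis of communication volume and experiments on several typical genetic algorithm test functions show that the parallel genetic algorithm based on this model is clearly effective at reducing communication cost, accelerating convergence, and improving the final solution.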

8.
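Achieving high performance requires fine-grained parallelism, but the resulting increase in communication overhead must be addressed. Multithreaded computation not only realizes fine-grained parallelism; with a sound scheduling policy it also helps hide communication latency. It does, however, introduce thread-switching overhead, for which multithreaded processors may be one solution.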

9.
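A Message-Passing-Based Communication Technique and Parallel Program Design Method   Cited by: 3 (self-citations: 0, citations by others: 3)
In distributed parallel computer systems there is no memory shared among processors to support data communication, so data sharing between processors must be realized through message passing. This paper introduces a message-passing-based communication technique and a parallel program design method built on it.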

10.
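Parallel processing simulation supports the modeling and analysis of parallel systems, the simulated execution of parallel algorithms, and the performance evaluation of parallel environments. Using task-dependent simulation clocks and overlapping time slices, this paper builds a parallel multitasking model that supports both fully parallel and user-concurrent modes. Through simulation experiments with different scheduling algorithms and interconnection structures, it analyzes the effect of task scheduling on system performance and the relationship between interconnection network technology and communication overhead. The simulation environment also provides concurrency curves and task execution traces from simulated runs so that users can analyze and debug parallel programs.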

11.
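Efficient message communication is a key factor in improving the performance of distributed-memory parallel computers. Point-to-point and broadcast communication are two commonly used methods, whereas multicast communication (multicasting) sends a message from one source node to an arbitrary number of destination nodes simultaneously. Multicast is more general than point-to-point and broadcast communication and suits the needs of many practical applications. For the two-dimensional mesh structure of the PAR95 parallel computer, this paper proposes a multicast method based on network decomposition and compares its performance with multicast implemented as multiple point-to-point messages.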

12.
We present a portable, parallel implementation of an urban air quality model. The parallel model runs on the Intel Delta, Intel Paragon, IBM SP2, and Cray T3D, using a variety of standard communication libraries. We analyze the performance of the air quality model on these platforms based on a model derived from the parallel communication behavior and sequential execution time of the air quality model. We predict the performance of next-generation air quality models based on this analysis.

13.
邬延辉  陆鑫达 《计算机工程》2004,30(9):15-16,30
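In a grid, clusters or supercomputers are interconnected over a wide-area network (WAN). A major issue for parallel programming on this platform is the hierarchical network structure: WAN latency and bandwidth are typically several times worse than those of a local-area network. This paper extends the LogP model into a parameterized LogP model, discusses and compares its parameters in detail, and describes how to measure them experimentally. Based on this model, communication operations are optimized by choosing a suitable communication structure, splitting messages into multiple parts, and sending the parts in parallel over different wide-area links.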
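The idea of splitting one message across several wide-area links can be sketched with a deliberately simple cost model. This is not the paper's parameterized LogP model: it assumes each link is described only by a fixed latency and a bandwidth, so that sending `s` bytes takes `latency + s / bandwidth`, and it picks shares so that all links in use finish at the same time. The function name and the link values are hypothetical.

```python
def split_message(size, links):
    """Split `size` bytes over links given as (latency_s, bytes_per_s).

    Returns (shares, finish_time). Each used link i gets
    shares[i] = bandwidth_i * (finish_time - latency_i), so all used
    links finish together. Links whose latency alone is at least the
    finish time are left unused.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not links:
        raise ValueError("at least one link is required")

    active = list(range(len(links)))
    while True:
        total_bw = sum(links[i][1] for i in active)
        weighted_latency = sum(links[i][0] * links[i][1] for i in active)
        finish = (size + weighted_latency) / total_bw
        # A link with latency >= finish would need a negative share.
        too_slow = {i for i in active if links[i][0] >= finish}
        if not too_slow:
            break
        active = [i for i in active if i not in too_slow]

    shares = [0.0] * len(links)
    for i in active:
        latency, bandwidth = links[i]
        shares[i] = bandwidth * (finish - latency)
    return shares, finish


if __name__ == "__main__":
    # Two illustrative links: (latency in seconds, bandwidth in bytes/s).
    shares, finish = split_message(400.0, [(0.0, 100.0), (0.0, 300.0)])
    print(shares, finish)
```

With equal latencies the shares are simply proportional to bandwidth. A high-latency link is dropped when the message is small enough that the other links finish before its latency has elapsed.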

14.
At present, commercial parallel computer systems with distributed-memory architectures usually come with parallel FORTRAN or parallel C compilers, which are just traditional sequential FORTRAN or C compilers extended with communication statements. Programmers suffer from having to write parallel programs with explicit communication statements. The Shared Variable Oriented Parallel Precompiler (SVOPP) proposed in this paper automatically generates appropriate communication statements from shared variables for the SPMD (Single Program Multiple Data) computation model and greatly eases parallel programming while keeping communication efficiency high. The core function of the parallel C precompiler has been successfully verified on a transputer-based parallel computer. Its prominent performance shows that SVOPP is probably a breakthrough in parallel programming technique.

15.
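The efficiency of parallel information transmission is an important indicator of mobile communication network performance. To raise the information transfer rate of mobile networks, this paper studies the effect of communication delay on parallel information transmission. Based on mobile network communication theory and the IEEE 802.11 protocol, it analyzes the parallel communication process, builds a mobile communication network model, derives parameters such as serial delay, the serial delay of "data packets", routing delay, and propagation delay, obtains the constraint relationship between communication delay and parallel transmission efficiency, and analyzes the effect of each delay parameter on the parallel communication rate. Simulation results show that the parallel transmission rate changes as communication delay changes, so reducing communication delay is the most effective way to raise the parallel transmission rate.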

16.
In this paper, we propose a novel parallel 3D Delaunay triangulation algorithm for large-scale simulations on parallel computers. Our method keeps the 3D boundary representation model information throughout the parallel triangulation process, so that the solid model information can be accessed dynamically and the mesh approximates the model boundary more closely as the meshing scale increases. The model is first coarsely meshed and distributed across CPUs, with consistently partitioned shared interfaces and partitioned model boundary meshes across processors. The domain partition aims to minimize edge cuts across processors, which minimizes communication cost, and to distribute roughly equal numbers of mesh vertices for load balance. A parallel multi-scale surface mesh refinement phase is then performed iteratively to meet the mesh density criteria, followed by a parallel surface mesh optimization phase that moves vertices to the model boundary to fit the model's geometric features dynamically. A dynamic load balancing algorithm changes the partition interfaces if necessary. Finally, a 3D local non-Delaunay mesh repair algorithm is applied on the shared interfaces across processors and on the model boundaries. The experimental results demonstrate that our method achieves high parallel performance and perfect scalability while preserving model boundary features and generating a high-quality 3D Delaunay mesh.

17.
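In earlier parallel volume rendering algorithms based on object-space partitioning, local rendering and image compositing were two serial stages. During local rendering on the nodes there is almost no data communication, but during compositing the communication volume is very large, causing bus contention and even communication blocking, and the synchronization overhead in this stage is very high. This paper uses a pipeline structure so that local volume rendering and image compositing execute in parallel, which resolves these drawbacks. A new parallel volume rendering algorithm based on object-space partitioning is implemented on a PC-based pipeline architecture.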

18.
We present buffered coscheduling, a new methodology to multitask parallel jobs in a message-passing environment and to develop parallel programs that can pave the way to the efficient implementation of a distributed operating system. Buffered coscheduling is based on three innovative techniques: communication buffering, strobing, and non-blocking communication. By leveraging these techniques, we can perform effective optimizations based on the global status of the parallel machine rather than on the limited knowledge available locally to each processor.

The advantages of buffered coscheduling include higher resource utilization, reduced communication overhead, efficient implementation of flow-control strategies and fault-tolerant protocols, accurate performance modeling, and a simplified yet still expressive parallel programming model which offloads many resource-management tasks to the operating system. Preliminary experimental results show that buffered coscheduling is very effective in increasing the overall performance in the presence of load imbalance and communication-intensive workloads and is relatively insensitive to the local process scheduling strategy.

19.
D.A.  P.D. 《Performance Evaluation》2005,60(1-4):165-187
We present a new performance modeling system for message-passing parallel programs that is based around a Performance Evaluating Virtual Parallel Machine (PEVPM). We explain how to develop PEVPM models for message-passing programs using a performance directive language that describes a program's serial segments of computation and message-passing events. This is a novel bottom-up approach to performance modeling, which aims to accurately model when processing and message-passing occur during program execution. The times at which these events occur are dynamic, because they are affected by network contention and data dependencies, so we use a virtual machine to simulate program execution. This simulation is done by executing models of the PEVPM performance directives rather than executing the code itself, so it is very fast. The simulation is still very accurate because enough information is stored by the PEVPM to dynamically create detailed models of processing and communication events. Another novel feature of our approach is that the communication times are sampled from probability distributions that describe the performance variability exhibited by communication subject to contention. These performance distributions can be empirically measured using a highly accurate message-passing benchmark that we have developed. This approach provides a Monte Carlo analysis that can give very accurate results for the average and the variance (or even the probability distribution) of program execution time. In this paper, we introduce the ideas underpinning the PEVPM technique, describe the syntax of the performance modeling language and the virtual machine that supports it, and present some results for example parallel programs to show the power and accuracy of the methodology.
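The Monte Carlo idea described above, sampling communication times from measured distributions and repeating the simulated run many times, can be sketched in a heavily simplified form. The sketch below is not the PEVPM directive language or virtual machine: it assumes a single sequence of events with fixed compute times and a list of measured samples per message, and it ignores contention and data dependencies between processes. All names and numbers are hypothetical.

```python
import random
import statistics


def simulate_once(events, rng):
    """Return one simulated execution time for a sequence of events.

    Each event is ("compute", seconds) or ("comm", [measured samples]).
    A communication time is drawn uniformly from its measured samples,
    i.e. from the empirical distribution.
    """
    total = 0.0
    for kind, value in events:
        if kind == "compute":
            total += value
        elif kind == "comm":
            total += rng.choice(value)
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return total


def monte_carlo(events, trials=1000, seed=0):
    """Estimate the mean and variance of execution time over many runs."""
    rng = random.Random(seed)
    times = [simulate_once(events, rng) for _ in range(trials)]
    return statistics.mean(times), statistics.pvariance(times)


if __name__ == "__main__":
    # Hypothetical program: compute, send, compute, send.
    program = [
        ("compute", 3.0),
        ("comm", [0.9, 1.0, 1.1, 2.5]),  # made-up benchmark samples
        ("compute", 1.5),
        ("comm", [0.4, 0.5, 0.6]),
    ]
    mean, variance = monte_carlo(program, trials=2000, seed=42)
    print(mean, variance)
```

Keeping the list of simulated times, rather than only the mean and variance, would also give an empirical distribution of total execution time.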
