33 results found (search time: 15 ms)
31.
32.
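The HYCOM (hybrid coordinate ocean model) numerical ocean model demands high data throughput with a relatively small amount of computation, which poses a major challenge for parallel algorithm design. For the high-throughput ocean data assimilation problem, a parallel optimization algorithm based on domain decomposition is designed. First, a flexible file access method is proposed that efficiently reads large volumes of data from disk, avoids data access conflicts, and greatly reduces the frequency of disk seek operations. In addition, a communication-avoiding strategy is designed that greatly reduces inter-process communication at the cost of some extra computation. Finally, a pipeline-based communication strategy is proposed to achieve conflict-free message passing. Experimental results show that, compared with the baseline algorithm, overall performance improves by a factor of 5, with file reading 6 times faster and inter-process communication performance improved by a factor of 2.7.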
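The abstract does not describe how its communication-avoiding strategy works. As a rough illustration of the general trade-off it names (extra computation in exchange for less communication), the sketch below uses a common textbook technique, widening the halo of each subdomain so that several stencil sweeps can run from a single up-front copy. The 3-point averaging stencil and the function names `smooth` and `smooth_subdomain` are assumptions made for this example and are not taken from the paper.

```python
def smooth(values, steps):
    """Apply `steps` sweeps of a 3-point average; the two end values stay fixed."""
    cur = list(values)
    for _ in range(steps):
        nxt = cur[:]
        for i in range(1, len(cur) - 1):
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur = nxt
    return cur


def smooth_subdomain(values, lo, hi, steps):
    """Advance the owned cells [lo, hi) by `steps` sweeps from one halo copy.

    The subdomain is extended by `steps` cells on each side (clamped to the
    global array). Halo cells are recomputed redundantly and become stale
    from the outside in, but the stale region never reaches the owned cells.
    """
    n = len(values)
    ext_lo = max(0, lo - steps)
    ext_hi = min(n, hi + steps)
    # In a distributed run this slice would be the single halo exchange
    # that replaces `steps` separate neighbour exchanges (assumption).
    local = smooth(values[ext_lo:ext_hi], steps)
    return local[lo - ext_lo: hi - ext_lo]


if __name__ == "__main__":
    data = [float((i * 7) % 11) for i in range(20)]
    whole = smooth(data, 3)
    pieces = smooth_subdomain(data, 0, 10, 3) + smooth_subdomain(data, 10, 20, 3)
    print(max(abs(a - b) for a, b in zip(whole, pieces)))
```

Each subdomain here recomputes up to `steps` extra cells per side on every sweep; that redundant work is the price of skipping the per-sweep exchange.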
33.
In particle transport simulations, radiation effects are often described by the discrete ordinates (Sn) form of the Boltzmann equation. In each ordinate direction, the solution is computed by sweeping the radiation flux across the grid. A parallel Sn sweep on an unstructured grid can be explicitly modeled as a topological traversal of an equivalent directed acyclic graph (DAG), which is a data-driven algorithm. Its traditional design using the MPI model results in irregular communication of massive numbers of short messages, which cannot be handled efficiently by the MPI runtime. Meanwhile, in high-end HPC cluster systems, multicore has become the standard processor configuration of a single node. The traditional data-driven algorithm for Sn sweeps has not exploited the potential advantages of multi-threading on shared-memory multicore nodes. These advantages, however, as we shall demonstrate, could provide an elegant solution to the problems of the previous MPI-only design. In this paper, we give a new design of data-driven parallel Sn sweeps using hybrid MPI and Pthread programming, namely Sweep-H, to exploit the hierarchical parallelism of processes and threads. With special multi-threading techniques and a vertex scheduling policy, Sweep-H achieves more efficient communication and better load balance. We further present an analytical performance model for Sweep-H to reveal why and when it is advantageous over its MPI-only counterpart. On a 64-node multicore cluster system with 12 cores per node, 768 cores in total, Sweep-H achieves nearly linear scalability for moderate problem sizes, and better absolute performance than the previous MPI algorithm on more than 16 nodes (up to a twofold speedup on 64 nodes).
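The DAG view of a sweep can be sketched in a few lines. The example below is a serial, single-direction illustration using Kahn's topological-traversal algorithm: a cell becomes ready once all of its upwind neighbours have been solved. It is not Sweep-H itself (no MPI, threads, or scheduling policy), and the function name `sweep_order` and the edge-list input format are assumptions made for this example.

```python
from collections import deque


def sweep_order(num_cells, upwind_edges):
    """Return one valid sweep order for a single ordinate direction.

    `upwind_edges` is a list of (u, v) pairs meaning that cell v needs the
    outgoing flux of cell u before it can be solved (hypothetical format).
    """
    downstream = [[] for _ in range(num_cells)]
    pending = [0] * num_cells  # upwind neighbours not yet solved
    for u, v in upwind_edges:
        downstream[u].append(v)
        pending[v] += 1

    # Cells with no upwind dependency can be solved immediately.
    ready = deque(c for c in range(num_cells) if pending[c] == 0)
    order = []
    while ready:
        cell = ready.popleft()
        order.append(cell)  # a real solver would compute the cell flux here
        for nxt in downstream[cell]:
            pending[nxt] -= 1
            if pending[nxt] == 0:  # all inputs have arrived: data-driven trigger
                ready.append(nxt)

    if len(order) != num_cells:
        raise ValueError("dependency graph contains a cycle")
    return order


if __name__ == "__main__":
    # Four cells: 0 feeds 1 and 2, which both feed 3.
    print(sweep_order(4, [(0, 1), (0, 2), (1, 3), (2, 3)]))
```

In a distributed setting, edges that cross a partition boundary would turn into messages between processes, which is where the many short, irregular messages mentioned in the abstract come from.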