Found 19 similar documents; search time: 78 ms
1.
2.
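This work applies parallel computing to topology checking of large volumes of simple-feature-model polygons, and designs and implements a parallel algorithm for that task. Based on the computational characteristics of topology checking, the algorithm refines the master-slave parallel strategy: the master process is further divided into threads to achieve task parallelism, which hides the time spent extracting topology errors and writing results. MPI and PThread are used to combine processes and threads. The algorithm was tested on land-use parcel data from the current land-use survey of five cities in southern Jiangsu. The tests show that the algorithm checks the topology of large simple-feature-model polygon datasets accurately and quickly. The combined process-and-thread task-parallel strategy improves the speedup by about 20% over the traditional master-slave strategy.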
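The idea of giving the master an extra thread so that result writing overlaps with checking can be sketched as follows. This is a minimal illustration, not the paper's implementation: a Python thread pool stands in for the MPI worker processes, `check_ring` is a hypothetical placeholder check (closed ring with at least 4 points) rather than the paper's topology rules, and appending to a list stands in for writing error records to disk.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def check_ring(item):
    """Placeholder check (assumption): a ring must be closed and have >= 4 points."""
    ring_id, ring = item
    ok = len(ring) >= 4 and ring[0] == ring[-1]
    return ring_id, ok


def writer(results, sink):
    """Runs in the master's extra thread: drains results until a None sentinel arrives."""
    while True:
        record = results.get()
        if record is None:
            break
        sink.append(record)  # stands in for writing an error record to disk


def run_checks(rings, workers=4):
    """Dispatch checks to workers; hand failures to the writer thread as they arrive."""
    results = queue.Queue()
    sink = []
    writer_thread = threading.Thread(target=writer, args=(results, sink))
    writer_thread.start()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(check_ring, item) for item in rings.items()]
        for future in as_completed(futures):
            ring_id, ok = future.result()
            if not ok:
                results.put(ring_id)  # hand off; the master keeps collecting
    results.put(None)  # tell the writer there is nothing more to write
    writer_thread.join()
    return sorted(sink)
```

Because the master only enqueues failures and the writer thread drains the queue independently, slow output does not stall the collection of results from workers, which is the effect the abstract attributes to its thread-level task parallelism.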
3.
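Large eddy simulation (LES) using the pseudo-spectral method is accurate and efficient, but at high Reynolds numbers the computational cost is still very large and parallel methods are required. However, parallel fast Fourier transform (FFT) algorithms are very difficult to apply in practice. To address this problem, a new MPI-based parallel method for pseudo-spectral LES is proposed. Case studies verify that the method is accurate, easy to implement, and robust, and that it greatly increases computing speed and saves computing time, which is significant for the wide use of LES in engineering.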
4.
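Cluster systems built from high-end microcomputers or RISC workstations connected by a high-speed local area network have brought high-performance computing out of research and specialized application fields and into general use. This paper describes how to build a Beowulf parallel computing system from ordinary PCs under the Linux operating system, based on a distributed-memory architecture. The parallel efficiency of the Beowulf system was measured with a purpose-written parallel algorithm; the results show that the system achieves high parallel efficiency and parallel speedup.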
5.
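Distributed parallel computing environment: MPI  Total citations: 3, self-citations: 0, citations by others: 3
1 Introduction. Over the past few decades, the availability of large-scale and very-large-scale parallel machines has improved considerably. For various reasons, most of these machines use a distributed-memory or distributed-shared-memory architecture. To give users the necessary support, vendors developed their own proprietary message-passing packages or libraries, such as Intel's NX, IBM's EUI, Parasoft's Express, and Oak Ridge's PVM. These provide similar functionality and perform very well on their particular platforms, but in application programs…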
6.
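Building on a summary of the features common to various tunneling algorithms, and starting from the parallelizability of the tunneling phase, this paper proposes an agent-based distributed parallel tunneling algorithm and presents the system prototype and the agent model design. Multithreading within each agent further increases the algorithm's parallelism. Numerical experiments demonstrate the algorithm's feasibility, scalability, and parallel efficiency.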
7.
8.
9.
10.
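The Welch algorithm is a classic and widely used power spectrum estimation algorithm. Faced with today's ever-growing volumes of data, however, running the Welch algorithm in a purely serial Matlab environment inevitably takes a great deal of computing time. Matlab does offer a Parallel Computing Toolbox, but it is expensive, which hinders wide adoption. Following the principle of the Welch algorithm, an open-source parallel Welch algorithm, PMWelch, was implemented in a Linux cluster environment with the Message Passing Interface (MPI), using the master-slave parallel programming model. Experimental results show that PMWelch produces the same results as the Welch algorithm in Matlab and substantially reduces running time.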
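Welch's method lends itself to a master-slave split because each segment's periodogram is independent of the others. The sketch below illustrates that structure only; it is not PMWelch. It assumes a Python thread pool in place of MPI processes, a symmetric Hann window, a naive O(n²) DFT instead of an FFT, and hypothetical default values for `seg_len`, `overlap`, and `workers`.

```python
import cmath
import math
from concurrent.futures import ThreadPoolExecutor


def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def periodogram(segment, window):
    """Worker task: windowed periodogram of one segment via a naive DFT."""
    n = len(segment)
    xw = [s * w for s, w in zip(segment, window)]
    scale = sum(w * w for w in window)  # normalise by window energy
    spectrum = []
    for k in range(n // 2 + 1):  # one-sided spectrum
        acc = sum(xw[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2 / scale)
    return spectrum


def welch(signal, seg_len=64, overlap=32, workers=4):
    """Master: cut overlapping segments, farm them out, average the periodograms."""
    step = seg_len - overlap
    window = hann(seg_len)
    segments = [signal[i:i + seg_len]
                for i in range(0, len(signal) - seg_len + 1, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        spectra = list(pool.map(lambda seg: periodogram(seg, window), segments))
    # Average bin by bin across all segments.
    return [sum(column) / len(spectra) for column in zip(*spectra)]
```

For a quick check, a sinusoid with 8 cycles per 64 samples, for example `[math.sin(2 * math.pi * 8 * t / 64) for t in range(512)]`, should give an estimate whose largest bin is index 8. In CPython the thread pool shows the task structure rather than a real speedup; an MPI or process-based version would distribute the same per-segment tasks.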
11.
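Based on the adaptive spline wavelet alternating-direction (SW-ADI) method for solving reaction-diffusion equations, the serial program was parallelized directly using two parallel programming models, MPI and OpenMP, and the equations were solved with both MPI and OpenMP on the Ziqiang 2000 high-performance computer at Shanghai University. The results are analyzed, and the parallel speedup relative to the serial program is given.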
12.
Heterogeneous network-based distributed and parallel computing is gaining increasing acceptance as an alternative or complementary paradigm to multiprocessor-based parallel processing as well as to conventional supercomputing. While algorithmic and programming aspects of heterogeneous concurrent computing are similar to their parallel processing counterparts, system issues, partitioning and scheduling, and performance aspects are significantly different. In this paper, we discuss the evolution of heterogeneous concurrent computing, in the context of the parallel virtual machine (PVM) system, a widely adopted software system for network computing. In particular, we highlight the system level infrastructures that are required, aspects of parallel algorithm development that most affect performance, system capabilities and limitations, and tools and methodologies for effective computing in heterogeneous networked environments. We also present recent developments and experiences in the PVM project, and comment on ongoing and future work.
13.
14.
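Based on a study of the characteristics of the Range Doppler imaging algorithm, a pipeline-based parallel imaging algorithm for synthetic aperture radar (SAR) is proposed. The algorithm is written in C/MPI and was successfully implemented on a 32-node IBM PC cluster. Comparison with an established general-purpose parallel imaging algorithm shows that the pipeline-based parallel algorithm is better suited to parallel SAR imaging and delivers higher parallel efficiency.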
15.
Optimal design of truss structures using parallel computing  Total citations: 1, self-citations: 0, citations by others: 1
P.K. Umesha M.T. Venuraju D. Hartmann K.R. Leimbach 《Structural and Multidisciplinary Optimization》2005,29(4):285-297
Parallel design optimization of large structural systems calls for a multilevel approach to the optimization problem. The general optimization problem is decomposed into a number of non-interacting suboptimization problems on the first level. They are controlled from the second level through coordination variables. Thus, the solutions of the independent first-level subsystems are directed towards the overall system optimum. In the present paper, optimal design of truss structures using parallel computing technique is described. In this method, optimization of a large truss structure has been carried out by decomposing the structure into sub-domains and suboptimization tasks. Each sub-domain has independent design variables and a small number of behaviour constraints. The two-level sub-domain optimum design approach is summarized by several numerical examples with speedups and efficiencies of algorithms on message passing systems. It has been noticed that the efficiency of the algorithm for design optimization increases with the size of the structure.
16.
Lei Pan Ming Kin Lai Koji Noguchi Javid J. Huseynov Lubomir F. Bic Michael B. Dillencourt 《International journal of parallel programming》2004,32(1):1-37
Message Passing (MP) and Distributed Shared Memory (DSM) are the two most common approaches to distributed parallel computing. MP is difficult to use, whereas DSM is not scalable. Performance scalability and ease of programming can be achieved at the same time by using navigational programming (NavP). This approach combines the advantages of MP and DSM, and it balances convenience and flexibility. Similar to MP, NavP suggests to its programmers the principle of pivot-computes and hence is efficient and scalable. Like DSM, NavP supports incremental parallelization and shared variable programming and is therefore easy to use. The implementation and performance analysis of real-world algorithms, namely parallel Jacobi iteration and parallel Cholesky factorization, presented in this paper supports the claim that the NavP approach is better suited for general-purpose parallel distributed programming than either MP or DSM.
17.
18.
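Testing the communication libraries used for parallel computing plays an important role in parallel computing systems. Such libraries are usually tested with test programs that exercise one or a few parts of the library in isolation. However, many library bugs cannot be found by isolated testing; they surface only when several parts of the library run together in some complex, tightly coupled combination, and such combinations are hard to construct when designing tests for the library itself. Two new testing methods are proposed that exploit the layered structure of communication libraries: portable test programs for an upper-layer library are used to test the lower-layer library. An upper-layer test program can be viewed as an application of the lower-layer library, but unlike a typical application it covers almost every part of the lower-layer library and combines those parts so that complex behavior arises at run time, behavior that the lower-layer library's own test programs often fail to reach. Bugs that escape the lower-layer test programs may thus be exposed.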
19.
Mohammad R. Hajihashemi Magda El-Shenawee 《Journal of Parallel and Distributed Computing》2010
A parallelized version of the level-set algorithm based on the MPI technique is presented. TM-polarized plane waves are used to illuminate two-dimensional perfect electric conducting targets. A variety of performance measures such as the efficiency, the load balance, the weak scaling, and the communication/computation times are discussed. For electromagnetic inverse scattering problems, retrieving the target’s arbitrary shape and location in real time is considered as a main goal, even as a trade-off with algorithm efficiency. For the three cases considered here, a maximum speedup of 53X-84X is achieved when using 256 processors. However, the overall efficiency of the parallelized level-set algorithm is 21%–33% when using 256 processors and 26%–52% when using 128 processors. The effects of the bottlenecks of the level-set algorithm on the algorithm efficiency are discussed.
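The speedup and efficiency figures quoted in this abstract are consistent with the usual definition of parallel efficiency as speedup divided by processor count. The short sketch below checks that arithmetic; the function names are illustrative, and the assumption that the paper uses exactly this definition is ours.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Efficiency under the standard definition E = S / p."""
    return speedup / processors


def implied_speedup(efficiency: float, processors: int) -> float:
    """Speedup implied by a given efficiency, S = E * p."""
    return efficiency * processors


# Reported speedups on 256 processors: 53x and 84x.
for s in (53, 84):
    e = parallel_efficiency(s, 256)
    print(f"speedup {s}x on 256 processors -> efficiency {e:.0%}")
```

Under that definition, 53/256 and 84/256 round to 21% and 33%, matching the reported range. The 128-processor speedups are not stated in the abstract; `implied_speedup(0.26, 128)` and `implied_speedup(0.52, 128)` give roughly 33x and 67x as the values implied by the quoted efficiencies.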