Similar literature
1.
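Based on distributed parallel computer systems, this paper gives three task-partitioning strategies for a class of multilayer, two-dimensional, two-phase-flow reservoir numerical simulation problems: the "rolling curtain" ("卷帘") scheme, the domain decomposition scheme, and a scheme combining the two. The strategies are compared, and a task-partitioning method is proposed that reduces solution time, favors load balancing, and improves parallel performance. The method is applied to a large-scale reservoir simulation problem with up to 720,000 grid nodes. Computational results show that the parallel solution tasks produced by this partitioning are balanced, which helps improve speedup. The method also applies to task-partitioning problems in domain or data parallelism.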

2.
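Parallel task partitioning based on load balancing for multilayer reservoir problems   Cited by: 1 (self-citations: 0, citations by others: 1)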
舒继武  赵金熙  周维四  张德富 《软件学报》1999,10(10):1061-1066
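Based on distributed parallel computer systems, this paper gives three task-partitioning strategies for a class of multilayer, two-dimensional, two-phase-flow reservoir numerical simulation problems: the "rolling curtain" ("卷帘") scheme, the domain decomposition scheme, and a scheme combining the two. The strategies are compared, and a task-partitioning method is proposed that reduces solution time, favors load balancing, and improves parallel performance. The method is applied to a large-scale reservoir simulation problem with up to 720,000 grid nodes. Computational results show that the parallel solution tasks produced by this partitioning are balanced, which helps improve speedup. The method also applies to task-partitioning problems in domain or data parallelism.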

3.
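To improve the efficiency of solving the maximum clique problem on large-scale graph instances on big-data platforms, a maximum clique identification algorithm based on parallel constraint programming is proposed. The BMT graph partitioning strategy splits a complex graph instance into several independently computable subgraphs, which are assigned to compute nodes in a Spark cluster. Each compute node models and solves its subproblems with constraint programming, thereby parallelizing the maximum clique problem. A time prediction model is introduced, and a parallel graph partitioning method based on a task running-time prediction model is designed, which effectively addresses load balancing across compute nodes. Experimental results show that, compared with the parallel maximum clique identification algorithm based on the BMC graph partitioning strategy, this algorithm achieves higher solving efficiency and near-linear speedup.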

4.
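Given the distributed nature of both the tasks and the agents in the multi-agent system (MAS) task allocation problem, the problem is formalized as a distributed constraint satisfaction problem (DCSP) and solved as such. Two MAS task allocation models are built, one task-centered and one agent-centered, and, based on an improved distributed parallel DCSP solving algorithm, a DCSP-based framework for solving the MAS task allocation problem is proposed. The method suits problems with random communication delays between agents and with multiple constraints among agents. An application example demonstrates its practicality and effectiveness.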

5.
朱庆保 《计算机工程》2005,31(1):157-159
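To improve the convergence speed of ant colony optimization, a parallel ant colony optimization algorithm based on a coarse-grained model is studied. The algorithm divides the search task among q sub-colonies, which carry out the search in parallel, greatly increasing search speed. Experimental results show that, when solving the TSP, the algorithm converges more than a hundred times faster than the latest improved algorithms.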

6.
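In the parallel solution of partial differential equations, one key issue is mesh partitioning: it must give each process an equal computational load and also good partition quality, so as to reduce interprocess communication. During adaptive finite element computation, the mesh and basis functions are adjusted continually, which causes load imbalance, so the mesh distribution must be adjusted dynamically to achieve dynamic load balancing. This paper studies different load-balancing methods and implements them in the parallel adaptive finite element platform PHG. Numerical experiments show that our dynamic load-balancing algorithm yields high partition quality and runs fast, partitioning the mesh effectively and reducing run time.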

7.
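Effective parallelization of any algorithm requires a thorough understanding of the details of the computation and of the dependencies among the data involved, together with a sound task-partitioning and algorithm-decomposition scheme tailored to the type of computation and the constraints of the application. This paper therefore first introduces the computational process of the particle tracing algorithm and analyzes the feasibility of parallelizing it. Starting from the data dependencies in the computation, we give a concrete parallel rendering model and design a feasible parallel partitioning strategy. Finally, the parallel strategy is tested, verifying the correctness and feasibility of the design.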

8.
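A new method, ParaClustalW, is proposed for parallelizing the ClustalW multiple sequence alignment algorithm. It uses a desktop grid computing platform as the high-performance programming environment and runtime platform. The task-partitioning scheme, parallelization strategy, and implementation techniques for multiple sequence alignment on the desktop grid platform are analyzed. The ParaClustalW strategy takes into account factors such as the number and length of sequences to achieve balanced task partitioning. Experiments show that Para...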

9.
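Research and implementation of a visual parallel programming platform   Cited by: 4 (self-citations: 0, citations by others: 4)
To improve the user's parallel programming environment, a network-based visual parallel programming platform was developed. The platform represents a parallel program as a graph whose nodes are tasks and whose arcs are data dependencies between tasks. The user only needs to describe the parallel problem visually as a graph. Task scheduling and inter-task communication are handled automatically by the system, which greatly simplifies parallel programming.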

10.
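A randomized algorithm for graph K-partitioning   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper proposes a simulated evolutionary annealing model for the hard problem of graph K-partitioning. The model combines simulated annealing with simulated evolution, realizing a multi-objective parallel optimization strategy. Theoretical analysis and experimental results show that the simulated evolutionary annealing model performs better and yields more highly optimized solutions.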

11.
Dinning  A. 《Computer》1989,22(7):66-77
This article examines how traditional synchronization methods influence the design of MIMD (multiple-instruction, multiple-data-stream) multiprocessors. The author provides an overview of MIMD multiprocessing and goes on to discuss semaphore-based implementations (Ultracomputers, Cedar, and the Sequent Balance/21000), monitor-based implementations (the HM2p), and implementations based on message passing (HEP, the BBN Butterfly, and the Transputer).

12.
This paper describes the design and implementation of Panorama, a parallel debugger for MIMD message-passing computers. Programmers can readily adapt Panorama to new parallel platforms and extend it to include their own views of a target program. The system comes with three built-in graphical program views, and it also includes a software tool to help programmers design and implement new views. Panorama avoids detailed dependence on target architectures by using the base debugger supplied by each hardware vendor to carry out low-level debugging tasks such as setting breakpoints and examining data. Since the interfaces and capabilities of base debuggers vary, we have developed a strategy that models interactions between Panorama and base debuggers. The model separates general-purpose code from the special-case functions that handle specific debugger characteristics. The resulting system is easy to adapt and free from the clutter of conditionally executed, special-case code.

13.
Research and implementation of a task-based global parallel algorithm for robots   Cited by: 3 (self-citations: 0, citations by others: 3)
沈悦明  陈启军 《机器人》2003,25(6):495-500
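This paper proposes a task-based global parallel algorithm for robots. Using a master-slave MIMD parallel processing platform, the basic computational tasks of robot control, namely kinematics, dynamics, and the control law, are each partitioned into subtasks, and the resulting subtasks are scheduled dynamically and globally through a work pool. Using pipelining and a centralized dynamic scheduling strategy, parallel real-time simulation experiments on a planar robot were carried out on a homogeneous, loosely coupled MIMD parallel processing platform made up of five DSP processors, achieving satisfactory parallel performance.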

14.
This paper describes and illustrates a structured programming metalanguage (DPOS) and graphical programming environment for generating and debugging high-level distributed MIMD parallel programs. DPOS introduces an innovative message-passing model and a recursive graphical definition of parallel process networks. It also provides programming and debugging at the metalanguage level that is portable across implementation languages. The initial development focus of DPOS is to provide a parallel development system for Lisp-based, symbolic and artificial intelligence programs as part of the MAYFLY parallel processing project. The DPOS environment also generates source code and provides a simulation system for graphical debugging and animation of the programs in graph form.

15.
The Proteus architecture is a highly parallel, multiple instruction, multiple data (MIMD) machine optimized for large-granularity tasks such as machine vision and image processing. The system can achieve 20 gigaflops (80 gigaflops peak). It accepts data via multiple serial links at a rate of up to 640 MB/s. The system employs a hierarchical reconfigurable interconnection network with the highest level being a circuit-switched enhanced hypercube, serial interconnection network for internal data transfers. The system is designed to use 256 to 1024 RISC processors. The processors use 1-MB external read/write allocating caches for reduced multiprocessor contention. The system detects, locates, and replaces faulty subsystems using redundant hardware to facilitate fault tolerance. The parallelism is directly controllable through an advanced software system for partitioning, scheduling, and development. System software includes a translator for the INSIGHT language, a parallel debugger, low- and high-level simulators, and a message-passing system for all control needs. Image-processing application software includes a variety of point operators, neighborhood operators, convolution, and the mathematical morphology operations of binary and gray-scale dilation, erosion, opening, and closing.

16.
In this paper we describe a new algorithm for maintaining a balanced search tree on a message-passing MIMD architecture; the algorithm is particularly well suited for implementation on a small number of processors. We introduce a (2B-2, 2B) search tree that uses a bidirectional ring of O(log n) processors to store n entries. Update operations use a bottom-up node-splitting scheme, which performs significantly better than top-down search tree algorithms. The bottom-up algorithm requires many fewer messages and results in less blocking due to synchronization than top-down algorithms. Additionally, for a given cost ratio of computation to communication, the value of B may be varied to maximize performance. Implementations on a parallel-architecture simulator are described.

17.
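Software for automatically generating parallel simulation tasks   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper describes software that automatically generates multiple parallel simulation tasks from a continuous-system simulation model, the state-equation model. The software automates the process from generating parallelizable tasks from the simulation model through to outputting the simulation results. The generated parallel simulation tasks are balanced and the parallel speedup is high. The software is suited to homogeneous MIMD systems based on inter-machine communication.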

18.
Averbuch  A.  Epstein  B.  Ioffe  L.  Yavneh  I. 《The Journal of supercomputing》2000,17(2):123-142
We present an efficient parallelization strategy for speeding up the computation of a high-accuracy 3-dimensional serial Navier-Stokes solver that treats turbulent transonic high-Reynolds flows. The code solves the full compressible Navier-Stokes equations and is applicable to realistic large-size aerodynamic configurations, and as such requires huge computational resources in terms of computer memory and execution time. The solver can resolve the flow properly on relatively coarse grids. Since the serial code contains a complex infrastructure typical of industrial code (which ensures its flexibility and applicability to complex configurations), the parallelization task is not straightforward. We get a scalable implementation on massively parallel machines by maintaining efficiency at a fixed value while simultaneously increasing the number of processors and the size of the problem. The 3-D Navier-Stokes solver was implemented on three MIMD message-passing multiprocessors (a 64-processor IBM SP2, a 20-processor MOSIX, and a 64-processor Origin 2000). The same code written with the PVM and MPI software packages was executed on all the above distinct computational platforms. The examples in the paper demonstrate that we can achieve efficiency of about 60% for as many as 64 processors on the Origin 2000 on a full-size 3-D aerodynamic problem which is solved on realistic computational grids.
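To put the quoted efficiency figure in context, here is a minimal sketch assuming the standard textbook definitions of speedup, S = T_serial / T_parallel, and parallel efficiency, E = S / p. The abstract does not state which definitions the authors use, and the function names and the timing values below are hypothetical, chosen only for illustration.

```python
def parallel_efficiency(serial_time: float, parallel_time: float, processors: int) -> float:
    """Efficiency E = (T_serial / T_parallel) / p, the standard textbook definition."""
    return serial_time / (parallel_time * processors)


def speedup_from_efficiency(efficiency: float, processors: int) -> float:
    """Speedup implied by an efficiency figure: S = E * p."""
    return efficiency * processors


# Figures quoted in the abstract: about 60% efficiency on 64 processors.
implied_speedup = speedup_from_efficiency(0.60, 64)
print(f"implied speedup: {implied_speedup:.1f}")

# Hypothetical timings (not from the paper): 100 s serial, 2.5 s on 64 processors.
example_efficiency = parallel_efficiency(100.0, 2.5, 64)
print(f"example efficiency: {example_efficiency:.3f}")
```

Under these assumed definitions, 60% efficiency on 64 processors corresponds to a speedup of roughly 38 over the serial code.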

19.
This paper presents the results of parallelizing a three-dimensional Navier-Stokes solver on a 32K-processor Thinking Machines CM-2, a 128-node Intel iPSC/860, and an 8-processor CRAY Y-MP. The main objective of this work is to study the performance of the flow solver, INS3D-LU code, on two distributed-memory machines, a massively parallel SIMD machine (CM-2) and a moderately parallel MIMD machine (iPSC/860), and compare it with its performance on a shared-memory MIMD machine with a small number of processors (Y-MP). The code is based on a Lower-Upper Symmetric-Gauss-Seidel implicit scheme for the pseudocompressibility formulation of the three-dimensional incompressible Navier-Stokes equations. The code was rewritten in CMFORTRAN with shift operations and run on the CM-2 using the slicewise model. The code was also rewritten with distributed data and Intel message-passing calls and run on the iPSC/860. The timing results for two grid sizes are presented and analyzed using both 32-bit and 64-bit arithmetic. Also, the impact of communication and load balancing on the performance of the code is outlined. The results show that reasonable performance can be achieved on these parallel machines. However, the CRAY Y-MP outperforms the CM-2 and iPSC/860 for this particular algorithm. The author is an employee of Computer Sciences Corporation. This work was funded through NASA Contract NAS 2-12961.

20.
We first define the basic notions of local and non-local tasks for distributed systems. Intuitively, a task is local if, in a system with no failures, each process can compute its output value locally by applying some local function on its own input value (so the output value of each process depends only on the process's own input value, not on the input values of the other processes); a task is non-local otherwise. All the interesting distributed tasks, including all those that have been investigated in the literature (e.g., consensus, set agreement, renaming, atomic commit, etc.) are non-local. In this paper we consider non-local tasks and determine the minimum information about failures that is necessary to solve such tasks in message-passing distributed systems. As part of this work, we also introduce weak set agreement, a natural weakening of set agreement, and show that, in some precise sense, it is the weakest non-local task in message-passing systems.
