期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Performance evaluation of noncontiguous allocation algorithms for 2D mesh interconnection networks

Saad Bani-MohammadAuthor Vitae Ismail Ababneh^{Author Vitae} 《Journal of Systems and Software》2011,84(12):2156-2170

Several noncontiguous allocation strategies have been proposed for 2D mesh-connected multicomputers. These allocation strategies differ in their ability to detect free submeshes and in the degree of contiguity that exists among the submeshes they allocate to the same job. The previous Adaptive Noncontiguous Allocation (ANCA) policy was evaluated based on a proposed formula that estimates the job execution time when the job is allocated noncontiguous submeshes. Using this formula, simulation results had shown that ANCA could outperform the preceding Multiple Buddy Strategy (MBS). However, the execution times of jobs under noncontiguous allocation depend on message sizes, the number of messages sent, message contention and distances messages traverse. In this paper, we evaluate ANCA for different communication patterns using an event-driven simulator operating at the flit level, which allows for a more realistic evaluation that takes into account the shape of allocation and contention among messages. Moreover, we compare the performance of ANCA with that of other noncontiguous allocation strategies, MBS, Greedy Available Busy List (GABL), and Paging(0). The results show that ANCA is inferior to the remaining strategies, and that GABL has the best performance results, expressed in terms of the average turnaround time and mean system utilization performance parameters. 相似文献

2.

An efficient non-contiguous processor allocation strategy for 2D mesh connected multicomputers

S. Bani-Mohammad M. Ould-Khaoua 《Information Sciences》2007,177(14):2867-2883

In non-contiguous allocation, a job request can be split into smaller parts that are allocated possibly non-adjacent free sub-meshes rather than always waiting until a single sub-mesh of the requested size and shape is available. Lifting the contiguity condition is expected to reduce processor fragmentation and increase system utilization. However, the distances traversed by messages can be long, and as a result the communication overhead, especially contention, is increased. The extra communication overhead depends on how the allocation request is partitioned and assigned to free sub-meshes. In this paper, a new non-contiguous processor allocation strategy, referred to as Greedy-Available-Busy-List, is suggested for the 2D mesh network. Request partitioning in our suggested strategy is based on the sub-meshes available for allocation. To evaluate the performance improvement achieved by our strategy and compare it against well-known existing non-contiguous and contiguous strategies, we conduct extensive simulation runs under the assumption of wormhole routing and three communication patterns, notably one-to-all, all-to-all and random. The results show that the new strategy can reduce the communication overhead and substantially improve performance in terms of job turnaround time and system utilization. 相似文献

3.

Task migration in all-port wormhole-routed 2D mesh multicomputers

Nen-Chung Wang Tzung-Shi Chen 《Information Sciences》2006,176(22):3409-3425

In a mesh multicomputer, submeshes are allocated to perform jobs according to processor allocation schemes, with each task assigned to occupy processors of one submesh with an appropriate size. To assign regions for incoming tasks, task compaction is needed to produce a large contiguous free region. The overhead of task compaction relies mainly on designing an efficient task migration scheme. This paper investigates task migration schemes in 2D wormhole-routed mesh multicomputers with an all-port communication model. Two constraints are given between two submeshes for task migration. Two task migration schemes that follow one of the constraints in 2D mesh multicomputers are then developed. In addition, the proposed schemes are proven to be deadlock-free and congestion-free. Finally, performance analysis is adopted to compare the proposed task migration schemes. 相似文献

4.

A new processor allocation strategy with a high degree of contiguity in mesh-connected multicomputers

《Simulation Modelling Practice and Theory》2007,15(4):465-480

Relaxing the contiguity condition in non-contiguous allocation can reduce processor fragmentation and increase processor utilization. However, communication overhead could increase due to the potential increase in message distances. The communication overhead depends on how the allocation request is partitioned and allocated to free sub-meshes. In this paper, a new non-contiguous processor allocation strategy, referred to as Greedy-Available-Busy-List (GABL for short), is suggested for the mesh network, and is compared against the existing non-contiguous and contiguous allocation strategies. To demonstrate the performance gains achieved by our proposed strategy, we have conducted simulation runs under the assumption of wormhole routing technique. The results have revealed that the new strategy can reduce communication overhead and considerably improve performance in terms of the job turnaround time, system utilization, and jobs finish time. 相似文献

5.

On submesh allocation for mesh multicomputers: a best-fitallocation and a virtual submesh allocation for faulty meshes

Geunmo Kim Hyunsoo Yoon 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(2):175-185

The submesh allocation problem is to recognize and locate a free submesh that can accommodate a request for a submesh of a specified size. In this paper, we propose a new best-fit submesh allocation strategy for mesh-connected multiprocessor systems. The proposed strategy maintains and uses a free submesh list for an efficient allocation. For an allocation request, the strategy selects the best-fit submesh which causes the least amount of potential processor fragmentation. As many large free submeshes as possible are preserved for later allocations. For this purpose, we introduce a novel function quantifying the degree of potential fragmentation of submeshes. The proposed strategy has the capability of recognizing a complete submesh. We also propose an allocation strategy for faulty meshes which can maintain and allocate virtual submeshes derived from faulty submeshes. Extensive simulation is carried out to compare the proposed strategy with previous strategies. The proposed strategy has the best performance: a 6-50 percent improvement over the previous best strategy 相似文献

6.

Noncontiguous processor allocation algorithms for mesh-connectedmulticomputers

Lo V. Windisch K.J. Wanqian Liu Nitzberg B. 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(7):712-726

Current processor allocation techniques for highly parallel systems are typically restricted to contiguous allocation strategies for which performance suffers significantly due to the inherent problem of fragmentation. As a result, message-passing systems have yet to achieve the high utilization levels exhibited by traditional vector supercomputers. We are investigating processor allocation algorithms which lift the restriction on contiguity of processors in order to address the problem of fragmentation. Three noncontiguous processor allocation strategies-paging allocation, random allocation, and the Multiple Buddy Strategy (MBS)-are proposed and studied in this paper. Simulations compare the performance of the noncontiguous strategies with that of several well-known contiguous algorithms. We show that noncontiguous allocation algorithms perform better overall than the contiguous ones, even when message-passing contention is considered. We also present the results of experiments on an Intel Paragon XP/S-15 with 208 nodes that show noncontiguous allocation is feasible with current technologies 相似文献

7.

Task migration in n-dimensional wormhole-routed mesh multicomputers

《Journal of Systems Architecture》2004,50(4):177-192

In a mesh multicomputer, performing jobs needs to schedule submeshes according to some processor allocation scheme. In order to assign the incoming jobs to a free submesh, a task compaction scheme is needed to generate a larger contiguous free region. The overhead of compaction depends on the efficiency of the task migration scheme. In this paper, two simple task migration schemes are first proposed in n-dimensional mesh multicomputers with supporting dimension-ordered wormhole routing in one-port communication model. Then, a hybrid scheme which combines advantages of the two schemes is discussed. Finally, we evaluate the performance of all of these proposed approaches. 相似文献

8.

Submesh Allocation in Mesh Multicomputers Using Busy-List: A Best-Fit Approach with Complete Recognition Capability 总被引：1，自引：0，他引：1

Debendra Das Sharma Dhiraj K. Pradhan 《Journal of Parallel and Distributed Computing》1996,36(2):106

A new approach is proposed for dynamic submesh allocation in mesh-connected multicomputer system, which supports a multiuser environment. The proposed strategy effectively prunes the search space by searching for free submeshes on the corners of allocated (busy) submeshes along with the four corners of the mesh system. A submesh is selected with the potential of causing the least amount of fragmentation in the system. The proposed strategy possesses complete submesh recognition capability; it is a best-fit strategy, as well. Existing strategies do not provide this combination of capabilities. The deallocation time and memory overhead are shown to be constant in that they do not grow with the size of the mesh. To demonstrate effectiveness, the performance of the proposed strategy is compared against all existing schemes. Simulation results indicate that the proposed strategy outperforms existing ones in terms of parameters such as average delay in honoring a request, standard deviation of delay, average allocation time, average deallocation time, and amount of memory required. The proposed scheme achieves a 20 to 30% improvement in the average waiting delay over the best performing existing algorithm to date. Our scheme is shown to be applicable to torus-connected multicomputers as well, with only minor modifications. The scheme can also be used for submesh allocation with failures. 相似文献

9.

An adaptive job scheduling scheme for mesh-connected multicomputers

Ismail Ababneh Saad Bani-Mohammad Mohamed Ould-Khaoua 《The Journal of supercomputing》2010,53(1):5-25

Allocating submeshes to jobs in mesh-connected multicomputers in an FCFS fashion leads to poor system performance because a large job at the head of the waiting queue can prevent the allocation of free submeshes to other smaller waiting jobs. However, serving jobs aggressively out-of-order can lead to excessive waiting delays for large jobs located at the head of the waiting queue. In this paper, we show that the ability of the job scheduling algorithm to bypass the head of the waiting queue should increase with the load, and we propose a scheduling scheme that can bypass the waiting queue head in a load-dependent adaptive fashion. Also, giving priority to large jobs because they are more difficult to accommodate is investigated. The performance of the proposed scheme has been compared to that of FCFS, aggressive out-of-order scheduling, and other previous job scheduling schemes. Extensive simulation results based on synthetic workloads and real workload traces indicate that our scheduling strategy is a good strategy when both average and maximum job waiting delays are considered. In particular, it is substantially superior to FCFS in terms of mean turnaround times, and to aggressive out-of-order scheduling in terms of maximum waiting delays. 相似文献

10.

Task migration in three-dimensional meshes

Ava Bargi Hamid Sarbazi-Azad 《The Journal of supercomputing》2011,56(3):328-352

As a result of the emerging use of mesh-based multicomputers (and recently mesh-based multiprocessor systems-on-chip), issues related to processor management have attracted much attention. In a mesh-based multiprocessor, after repeated submesh allocations and de-allocations, the system network may be fragmented, i.e. there might be unallocated nodes in the network. As a result, in a system with contiguous processor allocation, no new tasks can start running due to the lack of enough free adjacent processors to form a suitable submesh. Although there might be enough free processors available, they remain idle until the allocator can find a set of adjacent free nodes forming a submesh to be used for the new task. This can lead to low system performance. Task migration was introduced as a solution to this problem through migration of tasks running on some submeshes to other free areas in order to reduce fragmentation by chaining the newly freed areas and disengaging nodes to form larger submeshes. In this paper, we propose a novel structured and formulated way to code task migration, which is helpful for congestion detection in different steps of task migration algorithms. Moreover, considering the fact that the 3D mesh-based multicomputers are now very popular, a new task migration algorithm in 3D meshes is proposed. We also address the special case of the 2D migration in a 3D mesh multicomputer. 相似文献

11.

A new window-based job scheduling scheme for 2D mesh multicomputers

Ismail Ababneh Saad Bani-Mohammad 《Simulation Modelling Practice and Theory》2011,19(1):482-493

Allocating submeshes to jobs in mesh-connected multicomputers in a FCFS fashion can lead to poor system performance (e.g., long job waiting delays) because the job at the head of the waiting queue can prevent the allocation of free submeshes to other waiting jobs with smaller submesh requirements. However, serving jobs aggressively out-of-order can lead to excessive waiting delays for jobs with large allocation requests. In this paper, we propose a scheduling scheme that uses a window of consecutive jobs from which it selects jobs for allocation and execution. This window starts with the current oldest waiting job and corresponds to the lookahead of the scheduler. The performance of the proposed window-based scheme has been compared to that of FCFS and other previous job scheduling schemes. Extensive simulation results based on synthetic workloads and real workload traces indicate that the new scheduling strategy exhibits good performance when the scheduling window size is large. In particular, it is substantially superior to FCFS in terms of system utilization, average job turnaround times, and maximum waiting delays under medium to heavy system loads. Also, it is superior to aggressive out-of-order scheduling in terms of maximum job waiting delays. Window-based job scheduling can improve both overall system performance and fairness (i.e., maximum job waiting delays) by adopting large lookahead job scheduling windows. 相似文献

12.

Mesh网均等分区策略 总被引：1，自引：0，他引：1

宋永生余筱琴《计算机研究与发展》1998,35(6):500-505

在大规模并行计算机系统中，处理机资源可能被多个用户作业竞争，操作系统必须采用一种处理机分配策略确定多少和哪些处理机分配给一个作业。文中针对大规模、消息通信并行计算机提出了矩形和非矩形两种处理机分配策略，这两种策略均满足对每个用户所分配处理机数的公平性以及处理机分配的邻近性。相似文献

13.

Comparative evaluation of contiguous allocation strategies on 3D mesh multicomputers 总被引：1，自引：0，他引：1

S. Bani-Mohammad Author Vitae M. Ould-Khaoua Author Vitae I. Ababneh Author Vitae Lewis M. Mackenzie^{Author Vitae} 《Journal of Systems and Software》2009,82(2):307-318

The performance of contiguous allocation strategies can be significantly affected by the type of the distribution adopted for job execution times. In this paper, the performance of the existing contiguous allocation strategies for 3D mesh multicomputers is re-visited in the context of heavy-tailed distributions (e.g., a Bounded Pareto distribution). The strategies are evaluated and compared using simulation experiments for both First-Come-First-Served (FCFS) and Shortest-Service-Demand (SSD) scheduling strategies under a variety of system loads and system sizes. The results show that the performance of the allocation strategies degrades considerably when job execution times follow a heavy-tailed distribution. Moreover, SSD copes much better than FCFS scheduling strategy in the presence of heavy-tailed job execution times. The results also reveal that allocation strategies that employ a list of allocated sub-meshes for both allocation and de-allocation exhibit low allocation overhead, and maintain good system performance in terms of average turnaround time and mean system utilization. 相似文献

14.

机群系统中一种新的自适应结点分配算法:受限负载平衡结点分配(LLBA) 总被引：1，自引：0，他引：1

雷州徐志伟祝明发《计算机学报》1999,22(8):877-881

在机群系统中结点分配策略根据一定的原则为作业确定运行结点是提高系统性能的关键。通过对机群结点分配策略的研究,作者发现当前基于负载平衡自适应的结点分配策略为并行作业选择负载最轻的结点,这不利于系统性能的充分发挥。作者提出了一种新的自适应负载平衡结点分配算法：受限负载平衡结点分配。相似文献

15.

Processor allocation in hypercube multicomputers: fast andefficient strategies for cubic and noncubic allocation

Das Sharma D. Pradhan D.K. 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(10):1108-1123

A new approach for dynamic processor allocation in hypercube multicomputers which supports a multi-user environment is proposed. A dynamic binary tree is used for processor allocation along with an array of free lists. Two algorithms are proposed based on this approach, capable of efficiently handling cubic as well as noncubic allocation. Time complexities for both allocation and deallocation are shown to be polynomial, a significant improvement over the existing exponential and even super-exponential algorithms. Unlike existing schemes, the proposed strategies are best-fit strategies within their search space. Simulation results indicate that the proposed strategies outperform the existing ones in terms of parameters such as average delay in honoring a request, average allocation time, average deallocation time, and memory overhead 相似文献

16.

容错处理器阵列的多逻辑列并行重构算法

章子凯武继刚姜文超刘竹松《计算机工程与科学》2018,40(1):24-33

处理器阵列的容错重构技术是片上网络多核、众核高性能体系结构的可靠性技术之一。现有的最大逻辑阵列并行重构技术仅对单条逻辑列的构造实现了并行化,而对多条逻辑列的同步并行仍未见可行算法。依据处理器阵列的潜在并行性,在分治策略的基础上,提出了一种阵列分块的并行重构算法。算法对处理器阵列实施横向分块划分,对每个阵列块进行并行重构,并对所得逻辑子阵列进行归并,实现了多条逻辑列的同步并行重构。与现有的并行算法相比,新算法同样能够生成最大逻辑列,并且减少了通信开销与计算中的数据冗余,有效提高了运行速度。实验结果表明,在物理阵列大小为64×64的处理器阵列上,运行速度比现有并行算法提高39.55%,并且具有良好的可扩展性。相似文献

17.

Using task migration to improve non-contiguous processor allocation in NoC-based CMPs

《Journal of Systems Architecture》2013,59(7):468-481

In this paper, a processor allocation mechanism for NoC-based chip multiprocessors is presented. Processor allocation is a well-known problem in parallel computer systems and aims to allocate the processing nodes of a multiprocessor to different tasks of an input application at run time. The proposed mechanism targets optimizing the on-chip communication power/latency and relies on two procedures: processor allocation and task migration. Allocation is done by a fast heuristic algorithm to allocate the free processors to the tasks of an incoming application when a new application begins execution. The task-migration algorithm is activated when some application completes execution and frees up the allocated resources. Task migration uses the recently deallocated processors and tries to rearrange the current tasks in order to find a better mapping for them. The proposed method can also capture the dynamic traffic pattern of the network and perform task migration based on the current communication demands of the tasks. Consequently, task migration adapts the task mapping to the current network status. We adopt a non-contiguous processor allocation strategy in which the tasks of the input application are allowed to be mapped onto disjoint regions (groups of processors) of the network. We then use virtual point-to-point circuits, a state-of-the-art fast on-chip connection designed for network-on-chips, to virtually connect the disjoint regions and make the communication latency/power closer to the values offered by contiguous allocation schemes. The experimental results show considerable improvement over existing allocation mechanisms. 相似文献

18.

基于多核处理器并行系统的任务调度算法 总被引：6，自引：0，他引：6

袁云邵时《计算机应用》2008,28(Z2)

针对多核处理器并行系统的特点,提出了相应的任务调度算法,该算法在任务调度之前加入了任务分配技术,通过合理的任务分配,可有效减少多个处理器间的通信开销,使任务调度效率更佳.仿真实现了该算法,并通过实验数据证明了该算法的优越性. 相似文献

19.

多线程处理器资源分配策略 总被引：1，自引：0，他引：1

下载免费PDF全文

何军王飙《计算机工程》2008,34(15):283

处理器资源如何在多个线程之间进行分配和共享是直接影响多线程处理器性能的关键问题。该文总结4种分配模型,提出其实现机制,讨论资源分配平衡问题,指出可根据目标应用和流水线不同阶段的特点,在各流水线阶段综合采用不同分配模型和实现机制,实现处理器资源的合理分配。相似文献

20.

合同战术训练评估系统体系结构 总被引：1，自引：1，他引：0

下载免费PDF全文

杨晓恝张宏军马勇波藄秀利《计算机工程》2008,34(15):54-56,5

在介绍合同战术训练评估需求分析、系统总体结构的基础上,研究了其中的演习结果评估子系统的框架和层次结构,横向上将其分为主框架、行动评估模块、算法插件3个部分,纵向上把它分为数据采集、数据处理、成绩报告3层,从而有效地降低了系统各组成部分间的耦合程度,并使系统能综合运用多样化的数据采集手段以及效能分析方法,具有一定的参考价值。相似文献