
1.

Coordinating Parallel Processes on Networks of Workstations Total citations: 1, self-citations: 0, citations by others: 1

Xing Du Xiaodong Zhang 《Journal of Parallel and Distributed Computing》1997,46(2):186

The network of workstations (NOW) we consider for scheduling is heterogeneous and nondedicated, where computing power varies among the workstations and local and parallel jobs may interact with each other in execution. An effective NOW scheduling scheme needs sufficient information about system heterogeneity and job interactions. We use the measured power weight of each workstation to quantify the differences of computing capability in the system. Without a processing power usage agreement between parallel jobs and local user jobs in a workstation, job interactions are unpredictable, and the performance of either type of job may not be guaranteed. Using the quantified and deterministic system information, we design a scheduling scheme called self-coordinated local scheduling on a heterogeneous NOW. Based on a power usage agreement between local and parallel jobs, this scheme coordinates parallel processes independently in each workstation based on the coscheduling principle. We discuss its implementation on Unix System V Release 4 (SVR4). Our simulation results on a heterogeneous NOW show the effectiveness of the self-coordinated local scheduling scheme.

2.

An Effective and Practical Performance Prediction Model for Parallel Computing on Nondedicated Heterogeneous NOW

《Journal of Parallel and Distributed Computing》1996,38(1):63-80

Networks of workstations (NOW) are receiving increased attention as a viable platform for high performance parallel computations. Heterogeneity and time-sharing are two characteristics that distinguish the NOW systems from conventional multiprocessor/multicomputer systems which are homogeneous and dedicated. It is important to have a practical model for users to predict the execution times of large-scale parallel applications on nondedicated heterogeneous NOW. Another objective of this study is to provide insight into the dynamic performance of parallel computing and into the effects of program structures and system factors on such a platform. In this paper, we study performance predictions for parallel computing on nondedicated heterogeneous networks of workstations. Our approach is based on a two-level model. On the top level, a semideterministic task graph is used to capture the parallel execution behavior including the variances of communication and synchronization. On the bottom level, a discrete time model is used to quantify effects from NOW systems. An iterative process is used to determine the interactive effects between network contention and task execution. We validate the prediction model using experiments on a nondedicated heterogeneous NOW. The maximum differences between predicted results and measured results were less than 10% in most cases and 15% in the worst cases.

3.

A Task Scheduling Algorithm for Structured Parallel Control Mechanisms Total citations: 4, self-citations: 0, citations by others: 4

Zhang Hongli, Fang Binxing, Hu Mingzeng 《Journal of Software (软件学报)》2001,12(5):706-710

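Shortening program execution time is the primary goal of parallel processing, and an effective task allocation algorithm is the key to achieving it, all the more so on cluster systems. This paper studies task scheduling for structured parallel control mechanisms on cluster systems and, based on the principles of greedy selection, granularity control, and feedback-driven dispatching, proposes a near-optimal task scheduling algorithm, SSA (sub-optimal scheduling algorithm). Experimental results show that, in a cluster environment, the algorithm improves parallel computing performance relative to the other algorithms compared.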

4.

Efficient Multiple Multicast on Heterogeneous Network of Workstations

Jan-jan Wu Shih-hsien Yeh Pangfeng Liu 《The Journal of supercomputing》2004,29(1):59-88

In recent years, networks of workstations/PCs (so-called NOWs) have become appealing vehicles for cost-effective parallel computing. Due to the commodity nature of workstations and networking equipment, LAN environments are gradually becoming heterogeneous. The diverse sources of heterogeneity in NOW systems pose a challenge to the design of efficient communication algorithms for this class of systems. In this paper, we propose efficient algorithms for multiple multicast on heterogeneous NOW systems, focusing on heterogeneity in processing speeds of workstations/PCs. Multiple multicast is an important operation in many scientific and industrial applications. Multicast on heterogeneous systems has not been investigated until recently. Our work distinguishes itself from others in two aspects: (1) In contrast to the blocking communication model used in prior works, we model communication in a heterogeneous cluster more accurately by a non-blocking communication model, and design multicast algorithms that can fully take advantage of non-blocking communication. (2) While prior works focus on the single multicast problem, we propose efficient algorithms for general, multiple multicast (of which single multicast is a special case) on heterogeneous NOW systems. To our knowledge, our work is the earliest effort that addresses multiple multicast for heterogeneous NOW systems. These algorithms are evaluated using a network simulator for heterogeneous NOW systems. Our experimental results on a system of up to 64 nodes show that some of the algorithms outperform others in many cases. The best algorithm achieves a completion time that is within 2.5 times the lower bound.

5.

A Task Scheduling Algorithm on Distributed Heterogeneous Workstations

Wu Yanhui, Lu Xinda, Zeng Zhiyong 《Journal of Chinese Computer Systems (小型微型计算机系统)》2004,25(4):733-737

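This paper discusses how to make effective use of idle workstations, on a platform of high-performance heterogeneous workstations connected by a high-speed local area network, to solve a compute-intensive task: matrix multiplication. To obtain good parallel performance, it presents a model and an algorithm for task scheduling across a group of heterogeneous workstations. The algorithm accounts for the communication time between cooperating tasks, the data loading time, the result collection time, and the task computation time on each heterogeneous workstation. With this model, the most suitable subset of all available workstations can be found so as to obtain the shortest execution time.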

6.

A Group Communication Model for NOWs Based on Heterogeneous Networks Total citations: 2, self-citations: 0, citations by others: 2

Wu Lifa, Xie Li 《Journal of Computer Research and Development (计算机研究与发展)》1998,35(11):1042-1047

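Networks of workstations (NOW) are attracting growing attention as a new parallel computing architecture. This paper discusses group communication in NOWs built on heterogeneous networks and proposes a hierarchical group communication model that handles group communication in heterogeneous network environments well. The idea is to divide the nodes taking part in a group communication operation, according to the subnets they belong to, into base computation domains. Parallel operation of these base computation domains reduces communication latency without adding much network traffic, and the most suitable group communication implementation can be chosen within each subnet. This approach …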

7.

Performance Modeling and Evaluation of MPI

《Journal of Parallel and Distributed Computing》2001,61(2):202-223

Users of parallel machines need to have a good grasp of how different communication patterns and styles affect the performance of message-passing applications. LogGP is a simple performance model that reflects the most important parameters required to estimate the communication performance of parallel computers. The message passing interface (MPI) standard provides new opportunities for developing high performance parallel and distributed applications. In this paper, we use LogGP as a conceptual framework for evaluating the performance of MPI communications on three platforms: Cray Research T3D, Convex Exemplar 1600SP, and a network of workstations (NOW). We develop a simple set of communication benchmarks to extract the LogGP parameters. Our objective is to compare the performance of MPI communication on several platforms and to identify a performance model suitable for MPI performance characterization. In particular, two problems are addressed: how LogGP quantifies MPI performance and what extra features are required for modeling MPI, and how MPI performance compares on the three computing platforms: Cray Research T3D, Convex Exemplar 1600SP, and workstation clusters.
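
For readers unfamiliar with the parameters mentioned in this abstract, the following is the usual LogGP estimate for one point-to-point message. It comes from the general definition of the LogGP model, not from this paper's benchmarks, and the symbol k (message length in bytes) is introduced here only for illustration.

```latex
% LogGP parameters: L = network latency, o = per-message processor overhead,
% g = minimum gap between consecutive messages, G = gap per byte for long
% messages, P = number of processors.
% Estimated end-to-end time for a single k-byte message:
\[
  T(k) = o_{\text{send}} + (k - 1)\,G + L + o_{\text{recv}}
\]
% With equal send and receive overheads this reduces to
\[
  T(k) = 2o + (k - 1)\,G + L .
\]
```

For k = 1 the expression reduces to the LogP cost of a short message, L + 2o.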

8.

A Case Study of Application Analytical Modeling in Heterogeneous Computing Environments: Cholesky Factorization in a NOW

Aversa R. Mazzocca N. Villano U. 《The Journal of supercomputing》2003,24(1):5-24


9.

A Practical Parallel Computation Model Total citations: 11, self-citations: 0, citations by others: 11

Ji Yongchang, Ding Weiqun, Chen Guoliang, An Hong 《Chinese Journal of Computers (计算机学报)》2001,24(4):437-441

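For today's popular workstation cluster environments and the various kinds of parallel machines, this paper proposes a practical parallel computation model, NHBL (Nondedicated Heterogeneous Barrier LogGP), a nondedicated, heterogeneous, synchronous model based on LogGP. It aims to reflect how the heterogeneity and nondedicated nature of NOW computing environments affect the design and analysis of parallel algorithms. The NHBL model is then used to analyze the cost of the PSRS algorithm on NHPCC-Cluster, the workstation cluster of the National High Performance Computing Center (Hefei), and on the Dawning-1000 (曙光-1000) MPP, and the analysis is validated against measured results.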

10.

Parallel implementation of back-propagation algorithm in networks of workstations

Suresh S. Omkar S.N. Mani V. 《Parallel and Distributed Systems, IEEE Transactions on》2005,16(1):24-34

This work presents an efficient mapping scheme for the multilayer perceptron (MLP) network trained using the back-propagation (BP) algorithm on networks of workstations (NOWs). A hybrid partitioning (HP) scheme is used to partition the network, and each partition is mapped onto processors in the NOW. We derive the processing time and memory space required to implement the parallel BP algorithm in NOWs. Performance parameters such as speed-up and space reduction factor are evaluated for the HP scheme, and it is compared with earlier work involving the vertical partitioning (VP) scheme for mapping the MLP on NOWs. The performance of the HP scheme is evaluated by solving an optical character recognition (OCR) problem in a network of ALPHA machines. The analytical and experimental results show that the proposed parallel algorithm has better speed-up, less communication time, and a better space reduction factor than the earlier algorithm. This work also presents a simple and efficient static mapping scheme for heterogeneous systems. Using divisible load scheduling theory, a closed-form expression for the number of neurons assigned to each processor in the NOW is obtained. Analytical and experimental results for the static mapping problem on NOWs are also presented.

11.

Dynamic scheduling techniques for heterogeneous computing systems

Babak Hamidzadeh Yacine Atif David J. Lilja 《Concurrency and Computation》1995,7(7):633-652

There has been a recent increase of interest in heterogeneous computing systems, due partly to the fact that a single parallel architecture may not be adequate for exploiting all of a program's available parallelism. In some cases, heterogeneous systems have been shown to produce higher performance for lower cost than a single large machine. However, there has been only limited work on developing techniques and frameworks for partitioning and scheduling applications across the components of a heterogeneous system. In this paper we propose a general model for describing and evaluating heterogeneous systems that considers the degree of uniformity in the processing elements and the communication channels as a measure of the heterogeneity in the system. We also propose a class of dynamic scheduling algorithms for a heterogeneous computing system interconnected with an arbitrary communication network. These algorithms execute a novel optimization technique to dynamically compute schedules based on the potentially non-uniform computation and communication costs on the processors of a heterogeneous system. A unique aspect of these algorithms is that they easily adapt to different task granularities, to dynamically varying processor and system loads, and to systems with varying degrees of heterogeneity. Our simulations are designed to facilitate the evaluation of different scheduling algorithms under varying degrees of heterogeneity. The results show improved performance for our algorithms compared to the performance resulting from existing scheduling techniques.

12.

High performance computing on networks of workstations through the exploitation of function parallelism

Yung-Lin Liu Hau-Yang Cheng Chung-Ta King 《Journal of Systems Architecture》1999,45(15):1307-1321

Network of workstations (NOW) has become a widely accepted form of high-performance parallel computing. As in conventional multicomputers, parallel programs running on such a platform are often written in an SPMD form to exploit data parallelism. Each workstation in a NOW is treated similarly to a processing element in a multicomputer system. However, workstations are far more powerful and flexible than the processing elements in conventional multicomputers. In this paper, we discuss how workstations in a NOW can be used to exploit more parallelism in an SPMD program, especially those induced from concurrent activities.

13.

Limitations of Cycle Stealing for Parallel Processing on a Network of Homogeneous Workstations

Scott T. Leutenegger Xian-He Sun 《Journal of Parallel and Distributed Computing》1997,43(2):733

The low cost and availability of clusters of workstations have led researchers to re-explore distributed computing using independent workstations. This approach may provide better cost/performance than tightly coupled multiprocessors. In practice, this approach often utilizes wasted cycles to run parallel jobs. In this paper we address the feasibility and limitation of such a nondedicated parallel processing environment assuming workstation processes have priority over parallel tasks. We develop a simple analytical model to predict parallel job response times. Our model provides insight into how significantly workstation owner interference degrades parallel program performance. It forms a foundation for task partitioning and scheduling in a nondedicated network environment. A new term, task ratio, which relates the parallel task demand to the mean service demand of nonparallel workstation processes, is introduced. We propose that task ratio is a useful metric for determining how a parallel application should be partitioned and scheduled in order to make efficient use of a nondedicated distributed system.
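
The abstract describes the task ratio only in words. As a minimal sketch, assuming it is the plain quotient of the two demands named above and that both are measured in the same unit (CPU seconds here), it could be computed as follows. The function and parameter names are hypothetical and do not come from the paper.

```python
def task_ratio(parallel_task_demand: float, mean_owner_demand: float) -> float:
    """Quotient of a parallel task's service demand and the mean service
    demand of the workstation's own (nonparallel) processes.

    Assumption: both demands use the same unit, e.g. CPU seconds.
    """
    if mean_owner_demand <= 0:
        raise ValueError("mean_owner_demand must be positive")
    return parallel_task_demand / mean_owner_demand


# A 600 s parallel task on a machine whose owner processes need 10 s on average.
ratio = task_ratio(600.0, 10.0)
```

Under this reading, a larger value means each parallel task is long relative to a typical owner process.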

14.

A New Algorithm for Scheduling Fork-Join Task Graphs Total citations: 16, self-citations: 1, citations by others: 16

Liu Zhenying, Fang Binxing, Jiang Yu, Zhang Yi, Zhao Hong 《Journal of Software (软件学报)》2002,13(4):693-697

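Task scheduling is one of the key factors affecting the efficiency of a network of workstations. Fork-Join task graphs can represent many parallel structures, but existing algorithms for scheduling them ignore the fact that communications cannot proceed in parallel in a network of workstations that is not fully connected, and some highly efficient algorithms do not consider economizing on the number of processors. Therefore, specifically for this kind of task graph, a static scheduling algorithm based on task duplication, TSA_FJ, is proposed that jointly considers schedule length, nonparallel communication, and the number of processors used. Multiple Fork-Join task graphs were generated with random task execution times and communication times, and were scheduled with TSA_FJ and with other scheduling algorithms. The results show that …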

15.

Usefulness of adaptive load sharing for parallel processing on networks of workstations

Sheldon Clarke Sivarama P. Dandamudi 《Concurrency and Computation》1999,11(8):387-405

Networks of workstations (NOWs) can be used for parallel processing by using public domain software like PVM. However, NOW-based parallel processing suffers from node heterogeneity, background load variations, and high-latency, low-bandwidth communication networks. Previous studies on load sharing in NOW-based systems have indicated that, for applications using the work-pile model, a simple load sharing scheme in which the master process gives a fixed amount of work to the slave processes performs as well as any other, more complex scheme. In this paper, we propose a new adaptive load sharing scheme and evaluate its performance using a Pentium-based NOW machine. The communication network used in the system consists of the standard 10 Mbps Ethernet and the 100 Mbps Fast Ethernet. We use both these networks to study their impact on the performance of our new policy. The results presented here indicate that the new policy is useful for computation-intensive applications. Copyright © 1999 John Wiley & Sons, Ltd.
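
The fixed-amount work-pile scheme mentioned in this abstract can be illustrated with a small simulation. This is only a sketch of the simple baseline, not the adaptive policy the paper proposes. It assumes zero communication cost and constant slave speeds, and the function name and parameters are hypothetical.

```python
import heapq


def simulate_work_pile(total_units, chunk_size, slave_speeds):
    """Simulate a master handing fixed-size chunks to whichever slave is idle first.

    slave_speeds: work units per second for each slave (illustrative values).
    Communication cost is ignored (an assumption of this sketch).
    Returns (finish_time, units_done_per_slave).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Priority queue of (time at which the slave becomes idle, slave index).
    idle_at = [(0.0, i) for i in range(len(slave_speeds))]
    heapq.heapify(idle_at)
    done = [0] * len(slave_speeds)
    remaining = total_units
    finish = 0.0
    while remaining > 0:
        t, i = heapq.heappop(idle_at)        # next slave to ask for work
        chunk = min(chunk_size, remaining)   # last chunk may be smaller
        remaining -= chunk
        t_end = t + chunk / slave_speeds[i]
        done[i] += chunk
        finish = max(finish, t_end)
        heapq.heappush(idle_at, (t_end, i))
    return finish, done


# Two slaves, the second twice as fast as the first.
finish_time, per_slave = simulate_work_pile(30, 5, [1.0, 2.0])
```

Because a faster slave returns for more work sooner, it ends up with a larger share of the pile even though every chunk has the same size.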

16.

Fault-Tolerant Matrix Operations for Networks of Workstations Using Diskless Checkpointing

James S. Plank Youngbae Kim Jack J. Dongarra 《Journal of Parallel and Distributed Computing》1997,43(2):427

Networks of workstations (NOWs) offer a cost-effective platform for high-performance, long-running parallel computations. However, these computations must be able to tolerate the changing and often faulty nature of NOW environments. We present high-performance implementations of several fault-tolerant algorithms for distributed scientific computing. The fault-tolerance is based on diskless checkpointing, a paradigm that uses processor redundancy rather than stable storage as the fault-tolerant medium. These algorithms are able to run on clusters of workstations that change over time due to failure, load, or availability. As long as there are at least n processors in the cluster, and failures occur singly, the computation will complete in an efficient manner. We discuss the details of how the algorithms are tuned for fault-tolerance and present the performance results on a PVM network of Sun workstations connected by a fast, switched Ethernet.
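
The idea of using processor redundancy instead of stable storage can be sketched with a parity encoding. This sketch assumes that one extra processor holds the bytewise XOR of the other processors' in-memory checkpoints, which tolerates a single failure. The abstract does not state which encoding the paper uses, and all names below are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(checkpoints):
    """Parity block that a spare processor would keep in memory."""
    return xor_blocks(checkpoints)


def recover_lost(surviving_checkpoints, parity):
    """Rebuild the single lost checkpoint from the survivors and the parity."""
    return xor_blocks(surviving_checkpoints + [parity])


# Three application processors, one of which (index 1) fails.
checkpoints = [b"abcd", b"wxyz", b"1234"]
parity = make_parity(checkpoints)
restored = recover_lost([checkpoints[0], checkpoints[2]], parity)
```

XOR parity recovers exactly one lost block, which matches the abstract's condition that failures occur singly.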

17.

Prophet: automated scheduling of SPMD programs in workstation networks

Jon B. Weissman 《Concurrency and Computation》1999,11(6):301-321

Obtaining efficient execution of parallel programs in workstation networks is a difficult problem for the user. Unlike dedicated parallel computer resources, network resources are shared, heterogeneous, vary in availability, and offer communication performance that is still an order of magnitude slower than parallel computer interconnection networks. Prophet, a system that automatically schedules data parallel SPMD programs in workstation networks for the user, has been developed. Prophet uses application and resource information to select the appropriate type and number of workstations, divide the application into component tasks and data across these workstations, and assign tasks to workstations. This system has been integrated into the Mentat parallel processing system developed at the University of Virginia. A suite of scientific Mentat applications has been scheduled using Prophet on a heterogeneous workstation network. The results are promising and demonstrate that scheduling SPMD applications can be automated with good performance. Copyright © 1999 John Wiley & Sons, Ltd.

18.

A case for NOW (Networks of Workstations)

Anderson T.E. Culler D.E. Patterson D. 《Micro, IEEE》1995,15(1):54-64

Networks of workstations are poised to become the primary computing infrastructure for science and engineering. NOWs may dramatically improve virtual memory and file system performance; achieve cheap, highly available, and scalable file storage; and provide multiple CPUs for parallel computing. Hurdles that remain include efficient communication hardware and software, global coordination of multiple workstation operating systems, and enterprise-scale network file systems. Our 100-node NOW prototype aims to demonstrate practical solutions to these challenges.

19.

NOW G-Net: learning classification programs on networks of workstations

Anglano C. Botta M. 《Evolutionary Computation, IEEE Transactions on》2002,6(5):463-480


20.

PORTING REGULAR APPLICATIONS ON HETEROGENEOUS WORKSTATION NETWORKS: PERFORMANCE ANALYSIS AND MODELING

《International Journal of Parallel, Emergent and Distributed Systems》2012,27(3):205-226

Abstract

Heterogeneous networks of workstations and/or personal computers (NOW) are increasingly used as a powerful platform for the execution of parallel applications. When applications previously developed for traditional parallel machines (homogeneous and dedicated) are ported to NOWs, performance worsens owing in part to less efficient communications but more often to load imbalance.

In this paper, we address the problem of efficiently porting to heterogeneous NOWs data-parallel applications originally developed using the SPMD paradigm for homogeneous parallel systems with a regular topology such as a ring.

To achieve good performance, the computation time on the various machines composing the NOW must be as balanced as possible. This can be obtained in two ways: by using a heterogeneous data partition strategy with a single process per node, or by splitting data homogeneously among processes and assigning to each node a number of processes proportional to its computing power. The first method is however more difficult, since some modifications in the code are always needed, whereas the second approach requires very few changes.
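
The second approach, giving each node a number of equal-sized processes proportional to its computing power, can be sketched as an apportionment problem. The largest-remainder rounding used below is an assumption of this sketch, since the abstract does not say how fractional shares are rounded, and the function name is hypothetical.

```python
import math


def processes_per_node(relative_power, total_processes):
    """Split total_processes among nodes in proportion to relative_power.

    Uses largest-remainder rounding (an assumption of this sketch) so that
    the counts are integers and sum exactly to total_processes.
    """
    total_power = sum(relative_power)
    if total_power <= 0:
        raise ValueError("relative_power must sum to a positive value")
    shares = [p * total_processes / total_power for p in relative_power]
    counts = [math.floor(s) for s in shares]
    leftover = total_processes - sum(counts)
    # Hand the remaining processes to the nodes with the largest fractions.
    by_fraction = sorted(range(len(shares)),
                         key=lambda i: (-(shares[i] - counts[i]), i))
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts


# Three nodes with relative computing power 1 : 2 : 3 and 12 processes.
allocation = processes_per_node([1.0, 2.0, 3.0], 12)
```

With powers 1 : 2 : 3 and 12 processes the shares are already whole numbers, so no rounding is needed. With powers 1 : 1 : 2 and 5 processes, the extra process goes to the most powerful node.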

We carry out a simplified but reliable analysis, and propose a simple model able to simulate performance in the various situations. Two test cases, matrix multiplication and computation of long-range interactions, are considered, obtaining a good agreement between simulated and experimental results.

Our analysis shows that an efficient porting of regular homogeneous data-parallel applications to heterogeneous NOWs is possible. In particular, the approach based on multiple processes per node turns out to be a straightforward and effective way of achieving very satisfactory performance in almost all situations, even when dealing with highly heterogeneous systems.