Similar articles
1.
Parallel computing on interconnected workstations is becoming a viable and attractive proposition due to the rapid growth in speeds of interconnection networks and processors. In the case of workstation clusters, there is always a considerable amount of unused computing capacity available in the network. However, heterogeneity in architectures and operating systems, load variations on machines, variations in machine availability, and failure susceptibility of networks and workstations complicate the situation for the programmer. In this context, new programming paradigms that reduce the burden involved in programming for distribution, load adaptability, heterogeneity and fault tolerance gain importance. This paper identifies the issues involved in parallel computing on a network of workstations. The anonymous remote computing (ARC) paradigm is proposed to address the issues specific to parallel programming on workstation systems. ARC differs from the conventional communicating process model by treating a program as one single entity consisting of several loosely coupled remote instruction blocks instead of treating it as a collection of processes. The ARC approach results in distribution transparency and heterogeneity transparency. At the same time, it provides fault tolerance and load adaptability to parallel programs on workstations. ARC is developed in a two-tiered architecture consisting of high level language constructs and low level ARC primitives. The paper describes an implementation of the ARC kernel supporting ARC primitives

2.
Molecular dynamics simulations investigate local and global motion in molecules. Several parallel computing approaches have been taken to attack the most computationally expensive phase of molecular simulations, the evaluation of long range interactions. This paper reviews these approaches and develops a straightforward but effective algorithm using the machine-independent parallel programming language, Linda. The algorithm was run both on a shared memory parallel computer and on a network of high performance Unix workstations. Performance benchmarks were performed on both systems using two proteins. This algorithm offers a portable cost-effective alternative for molecular dynamics simulations. In view of the increasing numbers of networked workstations, this approach could help make molecular dynamics simulations more easily accessible to the research community.

3.
TreadMarks: shared memory computing on networks of workstations (total citations: 2; self-citations: 0; citations by others: 2)
Shared memory facilitates the transition from sequential to parallel processing. Since most data structures can be retained, simply adding synchronization achieves correct, efficient programs for many applications. We discuss our experience with parallel computing on networks of workstations using the TreadMarks distributed shared memory system. DSM allows processes to assume a globally shared virtual memory even though they execute on nodes that do not physically share memory. We illustrate a DSM system consisting of N networked workstations, each with its own memory. The DSM software provides the abstraction of a globally shared memory, in which each processor can access any data item without the programmer having to worry about where the data is or how to obtain its value

4.
Modern distributed systems consisting of powerful workstations and high-speed interconnection networks are an economical alternative to special-purpose supercomputers. The technical issues that need to be addressed in exploiting the parallelism inherent in a distributed system include heterogeneity, high-latency communication, fault tolerance and dynamic load balancing. Current software systems for parallel programming provide little or no automatic support for these issues and require users to be experts in fault-tolerant distributed computing. The Paralex system is aimed at exploring the extent to which the parallel application programmer can be liberated from the complexities of distributed systems. Paralex is a complete programming environment and makes extensive use of graphics to define, edit, execute, and debug parallel scientific applications. All of the necessary code for distributing the computation across a network and replicating it to achieve fault tolerance and dynamic load balancing is automatically generated by the system. In this paper we give an overview of Paralex and present our experiences with a prototype implementation

5.
Parallel programming for multimedia applications (total citations: 2; self-citations: 2; citations by others: 0)
Computing capabilities are continuing to increase with the availability of multi-core and many-core processors. The wide availability of multi-core processors has made parallel programming possible for end-user applications running on desktops, workstations, and mobile devices. While parallel hardware has become common, software that exploits parallel capabilities is just beginning to take hold. Multimedia applications, with their data-parallel nature and large computing requirements, will benefit significantly from parallel programming. In this paper an overview of parallel programming is presented, and languages and tools for parallel programming such as OpenMP and CUDA are introduced within the scope of multimedia applications.

6.
The NAS parallel benchmarks are a set of applications that embody the key characteristics of typical processing in computational aerodynamics. Five of these, the kernel benchmarks, have been implemented on the PVM system, a software system for network-based concurrent computing, with a view to determining the efficacy of networked environments for high-performance computational aerodynamics applications. We present results of porting and executing the NPB kernels in three different cluster environments using low- to medium-powered workstations on Ethernet and two types of FDDI networks. Our results indicate that mediocre to good performance could be obtained despite the communications-intensive nature of the applications. In most cases, we were able to achieve performance levels within an order of magnitude of a Cray Y/MP-1 on eight-workstation clusters via optimizations to the PVM infrastructure alone, i.e., with little or no algorithmic modification. However, our results also indicate that further improvements are possible and that network-based computing has the potential to be a viable technology for high-performance scientific computing.

7.
Hamdi, Mounir; Pan, Yi; Hamidzadeh, B.; Lim, F. M. The Journal of Supercomputing, 1999, 13(2): 111-132
Parallel computing on clusters of workstations is receiving much attention from the research community. Unfortunately, many aspects of parallel computing on this platform are not very well understood. These include the workstation architectures, the network protocols, the communication-to-computation ratio, the load balancing strategies, and the data partitioning schemes. The aim of this paper is to assess the strengths and limitations of a cluster of workstations by capturing the effects of the above issues. This has been achieved by evaluating the performance of this computing environment in the execution of a parallel ray tracing application through analytical modeling and extensive experimentation. We were successful in illustrating the effect of major factors on the performance and scalability of a cluster of workstations connected by an Ethernet network. Moreover, our analytical model was accurate enough to agree closely with the experimental results. Thus, we feel that such an investigation would be helpful in understanding the strengths and weaknesses of an Ethernet cluster of workstations in the execution of parallel applications.

8.
Obtaining efficient execution of parallel programs in workstation networks is a difficult problem for the user. Unlike dedicated parallel computer resources, network resources are shared, heterogeneous, vary in availability, and offer communication performance that is still an order of magnitude slower than parallel computer interconnection networks. Prophet, a system that automatically schedules data parallel SPMD programs in workstation networks for the user, has been developed. Prophet uses application and resource information to select the appropriate type and number of workstations, divide the application into component tasks and data across these workstations, and assign tasks to workstations. This system has been integrated into the Mentat parallel processing system developed at the University of Virginia. A suite of scientific Mentat applications has been scheduled using Prophet on a heterogeneous workstation network. The results are promising and demonstrate that scheduling SPMD applications can be automated with good performance. Copyright © 1999 John Wiley & Sons, Ltd.

9.
Small organisations can now have access to high raw processing power using networks of workstations (NOW) as parallel computing platforms. Software Distributed Shared Memory (Software DSM) packages have been developed to facilitate the programming of such systems. However, because of the high interprocess latencies in a NOW, the performance of a software DSM application is more susceptible to the partitioning of the problem than might be expected. This paper presents an approach for a tool to visualise the execution of a program in a way that highlights performance bottlenecks. The tool associates identified bottlenecks with the corresponding source code lines in order to determine which piece of code is the cause of poor performance. The visualisation technique is demonstrated in two case studies. They clearly show that the visualisation is indeed useful and provides an effective way to acquire an understanding of what characterises an application's sharing behaviour.

10.
Piranha is an execution model for Linda developed at Yale (1) to reclaim idle cycles from networked workstations for use in executing parallel programs. Piranha has proven to be an effective system for harnessing large amounts of computing power. Most Piranha research to this point has concentrated on efficiently executing a single application at a time. In this paper we evaluate strategies for scheduling multiple Piranha applications. We examine methods for predicting idle periods and the effectiveness of scheduling strategies that make use of these predictions. We present a prototype scheduler for the Piranha system implemented using the process trellis software architecture for networks of workstations. This work was supported by AASERT Grant F49620-92-J-0240, AFOSR-91-0098, and NASA Training Grant NGT-50719.

11.
The availability of a large number of workstations connected through a network can represent an attractive option for high-performance computing for many applications. The message-passing interface (MPI) software environment is an effort from many organisations to define a de facto message-passing standard. In other words, the original specification was not designed as a comprehensive parallel programming environment, and some researchers agree that the standard should be kept as simple and clean as possible. Nevertheless, a software environment such as MPI should have some scheduling mechanism for the effective submission of parallel applications on networks of workstations. This paper presents an alternative lightweight approach called Selective-MPI (S-MPI), which was designed to enhance the efficiency of the scheduling of applications in an MPI implementation environment.

12.
Computers & Chemistry, 1996, 20(4): 431-438
Sophisticated software packages put an increasing demand on computer hardware. In local area networks, computationally intensive programs can lower the performance of individual workstations to an unacceptable level. However, utilizing the computing power of all hosts in such networks in a coarse-grained sense offers the potential to achieve considerable improvements in execution speed within reasonable cost limits. Since conventional workstations are not designed to be used in a parallel configuration, the program HYDRA was developed to control and synchronize parallel processing in a local area network. Part I of this paper focuses on the technical aspects of HYDRA, i.e. configuration and implementation. The second and third parts describe two applications of the HYDRA package in the field of chemistry: using parallel genetic algorithms for the conformational analysis of nucleic acids, and parallel cross-validation of artificial neural networks.

13.
Networks of workstations (NOW) are receiving increased attention as a viable platform for high performance parallel computations. Heterogeneity and time-sharing are two characteristics that distinguish the NOW systems from conventional multiprocessor/multicomputer systems which are homogeneous and dedicated. It is important to have a practical model for users to predict the execution times of large-scale parallel applications on nondedicated heterogeneous NOW. Another objective of this study is to provide insight into the dynamic performance of parallel computing and into the effects of program structures and system factors on such a platform. In this paper, we study performance predictions for parallel computing on nondedicated heterogeneous networks of workstations. Our approach is based on a two-level model. On the top level, a semideterministic task graph is used to capture the parallel execution behavior including the variances of communication and synchronization. On the bottom level, a discrete time model is used to quantify effects from NOW systems. An iterative process is used to determine the interactive effects between network contention and task execution. We validate the prediction model using experiments on a nondedicated heterogeneous NOW. The maximum differences between predicted results and measured results were less than 10% in most cases and 15% in the worst cases.

14.
Parallel Computing, 1997, 22(11): 1477-1492
Cluster-based computing, which exploits the aggregate power of a network of workstations, has drawn increasing attention from the parallel processing community. The main problem with this computing environment is the permanently changing workload of individual workstations, which makes the efficiency and the execution time of parallel applications unpredictable. In this paper, we introduce an efficient load balancing scheme which aims at dynamically balancing the workload of data parallel applications in this computing environment. Simulation and experimental studies of our load balancing strategy are performed under various load situations, and it is shown that it can effectively balance the workload among the workstations involved. Further, it is shown that a significant improvement in computing performance can be achieved when using our load balancing strategy as compared to the case where no load balancing is applied, particularly under a heavily loaded system.

15.
Several large real-world applications have been developed for distributed and parallel architectures. We examine two different program development approaches. The first uses a high-level programming paradigm, which dramatically reduces the time needed to create a parallel program but sometimes at the cost of reduced performance; a source-to-source compiler has been employed to automatically compile programs written in a high-level programming paradigm into message passing codes. The second is manual program development using a low-level programming paradigm such as message passing, which enables the programmer to fully exploit a given architecture at the cost of a time-consuming and error-prone effort. Performance tools play a central role in supporting the performance-oriented development of applications for distributed and parallel architectures. SCALA, a portable instrumentation, measurement, and post-execution performance analysis system for distributed and parallel programs, has been used to analyze and guide the application development by selectively instrumenting and measuring the code versions, comparing performance information of several program executions, computing a variety of important performance metrics, detecting performance bottlenecks, and relating performance information back to the input program. We show several experiments with SCALA applied to real-world applications. These experiments are conducted on a NEC Cenju-4 distributed-memory machine and a cluster of heterogeneous workstations and networks. Copyright © 2001 John Wiley & Sons, Ltd.

16.
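As computer and network hardware becomes increasingly commoditized and standardized, PCs and workstations offer ever higher performance at ever lower prices. At the same time, the open-source Linux kernel and cluster-tool middleware have matured and stabilized. High-performance computing clusters have developed steadily and become the mainstream high-performance computing platform, gradually replacing dedicated, expensive supercomputers for prototyping, debugging, and running large-scale parallel applications. Research on the rapid deployment, reliability, and manageability of high-performance computing based on PCs or workstations is of far-reaching significance for applying high-performance computing clusters in scientific research and engineering computation and for promoting the adoption of high-performance computing technology. Taking an OSCAR cluster as an example, this paper deploys a five-node cluster environment and runs a simple parallel test example.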

17.
Parallel processing systems using networks of workstations are being used to provide an alternative to expensive parallel processors. Scheduling of tasks on these networks is an important and practical problem that must be addressed. Although CPU load is an important parameter to many of the proposed scheduling schemes, no quantitative analysis of CPU load and its precise relation to the run time of application programs has to date been presented. The work in this paper describes the experimental analysis of one common load measure, the UNIX load average, and its relationship to the run time of computation-bound parallel programs. Data was gathered using a test application program designed to mimic common applications, performing long bursts of computation with occasional interprocess data exchange over the network. The resulting execution times and measured load averages were then analyzed using regression analysis to detect load-run time trends. This paper describes the test program and the experiments, then details the results of the data analysis. A technique is then presented for the evaluation of the load-run time relationship for a computation-bound program on a network of workstations.

18.
A number of high-level parallel programming platforms for networks of workstations (NOWs) have been developed in recent times. Most of these platforms target the exploitation of data parallelism in applications. They do not allow applications to be expressed as a collection of tasks along with their precedence relationships. As a result, the control or task parallelism in an application cannot be expressed or exploited. The current work aims at integrating the notion of task parallelism and precedence relationships among constituent tasks into such high-level data parallel platforms for NOWs. Our model of integration provides for arbitrary nesting of data and task parallel modules. Also, the precedence relationships are clearly reflected in the program structure. The model relieves the programmer of the need to design applications for non-determinism in the order of completion of constituent tasks. The design of the runtime support as well as system-level bookkeeping is discussed. The model is general enough to be applied to a wide range of data parallel platforms. A specific case of integrating the model into anonymous remote computing (ARC), a data parallel programming platform, is presented. The performance-related aspects are also discussed. Copyright © 2000 John Wiley & Sons, Ltd.

19.
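With the development of high-speed networks (such as ATM) and the continuing improvement in workstation performance, networks of workstations (NOW) are attracting growing attention as a new parallel computing architecture. Traditional transport protocols and message-passing systems cannot fully exploit the transmission capacity of high-speed networks. This paper proposes HPMPA, an ATM-based high-speed communication mechanism that supports parallel processing. In HPMPA, the reliable end-to-end transport protocol HSTP provides high-speed, reliable data transfer for parallel applications, while the unreliable end-to-end transport protocol UTP provides an unreliable high-speed datagram service. Based on a hybrid tree structure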

20.
The network of workstations (NOW) we consider for parallel computing is heterogeneous and nondedicated (time-sharing), where computing power varies among the workstations, and multiple jobs may interact with each other in execution. We address three performance issues in this paper. First, we examine the effects of heterogeneity on co-scheduling and local scheduling policies for parallel computing. Through experimentation and quantitative comparisons, we discuss features and requirements of scheduling policies on heterogeneous NOW. Second, the heterogeneity and non-dedication of NOW introduce new performance factors into parallel computing, which make traditional performance metrics for parallel computing on homogeneous platforms unsuitable. We conducted a collection of experimental measurements to show the performance impact on parallel computing. Finally, using network latencies, we experimentally evaluate the scalability of parallel computing on NOW. The objective of this study is to provide insights into the unique performance bottlenecks and potential of networks of workstations.
