
1.

Adaptive Task Checkpointing and Replication: Toward Efficient Fault-Tolerant Grids (total citations: 1; self-citations: 0; citations by others: 1)

Chtepen M. Claeys F.H.A. Dhoedt B. De Turck F. Demeester P. Vanrolleghem P.A. 《Parallel and Distributed Systems, IEEE Transactions on》2009,20(2):180-190

A grid is a distributed computational and storage environment often composed of heterogeneous autonomously managed subsystems. As a result, varying resource availability becomes commonplace, often resulting in loss and delay of executing jobs. To ensure good grid performance, fault tolerance should be taken into account. Commonly utilized techniques for providing fault tolerance in distributed systems are periodic job checkpointing and replication. While very robust, both techniques can delay job execution if inappropriate checkpointing intervals and replica numbers are chosen. This paper introduces several heuristics that dynamically adapt the above-mentioned parameters based on information on grid status to provide high job throughput in the presence of failure while reducing the system overhead. Furthermore, a novel fault-tolerant algorithm combining checkpointing and replication is presented. The proposed methods are evaluated in a newly developed grid simulation environment, Dynamic Scheduling in Distributed Environments (DSiDE), which allows for easy modeling of dynamic system and job behavior. Simulations are run employing workload and system parameters derived from logs that were collected from several large-scale parallel production systems. Experiments have shown that adaptive approaches can considerably improve system performance, while the preference for one of the solutions depends on particular system characteristics, such as load, job submission patterns, and failure frequency.
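The interval tradeoff mentioned in this abstract can be illustrated with Young's classical first-order approximation. This is a fixed-interval rule, not one of the paper's adaptive heuristics, and it is shown only as a reference point. It assumes a constant checkpoint cost and a known mean time between failures; the function and parameter names below are hypothetical.

```python
import math


def young_checkpoint_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order approximation of the checkpoint interval.

    checkpoint_cost: time needed to write one checkpoint (assumed constant).
    mtbf: mean time between failures, in the same time unit.
    Returns sqrt(2 * checkpoint_cost * mtbf).
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


# Hypothetical inputs: a 2-minute checkpoint, a failure every 100 minutes on average.
interval = young_checkpoint_interval(checkpoint_cost=2.0, mtbf=100.0)
print(f"checkpoint roughly every {interval:.1f} minutes")
```

Checkpointing much more often than this wastes time writing checkpoints; checkpointing much less often loses more work per failure. An adaptive scheme would re-estimate the inputs as grid conditions change.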

2.

Parallel programming using shared objects and broadcasting

Tanenbaum A.S. Kaashoek M.F. Bal H.E. 《Computer》1992,25(8):10-19

The two major design approaches taken to build distributed and parallel computer systems, multiprocessing and multicomputing, are discussed. A model that combines the best properties of both multiprocessor and multicomputer systems, easy-to-build hardware, and a conceptually simple programming model is presented. Using this model, a programmer defines and invokes operations on shared objects, the runtime system handles reads and writes on these objects, and the reliable broadcast layer implements indivisible updates to objects using the sequencing protocol. The resulting system is easy to program, easy to build, and has acceptable performance on problems with a moderate grain size in which reads are much more common than writes. Orca, a procedural language that has been used to develop applications for the prototype system, is described; its sequential constructs are roughly similar to those of languages like C or Modula 2, and it also supports parallel processes and shared objects.

3.

Analysis of processor allocation in multiprogrammed, distributed-memory parallel processing systems

Setia S.K. Squillante M.S. Tripathi S.K. 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(4):401-420

A main objective of scheduling independent jobs composed of multiple sequential tasks in shared-memory and distributed-memory multiprocessor computer systems is the assignment of these tasks to processors in a manner that ensures efficient operation of the system. Achieving this objective requires the analysis of a fundamental tradeoff between maximizing parallel execution, suggesting that the tasks of a job be spread across all system processors, and minimizing synchronization and communication overheads, suggesting that the job's tasks be executed on a single processor. The authors consider a class of scheduling policies that represent the essential aspects of this processor allocation tradeoff, and model the system as a distributed fork-join queueing system. They derive an approximation for the expected job response time, which includes the important effects of various parallel processing overheads (such as task synchronization and communication) induced by the processor allocation policy.

4.

Modeling parallel and distributed systems with finite workloads

Ahmed M. Lester Reda 《Performance Evaluation》2005,60(1-4):303-325

In studying or designing parallel and distributed systems one should have available a robust analytical model that includes the major parameters that determine the system performance. Jackson networks have been very successful in modeling computer systems. However, the ability of Jackson networks to predict performance with system changes remains an open question, since they do not apply to systems where there are population size constraints. Also, the product-form solution of Jackson networks assumes steady-state and exponential service centers or certain specialized queueing disciplines. In this paper, we present a transient model for Jackson networks that is applicable to any population size and any finite workload (no new arrivals). Using several non-exponential distributions, we show to what extent the exponential distribution can be used to approximate other distributions and transient systems with finite workloads. When the number of tasks to be executed is large enough, the model approaches the product-form solution (steady-state solution). We also study the case where the non-exponential servers have queueing (Jackson networks cannot be applied). Finally, we show how to use the model to analyze the performance of parallel and distributed systems.

5.

Design and Analysis of the Parallel File System PFS (total citations: 2; self-citations: 0; citations by others: 2)

Wu Beihong (武北虹) Huang Dahai (黄大海) Xing Hancheng (邢汉承) 《Journal of Software (软件学报)》1995,6(11):641-646

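The mismatch between the high processing speed of parallel computers and their low-speed I/O devices has become one of the main problems in parallel computer systems, so high-performance parallel file systems must be developed. The PFS introduced in this paper is a parallel file system designed for the "Dawning-2" (曙光二号) parallel computer under the "863 Program". PFS distributes the management of I/O devices across multiple processing nodes, so that files can be interleaved across the physical devices controlled by different I/O nodes, achieving the maximum degree of parallel access. The paper describes the basic design and implementation of PFS and gives a rough analysis of its performance.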

6.

Knowledge-based environment for investigating multicomputer architectures

TG Kim BP Zeigler 《Information and Software Technology》1989,31(10):512-520

Multicomputers for massively parallel processing will eventually employ billions of processing elements, each of which will be capable of communicating with every other processing element. A knowledge-based modelling and simulation environment (KBMSE) for investigating such multicomputer architectures at a discrete-event system level is described. The KBMSE implements the discrete-event system specification (DEVS) formalism in an object-oriented programming system of Scheme (a Lisp dialect), which supports building models in a hierarchical, modular manner, a systems-oriented approach not possible in conventional simulation languages. The paper presents a framework for knowledge-based modelling and simulation by exemplifying modelling a hypercube multicomputer architecture in the KBMSE. The KBMSE has been tested on a variety of domains characterized by complex, hierarchical structures such as advanced multicomputer architectures, local area computer networks, intelligent multi-robot organizations, and biologically based life-support systems.

7.

Performance of Adaptive Space Sharing Processor Allocation Policies for Distributed-Memory Multicomputers

《Journal of Parallel and Distributed Computing》1999,58(1):109-125

Several space sharing policies have been proposed for distributed-memory multicomputer systems. We consider adaptive space sharing policies, as these policies provide a better performance than fixed and static policies by taking system load and user requirements into account. In this paper we propose an improved space sharing policy by suggesting a simple modification to a previously proposed policy. We study performance sensitivity of the original and modified policies to job structure and various other system and workload parameters like variances in inter-arrival times and job service times. The results presented here demonstrate that the modified policy performs substantially better than the original policy.

8.

Java for high‐performance network‐based computing: a survey

M. Lobosco C. Amorim O. Loques 《Concurrency and Computation》2002,14(1):1-31

There has been an increasing research interest in extending the use of Java towards high‐performance demanding applications such as scalable Web servers, distributed multimedia applications, and large‐scale scientific applications. However, extending Java to a multicomputer environment and improving the low performance of current Java implementations pose great challenges to both the systems developer and application designer. In this survey, we describe and classify 14 relevant proposals and environments that tackle Java's performance bottlenecks in order to make the language an effective option for high‐performance network‐based computing. We further survey significant performance issues while exposing the potential benefits and limitations of current solutions in such a way that a framework for future research efforts can be established. Most of the proposed solutions can be classified according to some combination of three basic parameters: the model adopted for inter‐process communication, language extensions, and the implementation strategy. In addition, where appropriate to each individual proposal, we examine other relevant issues, such as interoperability, portability, and garbage collection. Copyright © 2002 John Wiley & Sons, Ltd.

9.

Solution of finite element systems on concurrent processing computers

Charbel Farhat Edward Wilson Graham Powell 《Engineering with Computers》1987,2(3):157-165

A new computer program architecture for the solution of finite element systems using concurrent processing is presented. The basic approach involves the automatic creation of substructures. A host provides control over a set of processors, each of which is assigned initially to one substructure, then dynamically reassigned to the common interface for the solution of the complete system of substructures. Algorithm details are presented for each phase of the analysis. Results of analysis of large plate bending problems on a hypercube multicomputer are reported. For a system with 2,000 equations, an efficiency of 80 percent of the maximum theoretical value was obtained using 16 processors.

10.

Performance modeling of load-balancing algorithms using neural networks

Ishfaq Ahmad Kishan Mehrotra Chilukuri K. Mohan Sanjay Ranka Arif Ghafoor 《Concurrency and Computation》1994,6(5):393-409

The paper presents a new approach that uses neural networks to predict the performance of a number of dynamic decentralized load-balancing strategies. A distributed multicomputer system using distributed load-balancing strategies is represented by a unified analytical queuing model. A large simulation data set is used to train a neural network using the back-propagation learning algorithm based on gradient descent. The performance model using the predicted data from the neural network produces the average response time of various load balancing algorithms under various system parameters. The validation and comparison with simulation data show that the neural network is very effective in predicting the performance of dynamic load-balancing algorithms. Our work leads to interesting techniques for designing load balancing schemes (for large distributed systems) that are computationally very expensive to simulate. One of the important findings is that performance is affected least by the number of nodes, and most by the number of links at each node in a large distributed system.

11.

Optimal Clustering of Hierarchical Hyper-Ring Multicomputers

Sibai Fadi N. 《The Journal of supercomputing》1999,14(1):53-76

The Hyper-Ring (HR) is presented as a hierarchical and scalable ring-based topology for small-scale to massively parallel systems which eliminates the major disadvantages of large-scale rings. With a fixed node degree, a low cost, symmetric properties, and a simple routing scheme, the HR topology is very suitable for small-scale to large-scale multicomputer systems. Assuming pipelined communication, the performance of 4- and 5-dimensional HR multicomputers is modeled, the performance model is evaluated, and the results of the performance model evaluation are analyzed. Moreover, the impact of the traffic load and message length on the system performance is analyzed. The major objective of this work is to shed light on how to cluster HRs in order to optimize the system efficiency. Assuming a uniform message arrival rate into the nodes of the HR, the results show that the efficiency of HR topologies with an equal number of nodes is best when the topologies are perfectly balanced. The next best-performing HRs are those with larger rings at the lower (outer) levels and smaller rings at the higher levels (near the root ring). The results confirm that the HR topology is suitable for massively parallel and scalable multicomputer systems as well as for networks of workstations.

12.

A parallel logic system on a multicomputer architecture

M. Cannataro G. Spezzano D. Talia 《Future Generation Computer Systems》1991,6(4):317-331

This paper describes the implementation of a logic programming language on a massively parallel architecture. This implementation is based on the AND/OR Process Model, which allows the exploitation of both AND and OR parallelism in logic programs. A distributed memory model is used, and a decentralized control mechanism has been designed. The multicomputer on which the system has been implemented consists of a network of Inmos Transputers. The AND/OR processes are implemented as Occam processes mapped onto the Transputer nodes. After the presentation of the system architecture and an in-depth discussion of the distributed memory management, some preliminary performance results are discussed.

13.

Response time analysis of parallel computer and storage systems

Varki E. 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(11):1146-1161

Fork-join structures have gained increased importance in recent years as a means of modeling parallelism in computer and storage systems. The basic fork-join model is one in which a job arriving at a parallel system splits into K independent tasks that are assigned to K unique, homogeneous servers. In the paper, a simple response time approximation is derived for parallel systems with exponential service time distributions. The approximation holds for networks modeling several devices, both parallel and nonparallel. (In the case of closed networks containing a stand-alone parallel system, a mean response time bound is derived.) In addition, the response time approximation is extended to cover the more realistic case wherein a job splits into an arbitrary number of tasks upon arrival at a parallel system. Simulation results for closed networks with stand-alone parallel subsystems and exponential service time distributions indicate that the response time approximation is, on average, within 3 percent of the seeded response times. Similarly, simulation results with nonexponential distributions also indicate that the response time approximation is close to the seeded values. Potential applications of our results include the modeling of data placement in disk arrays and the execution of parallel programs in multiprocessor and distributed systems.
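As a reference point for the basic fork-join model in this abstract, the sketch below covers only the simplest case: one job splitting into K independent exponential tasks on K idle servers, with no queueing. The expected completion time is then the expected maximum of K exponential variables, H_K / mu, where H_K is the K-th harmonic number. This is a textbook result, not the paper's approximation, and the function names and the rate parameter `mu` are assumptions made for illustration.

```python
import random


def harmonic(k: int) -> float:
    """K-th harmonic number H_K = 1 + 1/2 + ... + 1/K."""
    return sum(1.0 / i for i in range(1, k + 1))


def mean_fork_join_time(k: int, mu: float) -> float:
    """Expected maximum of k i.i.d. exponential(mu) task times: H_k / mu."""
    return harmonic(k) / mu


def simulate_fork_join_time(k: int, mu: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity (no queueing, idle servers)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The job finishes when its slowest task finishes.
        total += max(rng.expovariate(mu) for _ in range(k))
    return total / trials


if __name__ == "__main__":
    k, mu = 4, 1.0
    print("analytic :", mean_fork_join_time(k, mu))
    print("simulated:", simulate_fork_join_time(k, mu, trials=20000))
```

The harmonic growth shows why synchronization matters: doubling K does not double the wait, but the slowest of K tasks still takes noticeably longer than an average task.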

14.

Methodology for predicting performance of distributed and parallel systems

Rakesh Kushwaha 《Performance Evaluation》1993,18(3):189-204

This paper describes an accurate and efficient method to model and predict the performance of distributed/parallel systems. Various performance measures, such as the expected user response time, the system throughput and the average server utilization, can be easily estimated using this method. The methodology is based on known product form queueing network methods, with some additional approximations. The method is illustrated by evaluating performance of a multi-client multi-server distributed system. A system model is constructed and mapped to a probabilistic queueing network model which is used to predict its behavior. The effects of user think time and various design parameters on the performance of the system are investigated by both the analytical method and computer simulation. The accuracy of the former is verified. The methodology is applied to identify the bottleneck server and to establish proper balance between clients and servers in distributed/parallel systems.
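One of the known product-form methods that an approach like this can build on is exact Mean Value Analysis (MVA) for a closed, single-class network with user think time. The sketch below is the standard textbook recursion, not the paper's method (which adds its own approximations). It assumes load-independent queueing servers, and the names `demands`, `think_time`, and `n_users` are hypothetical.

```python
from typing import List, Tuple


def exact_mva(demands: List[float], think_time: float, n_users: int) -> Tuple[float, float]:
    """Exact MVA for a closed single-class product-form network.

    demands: total service demand of one request at each queueing server.
    think_time: average user think time between requests.
    n_users: number of circulating users (must be >= 1).
    Returns (system throughput, response time excluding think time).
    """
    if n_users < 1:
        raise ValueError("n_users must be at least 1")
    queue = [0.0] * len(demands)  # mean queue lengths with zero users
    throughput = 0.0
    response = 0.0
    for n in range(1, n_users + 1):
        # An arriving request sees the queue lengths of the network with n-1 users.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue = [throughput * r for r in residence]  # Little's law per server
    return throughput, response


if __name__ == "__main__":
    x, r = exact_mva(demands=[0.05, 0.03], think_time=1.0, n_users=20)
    print(f"throughput={x:.3f} req/s, response={r:.3f} s")
```

The server with the largest demand is the bottleneck: its utilization, throughput times demand, approaches 1 first as the number of users grows.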

15.

Strategy and Simulation of Adaptive RID for Distributed Dynamic Load Balancing in Parallel Systems

Lin Chengjiang Li Sanli 《Journal of Computer Science and Technology (计算机科学技术学报)》1997,12(2):113-120

Dynamic load balancing schemes are significant for efficiently executing nonuniform problems in highly parallel multicomputer systems. The objective is to minimize the total execution time of single applications. This paper proposes an ARID strategy for distributed dynamic load balancing. Its principle and control protocol are described, and the communication overhead, the effect on system stability, and the performance efficiency are analyzed. Finally, simulation experiments are carried out to compare the adaptive strategy with other dynamic load balancing schemes.

16.

Simulation study of multitasking in distributed server systems with variable workload

Helen D. Karatza 《Simulation Modelling Practice and Theory》2004,12(7-8):591

The multitasking performance of a distributed server system is considered where the workload is variable. Time-varying distributions are proposed for job arrival, job parallelism, and task service demand. A simulation model is used to address performance issues associated with task scheduling. The objective is to identify conditions that produce good overall system performance, while maintaining fairness of individual job execution times. Simulated results show that all task scheduling methods have merit. In all cases, the best policy depends on performance goals. Furthermore, although the paper studies distributed multiprocessor system performance, it also addresses the performance issues of other systems where multitasking and scheduling are used, since no restrictions are imposed on the server/job/task characteristics.

17.

Performance bounds for distributed systems with workload variabilities and uncertainties

《Parallel Computing》1997,22(13):1789-1806

Bounding techniques for queuing network models used to analyze the performance of parallel and distributed computer systems accept single values as model inputs. Uncertainties or variabilities in service demands may exist in many types of systems. Using models with a single aggregate mean value for each parameter for such systems can lead to inaccurate or even incorrect results. This paper proposes to use histograms for characterizing model parameters that are associated with uncertainty and/or variability. The adaptation of the well-known asymptotic bounds as well as balanced job bounds for single class queuing networks to histogram parameters is presented in the paper.
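For orientation, the classical single-value asymptotic bounds that this abstract takes as its starting point can be sketched as follows. This is the standard textbook form for a closed single-class network, not the histogram-based adaptation proposed in the paper. The names `demands`, `think_time`, and `n_users` are assumptions.

```python
from typing import List, Tuple


def asymptotic_bounds(
    demands: List[float], think_time: float, n_users: int
) -> Tuple[float, float, float, float]:
    """Classical asymptotic bounds for a closed single-class queueing network.

    Returns (x_low, x_high, r_low, r_high): bounds on system throughput and
    on response time for n_users customers, given one service demand per server.
    """
    if n_users < 1 or not demands:
        raise ValueError("need at least one user and one server")
    d_total = sum(demands)  # total demand D over all servers
    d_max = max(demands)    # bottleneck demand D_max
    # Pessimistic case: every customer queues behind all the others.
    x_low = n_users / (n_users * d_total + think_time)
    # Optimistic case: no queueing at light load, bottleneck saturation at heavy load.
    x_high = min(n_users / (d_total + think_time), 1.0 / d_max)
    r_low = max(d_total, n_users * d_max - think_time)
    r_high = n_users * d_total
    return x_low, x_high, r_low, r_high


if __name__ == "__main__":
    print(asymptotic_bounds(demands=[0.5, 0.25, 0.25], think_time=0.0, n_users=4))
```

Because each bound is computed from a single demand value per server, any variability in those demands is invisible to it, which is the limitation the abstract addresses.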

18.

Task partitioning, scheduling and load balancing strategy for mixed nature of tasks (total citations: 1; self-citations: 1; citations by others: 0)

Kalim Qureshi Babar Majeed Jawad Haider Kazmi Sajjad Ahmed Madani 《The Journal of supercomputing》2012,59(3):1348-1359

Load balancing and task partitioning are important components of distributed computing. The optimum performance from a distributed computing system is achieved by using an effective scheduling and load balancing strategy. Scheduling and load balancing techniques for CPU-, memory-, and I/O-intensive tasks have been well explored. However, one of the main shortcomings of these techniques is that they neglect applications with a mixed nature of tasks, because load balancing strategies developed for one kind of job are not effective for another. We have proposed a load balancing scheme in this paper, which is known as Mixed Task Load Balancing (MTLB) for Cluster of Workstation (CW) systems. In our proposed MTLB strategy, pre-tasks are assigned to each worker by the master to eliminate the worker’s idle time. A main feature of MTLB strategy is to eradicate the inevitable selection of workers. Furthermore, the proposed MTLB strategy employs Three Resources Consideration (TRC) for load balancing (CPU, Memory, and I/O). The proposed MTLB strategy has removed the overheads of previously proposed strategies. The measured results show that MTLB strategy has a significant improvement in performance.

19.

Implementation of parallel finite element computation based on smoothed aggregation algebraic multigrid

Wu Liwei (武立伟) Zhang Jianfei (张健飞) Zhang Qian (张倩) 《Computer Aided Engineering (计算机辅助工程)》2017,26(6):16-22

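Based on the smoothed aggregation algebraic multigrid method, a preconditioned conjugate gradient solver for parallel finite element analysis of structures is implemented. The computational domain is partitioned uniformly, and the resulting subdomains are assigned to individual processes, which compute the element stiffness matrices concurrently and assemble them into a global equilibrium equation in distributed storage. The global equilibrium equation is solved in parallel by the conjugate gradient method preconditioned with smoothed aggregation algebraic multigrid. Numerical experiments are carried out on the Tianhe-2 supercomputer to analyze the influence of the main algebraic multigrid parameters on algorithm performance and to test the parallel performance of the program. The results show that the method has good parallel performance and scalability and is suitable for large-scale practical applications.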

20.

QoS-aware, access-efficient, and storage-efficient replica placement in grid environments

Chieh-Wen Cheng Jan-Jan Wu Pangfeng Liu 《The Journal of supercomputing》2009,49(1):42-63

In this paper, we study the quality-of-service (QoS)-aware replica placement problem in grid environments. Although there has been much work on the replica placement problem in parallel and distributed systems, most of it concerns average system performance and has not addressed the important issue of quality-of-service requirements. The very few existing works that take QoS into consideration assume a simplified replication model; therefore, their solutions may not be applicable to real systems. In this paper, we propose a more realistic model for replica placement, which considers the storage cost, update cost, and access cost of data replication, and also assumes that the capacity of each replica server is bounded. The QoS-aware replica placement problem is NP-complete even in the simple model. We propose two heuristic algorithms, called greedy remove and greedy add, to approximate the optimal solution. Our extensive experimental results demonstrate that both greedy remove and greedy add find a near-optimal solution effectively and efficiently. Our algorithms can also adapt to various parallel and distributed environments.