
1.

Optimal replica placement in hierarchical Data Grids with locality assurance

Jan-Jan Wu Yi-Fang Lin Pangfeng Liu 《Journal of Parallel and Distributed Computing》2008

In this paper, we address three issues concerning data replica placement in hierarchical Data Grids that can be presented as tree structures. The first is how to ensure load balance among replicas. To achieve this, we propose a placement algorithm that finds the optimal locations for replicas so that their workload is balanced. The second issue is how to minimize the number of replicas. To solve this problem, we propose an algorithm that determines the minimum number of replicas required when the maximum workload capacity of each replica server is known. Finally, we address the issue of service quality by proposing a new model in which each request must be given a quality-of-service guarantee. We describe new algorithms that ensure both workload balance and quality of service simultaneously.

2.

An optimal scheduling algorithm for an agent-based multicast strategy on irregular networks

Pangfeng Liu Yi-Fang Lin Jan-Jan Wu Zhe-Hao Kang 《The Journal of supercomputing》2007,42(3):283-302

This paper describes an agent-based approach for scheduling multiple multicasts on wormhole switch-based networks with irregular topologies. Multicast/broadcast is an important communication pattern, with applications in collective communication operations such as barrier synchronization and global combining. Our approach assigns an agent to each subtree of switches such that the agents can exchange information efficiently and independently. The entire multicast problem is then solved recursively, with each agent sending messages to the switches for which it is responsible. In this way, communication is localized by the assignment of agents to subtrees. This idea can be easily generalized to multiple multicasts, since the order of message passing among agents can be interleaved for different multicasts. The key to the performance of this agent-based approach is the scheduling of message passing between agents and the destination processors. We propose an optimal scheduling algorithm, called ForwardInSwitch, to solve this problem. We conduct extensive experiments to demonstrate the efficiency of our approach by comparing our results with SPCCO, a highly efficient multicast algorithm reported in the literature. We found that SPCCO suffers from link contention when the number of simultaneous multicasts becomes large. In contrast, our agent-based approach achieves better performance in large cases.

3.

Optimizing I/O server placement for parallel I/O on switch-based irregular networks

Yih-Fang Lin Chien-Min Wang Jan-Jan Wu 《The Journal of supercomputing》2006,36(3):201-217

In this paper, we study I/O server placement for optimizing parallel I/O performance on switch-based clusters, which typically adopt irregular network topologies to allow construction of scalable systems with incremental expansion capability. Finding an optimal solution to this problem is computationally intractable. We quantified the number of messages travelling through each network link by a workload function, and developed three heuristic algorithms to find good solutions based on the values of the workload function. The maximum-workload-based heuristic chooses the locations for I/O nodes in order to minimize the maximum value of the workload function. The distance-based heuristic aims to minimize the average distance between the compute nodes and I/O nodes, which is equivalent to minimizing the average workload on the network links. The load-balance-based heuristic balances the workload on the links based on a recursive traversal of the routing tree for the network. Our simulation results demonstrate the performance advantage of our algorithms over a number of algorithms commonly used in existing parallel systems. In particular, the load-balance-based algorithm is superior to the other algorithms in most cases, with an improvement ratio of 10% to 95% in terms of parallel I/O throughput.

4.

An Interleaving Transformation for Parallelizing Reductions for Distributed-Memory Parallel Machines

Wu Jan-Jan 《The Journal of supercomputing》2000,15(3):321-339

Reduction operations frequently appear in algorithms. Due to their mathematical invariance properties (assuming that round-off errors can be tolerated), it is reasonable to ignore ordering constraints on the computation of reductions in order to take advantage of the computing power of parallel machines. One obvious and widely used compilation approach for reductions is syntactic pattern recognition. Either the source language includes explicit reduction operators, or certain specific loops are recognized as equivalent to known reductions. Once such patterns are recognized, hand-optimized code for the reductions is incorporated in the target program. The advantage of this approach is simplicity. However, it imposes restrictions on the reduction loops: no data dependence other than that caused by the reduction operation itself is allowed in the reduction loops. In this paper, we present a parallelizing technique, interleaving transformation, for distributed-memory parallel machines. This optimization exploits parallelism embodied in reduction loops through a combination of data dependence analysis and region analysis. Data dependence analysis identifies the loop structures and the conditions that can trigger this optimization. Region analysis divides the iteration domain into a sequential region and an order-insensitive region. Parallelism is achieved by distributing the iterations in the order-insensitive region among multiple processors. We use a triangular solver as an example to illustrate the optimization. Experimental results on various distributed-memory parallel machines, including the Connection Machine CM-5, the nCUBE, the IBM SP-2, and a network of Sun workstations, are reported.
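The split into a sequential region and an order-insensitive region can be illustrated on the triangular solver the abstract mentions. The sketch below is only an illustration under stated assumptions, not the paper's transformation: it runs on one machine, simulates "workers" with column chunks, and the names `blocked_solve`, `block`, and `workers` are hypothetical. For each block of rows, the contributions of already-solved unknowns are accumulated in an arbitrary order (the order-insensitive part), and only the dependences inside the block are resolved sequentially.

```python
import random


def forward_substitution(L, b):
    """Reference sequential solve of L x = b for a lower-triangular L."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x


def blocked_solve(L, b, block=2, workers=3, seed=0):
    """Solve L x = b, separating order-insensitive and sequential work.

    Hypothetical illustration: columns of already-solved unknowns are
    dealt out cyclically to `workers` chunks, and the chunks are combined
    in a shuffled order to show that the order does not matter
    (up to round-off).
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    for start in range(0, n, block):
        end = min(start + block, n)
        # Order-insensitive region: x[0:start] is final, so its
        # contributions to rows start..end-1 can be summed in any order.
        acc = [0.0] * (end - start)
        chunks = [list(range(w, start, workers)) for w in range(workers)]
        rng.shuffle(chunks)
        for cols in chunks:
            for i in range(start, end):
                acc[i - start] += sum(L[i][j] * x[j] for j in cols)
        # Sequential region: rows inside the block depend on each other.
        for i in range(start, end):
            s = acc[i - start] + sum(L[i][j] * x[j] for j in range(start, i))
            x[i] = (b[i] - s) / L[i][i]
    return x


if __name__ == "__main__":
    L = [[2, 0, 0, 0], [1, 1, 0, 0], [3, 2, 4, 0], [1, 0, 2, 5]]
    b = [2, 3, 19, 27]
    print(forward_substitution(L, b))
    print(blocked_solve(L, b))
```

On a real distributed-memory machine each chunk would live on a different processor and the partial sums would be combined by communication; the shuffle here only stands in for that freedom of ordering.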

5.

Optimizing server placement in hierarchical grid environments (cited by: 1; self-citations: 1; citations by others: 0)

Chien-Min Wang Chun-Chen Hsu Pangfeng Liu Hsi-Min Chen Jan-Jan Wu 《The Journal of supercomputing》2007,42(3):267-282

In this paper, we address some problems related to server placement in Grid environments. Given a hierarchical network with requests from clients and constraints on server capability, the minimum server placement problem attempts to place the minimum number of servers that satisfy requests from clients. Instead of using a heuristic approach, we propose an optimal algorithm based on dynamic programming to solve the problem. We also consider the balanced server placement problem, which tries to place a given number of servers appropriately so that their workloads are as balanced as possible. We prove that an optimal server placement can be achieved by combining the above algorithm with a binary search on workloads. This approach can be further extended to deal with constraints on network capability. The simulation results clearly show the improvement in the number of servers and the maximum workload. Furthermore, as the maximum workload is reduced, the waiting time is reduced accordingly.
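The idea of combining a minimum-server routine with a binary search on workloads can be sketched as follows. This is a minimal illustration under a simplified model that is an assumption, not the paper's: requests are non-negative integers attached to tree nodes, they travel toward the root, a server absorbs everything that reaches it, and the root always hosts a server. The feasibility check is a bottom-up heaviest-first greedy swapped in for the paper's dynamic programming algorithm, and the names `min_servers` and `balanced_workload` are hypothetical.

```python
def min_servers(children, requests, root, cap):
    """Servers needed so that no server handles more than `cap` requests.

    Simplified model (an assumption): a server placed at a node absorbs
    all requests flowing up from its subtree that were not absorbed below.
    Returns None if a single node's own requests already exceed `cap`.
    """
    if any(r > cap for r in requests.values()):
        return None
    # Pre-order list; its reverse visits every child before its parent.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    servers = 1  # the root always hosts a server
    residual = {}  # load a node forwards to its parent
    for v in reversed(order):
        loads = sorted((residual[c] for c in children.get(v, [])), reverse=True)
        total = requests.get(v, 0) + sum(loads)
        for load in loads:
            if total <= cap:
                break
            total -= load  # place a server at the heaviest child
            servers += 1
        residual[v] = total
    return servers


def balanced_workload(children, requests, root, k):
    """Smallest maximum workload achievable with at most k servers (k >= 1)."""
    lo, hi = max(requests.values()), sum(requests.values())
    while lo < hi:
        mid = (lo + hi) // 2
        if min_servers(children, requests, root, mid) <= k:
            hi = mid  # feasible, try a tighter workload bound
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    children = {0: [1, 2], 1: [3, 4]}
    requests = {0: 0, 1: 0, 2: 5, 3: 4, 4: 3}
    for k in (1, 2, 3):
        print(k, balanced_workload(children, requests, 0, k))
```

The binary search is valid because the number of servers needed never increases as the workload bound grows, so feasibility is monotone in `cap`.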


6.

Optimizing server placement in distributed systems in the presence of competition (cited by: 1; self-citations: 0; citations by others: 1)

Jan-Jan Wu Shu-Fan Shih Pangfeng Liu Yi-Min Chung 《Journal of Parallel and Distributed Computing》2011,71(1):62-76

Although the problem of data server placement in parallel and distributed systems has been studied extensively, most of the existing work assumes there is no competition between servers. Hence, their goal is to minimize read, update and storage cost. In this paper, we study the server placement problem in which a new server has to compete with existing servers for user requests. Therefore, in addition to minimizing cost, we also need to maximize the benefit of building a new server. Our major results include three parts. First, for tree-structured systems, we propose an O(|V|³k) time dynamic programming algorithm to find the optimal placement of k extra servers that maximizes the benefit in a tree with |V| nodes. We also propose an O(|V|³) time dynamic programming algorithm to find the optimal placement of extra servers that maximizes the benefit, without any constraint on the number of extra servers. Second, for general connected graphs, we prove that the server placement problems are NP-complete, and present three greedy heuristic algorithms, called Greedy Add, Greedy Remove and Greedy Add-Remove, to solve them. Third, we show that if the number of requests a server can handle (i.e., server capacity) is bounded, the server placement problem is NP-complete even for tree networks. We then derive a variation of the same set of greedy heuristic algorithms, with consideration of the server capacity constraint, to solve the problem. Our experimental results demonstrate that the greedy algorithms achieve good results when compared with the upper bounds found by a linear programming algorithm. Greedy Add performs best in the unconstrained model, yielding a benefit within 12% of the theoretical upper bound on average. For the constrained model, Greedy Remove performs best for smaller network sizes, while Greedy Add-Remove performs best for larger network sizes. On average, the heuristic algorithms yield a benefit within 13% of the theoretical upper bound in the constrained model.
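A greedy-add style heuristic of the kind named above can be sketched as follows. The abstract does not give the paper's cost model, so everything concrete here is an assumption made for illustration: nodes sit on a line, a request goes to the nearest server with ties going to the existing servers, and the benefit is the number of captured requests minus a fixed building cost per new server. The names `greedy_add` and `line_benefit` are hypothetical.

```python
def greedy_add(candidates, benefit):
    """Add one candidate site at a time while the benefit keeps improving.

    `benefit` maps a frozenset of chosen sites to a number. This is a
    generic sketch, not the paper's Greedy Add algorithm.
    """
    chosen = frozenset()
    best = benefit(chosen)
    while True:
        options = [(benefit(chosen | {c}), c) for c in candidates if c not in chosen]
        if not options:
            break
        value, pick = max(options, key=lambda t: t[0])
        if value <= best:
            break  # no single addition improves the benefit
        chosen, best = chosen | {pick}, value
    return set(chosen), best


def line_benefit(requests, existing, build_cost):
    """Toy benefit on a line of nodes (an assumed model, see the lead-in)."""
    def benefit(new_servers):
        captured = 0
        for v, r in enumerate(requests):
            d_old = min(abs(v - s) for s in existing)
            d_new = min((abs(v - s) for s in new_servers), default=None)
            if d_new is not None and d_new < d_old:  # ties stay with existing
                captured += r
        return captured - build_cost * len(new_servers)
    return benefit


if __name__ == "__main__":
    requests = [4, 1, 1, 1, 1, 6]
    existing = {2}
    benefit = line_benefit(requests, existing, build_cost=3)
    candidates = [v for v in range(len(requests)) if v not in existing]
    print(greedy_add(candidates, benefit))
```

Because each step only looks one addition ahead, the result is a heuristic and may fall short of the optimum, which is consistent with the abstract comparing the greedy algorithms against an upper bound.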

7.

Computation and communication schedule optimization for data-sharing tasks on uniprocessor

Jan-Jan En-Jan Pangfeng 《Journal of Systems Architecture》2009,55(7-9):363-372

Almost every computation task requires input data in order to find a solution. This is not a problem for a centralized system because data is usually available locally. However, in a parallel and distributed system, e.g., computation grids, the data may be in remote sites and must be transferred to the local site before the computation can proceed. As a result, the interleaved sequence of data transfer and job execution has a significant impact on the overall computational efficiency. In this paper, we analyze the computational complexity of the shared-data job scheduling problem on a uniprocessor, with and without consideration of the storage capacity constraint on the local site. We show that if there is an upper bound on the server capacity, the problem is NP-complete, even when each job depends on at most two data items. For the case where there is no upper bound on the server capacity, we show that there exists an efficient algorithm that can provide an optimal job schedule when each job depends on at most two data items. We also propose an efficient heuristic algorithm that can determine good schedules for cases where there is no limit on the amount of data a job may access. The reported experimental results demonstrate that this heuristic algorithm performs very well and derives near-optimal solutions.

8.

Optimizing server placement for parallel I/O in switch-based clusters

Jan-Jan Wu Yi-Fang Lin Da-Wei Wang Chien-Min Wang 《Journal of Parallel and Distributed Computing》2009


9.

QoS-aware, access-efficient, and storage-efficient replica placement in grid environments

Chieh-Wen Cheng Jan-Jan Wu Pangfeng Liu 《The Journal of supercomputing》2009,49(1):42-63

In this paper, we study the quality-of-service (QoS)-aware replica placement problem in grid environments. Although there has been much work on the replica placement problem in parallel and distributed systems, most of it concerns average system performance and has not addressed the important issue of quality-of-service requirements. In the few existing works that take QoS into consideration, a simplified replication model is assumed; therefore, their solutions may not be applicable to real systems. In this paper, we propose a more realistic model for replica placement, which considers the storage cost, update cost, and access cost of data replication, and also assumes that the capacity of each replica server is bounded. The QoS-aware replica placement problem is NP-complete even in the simple model. We propose two heuristic algorithms, called greedy remove and greedy add, to approximate the optimal solution. Our extensive experimental results demonstrate that both greedy remove and greedy add find a near-optimal solution effectively and efficiently. Our algorithms can also adapt to various parallel and distributed environments.
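A greedy-remove style heuristic with a QoS constraint can be sketched as follows. The model is a deliberately small assumption, not the paper's: nodes sit on a line, distance is the number of hops, QoS means every node has a replica within `q` hops, and the cost is a storage cost per replica plus an access cost of requests times distance to the nearest replica. Update cost and server capacity, which the paper's model includes, are left out. The names `greedy_remove`, `total_cost`, and `meets_qos` are hypothetical.

```python
def total_cost(replicas, requests, storage_cost):
    """Storage cost per replica plus request-weighted distance (assumed model)."""
    access = sum(r * min(abs(v - s) for s in replicas) for v, r in enumerate(requests))
    return storage_cost * len(replicas) + access


def meets_qos(replicas, n, q):
    """True if every node 0..n-1 has a replica within q hops."""
    return bool(replicas) and all(
        min(abs(v - s) for s in replicas) <= q for v in range(n)
    )


def greedy_remove(requests, storage_cost, q):
    """Start with a replica everywhere, then drop replicas one at a time.

    Each step removes the replica whose removal gives the lowest cost
    while QoS still holds, and stops when no removal lowers the cost.
    """
    n = len(requests)
    replicas = set(range(n))
    cost = total_cost(replicas, requests, storage_cost)
    while True:
        options = []
        for s in sorted(replicas):
            trial = replicas - {s}
            if meets_qos(trial, n, q):
                options.append((total_cost(trial, requests, storage_cost), s))
        if not options:
            break  # every further removal would violate QoS
        new_cost, victim = min(options)
        if new_cost >= cost:
            break  # no removal lowers the cost
        replicas.remove(victim)
        cost = new_cost
    return replicas, cost


if __name__ == "__main__":
    print(greedy_remove([1, 5, 1, 1, 5, 1], storage_cost=4, q=1))
```

Starting from a full placement keeps every intermediate state feasible, which is why the QoS check only has to reject candidate removals.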

10.

CRAFT: a framework for F90/HPF compiler optimizations

Jan-Jan Wu Marina Chen James Cowie 《Concurrency and Computation》1999,11(10):529-569

In this paper, we give an overview of the results of the CRAFT optimising compiler project (Fortran 90/HPF subset compilers). We start by describing the theoretical framework within which we designed program transformations for the optimization of inter- and intra-procedural data motion, as well as the optimizations for parallel loops; we then describe the implementation of the CRAFT compilers for Thinking Machines' CM-2 and CM-5. We report results from experiments on the Connection Machine CM-5, the IBM SP-2 and a network of UltraSparc workstations. The results demonstrate that these optimizations can achieve significant object code performance improvement. Copyright © 1999 John Wiley & Sons, Ltd.