
1.

Image shrinking and expanding on a pyramid

Jenq J.-F. Sahni S. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(11):1291-1296

Develops two algorithms to perform the q-step shrinking and expanding of an N×N binary image on a pyramid computer with an N×N base. The time complexity of both algorithms is O(√q). However, one uses O(√q) space per processor, while the per-processor space requirement of the other is O(1).

2.

Supply facility and input/output point locations in the presence of barriers

《Computers & Operations Research》2002,29(6):685-699

This paper studies a facility location model in which two-dimensional Euclidean space represents the layout of a shop floor. The demand is generated by fixed rectangular-shaped user sites and served by a single supply facility. It is assumed that (i) communication between the supply point and a demand facility occurs at an input/output (I/O) point on the demand facility itself, (ii) the facilities themselves pose barriers to travel and (iii) distance measurement is as per the L₁-metric. The objective is to determine optimal locations of the supply facility as well as I/O points on the demand facilities, in order to minimize total transportation costs. Several, increasingly more complex, versions of the model are formulated and polynomial time algorithms are developed to find the optimal locations in each case.

Scope and purpose: In a facility layout setting, often a new central supply facility such as a parts supply center or tool crib needs to be located to serve the existing demand facilities (e.g., workstations or maintenance areas). The demand facilities are physical entities that occupy space, that cannot be traveled through, and that receive material from the central facility, through a perimeter I/O (input/output or drop-off/pick-up) point. This paper addresses the joint problem of locating the central facility and determining the I/O point on each demand facility to minimize the total material transportation cost. Different versions of this problem are considered. The solution methods draw from and extend results of location theory for a class of restricted location problems. For practitioners, simple results and polynomial time algorithms are developed for solving these facility (re)design problems.

3.

Efficient Multiple Multicast on Heterogeneous Network of Workstations

Jan-jan Wu Shih-hsien Yeh Pangfeng Liu 《The Journal of supercomputing》2004,29(1):59-88

In recent years, networks of workstations/PCs (so-called NOWs) have become appealing vehicles for cost-effective parallel computing. Due to the commodity nature of workstations and networking equipment, LAN environments are gradually becoming heterogeneous. The diverse sources of heterogeneity in NOW systems pose a challenge for the design of efficient communication algorithms for this class of systems. In this paper, we propose efficient algorithms for multiple multicast on heterogeneous NOW systems, focusing on heterogeneity in processing speeds of workstations/PCs. Multiple multicast is an important operation in many scientific and industrial applications. Multicast on heterogeneous systems has not been investigated until recently. Our work distinguishes itself from others in two aspects: (1) In contrast to the blocking communication model used in prior works, we model communication in a heterogeneous cluster more accurately by a non-blocking communication model, and design multicast algorithms that can fully take advantage of non-blocking communication. (2) While prior works focus on the single multicast problem, we propose efficient algorithms for general, multiple multicast (in which single multicast is a special case) on heterogeneous NOW systems. To our knowledge, our work is the earliest effort that addresses multiple multicast for heterogeneous NOW systems. These algorithms are evaluated using a network simulator for heterogeneous NOW systems. Our experimental results on a system of up to 64 nodes show that some of the algorithms outperform others in many cases. The best algorithm achieves completion time that is within 2.5 times of the lower bound.

4.

Heuristic Discovery of Role-Based Trust Chains in Peer-to-Peer Networks

Chen Ke Hwang Kai Chen Gang 《Parallel and Distributed Systems, IEEE Transactions on》2009,20(1):83-96

Credential chains are needed in trusted peer-to-peer (P2P) applications, where trust delegation must be established between each pair of peers at specific role level. Role-based trust is refined from the coarse-grained trust model used in most P2P reputation systems. This paper offers a novel heuristic-weighting approach to selecting the most likely path to construct a role-based trust chain. We apply history-sensitive heuristics to measure the path complexity and assess the chaining efficiency. We discover successive edges of a trust chain, adaptively, to match with the demands from various P2P applications. New heuristic chaining algorithms are developed for backward, forward, and bi-directional discovery of trust chains. Our heuristic chain discovery scheme shortens the search time, reduces the memory requirement, and enhances the chaining accuracy in scalable P2P networks. Consider a trust graph over N credentials and M distinct role nodes. Our heuristic trust-chain discovery algorithms require O(N² log N) search time and O(M) memory space, if the secondary heuristics are generated off-line in advance. These are improved from O(N³) search time and O(NM) space required in non-heuristic discovery algorithms by Li, Winsborough, and Mitchell (2003). Our analytical results are verified by extensive simulation experiments over typical classes of role-based trust graphs.

5.

Fault-Tolerant Matrix Operations for Networks of Workstations Using Diskless Checkpointing

James S. Plank Youngbae Kim Jack J. Dongarra 《Journal of Parallel and Distributed Computing》1997,43(2):427

Networks of workstations (NOWs) offer a cost-effective platform for high-performance, long-running parallel computations. However, these computations must be able to tolerate the changing and often faulty nature of NOW environments. We present high-performance implementations of several fault-tolerant algorithms for distributed scientific computing. The fault-tolerance is based on diskless checkpointing, a paradigm that uses processor redundancy rather than stable storage as the fault-tolerant medium. These algorithms are able to run on clusters of workstations that change over time due to failure, load, or availability. As long as there are at least n processors in the cluster, and failures occur singly, the computation will complete in an efficient manner. We discuss the details of how the algorithms are tuned for fault-tolerance and present the performance results on a PVM network of Sun workstations connected by a fast, switched Ethernet.

6.

New algorithms for max restricted path consistency

Thanasis Balafoutis Anastasia Paparrizou Kostas Stergiou Toby Walsh 《Constraints》2011,16(4):372-406

Max Restricted Path Consistency (maxRPC) is a local consistency for binary constraints that enforces a higher order of consistency than arc consistency. Despite the strong pruning that can be achieved, maxRPC is rarely used because existing maxRPC algorithms suffer from overheads and redundancies as they can repeatedly perform many constraint checks without triggering any value deletions. In this paper we propose and evaluate techniques that can boost the performance of maxRPC algorithms by eliminating many of these overheads and redundancies. These include the combined use of two data structures to avoid many redundant constraint checks, and the exploitation of residues to quickly verify the existence of supports. Based on these, we propose a number of closely related maxRPC algorithms. The first one, maxRPC3, has optimal O(end³) time complexity, displays good performance when used stand-alone, but is expensive to apply during search. The second one, maxRPC3^rm, has O(en²d⁴) time complexity, but a restricted version with O(end⁴) complexity can be very efficient when used during search. The other algorithms are simple modifications of maxRPC3^rm. All algorithms have O(ed) space complexity when used stand-alone. However, maxRPC3 has O(end) space complexity when used during search, while the others retain the O(ed) complexity. Experimental results demonstrate that the resulting methods constantly outperform previous algorithms for maxRPC, often by large margins, and constitute a viable alternative to arc consistency on some problem classes.

7.

Parallel Algorithms for Image Template Matching on Hypercube SIMD Computers

Fang Z Li X Ni LM 《IEEE transactions on pattern analysis and machine intelligence》1987,(6):835-841

This correspondence presents several parallel algorithms for image template matching on an SIMD array processor with a hypercube interconnection network. For an N by N image and an M by M window, the time complexity is reduced from O(N²M²) for the serial algorithm to O(M²/K² + M · log₂ N/K + log₂ N · log₂ K) for the N²K²-PE system (1 ≤ K ≤ M), or to O(N²M²/L²) for the L²-PE system (L ≤ N). With efficient use of the inter-PE communication network, each PE requires only a small local memory, many unnecessary data transmissions are eliminated, and the time complexity is greatly reduced.

8.

Massively parallel algorithms for trace-driven cache simulations

Nicol D.M. Greenberg A.G. Lubachevsky B.D. 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(8):849-859

Considers the use of massively parallel architectures to execute a trace-driven simulation of a single cache set. A method is presented for the least-recently-used (LRU) policy, which, regardless of the set size C, runs in time O(log N) using N processors on the EREW (exclusive read, exclusive write) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. We present timings of this algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference-based line replacement policies are considered, which includes LRU as well as the least-frequently-used (LFU) and random replacement policies. A simulation method is presented for any such policy that, on any trace of length N directed to a C line set, runs in O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
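For orientation, the sequential baseline that such parallel methods speed up can be sketched as follows. This is a minimal illustration of trace-driven LRU simulation for one cache set, not the parallel algorithm from the paper; the function name `simulate_lru_set` and its parameters are assumptions made for this example, and `set_size` is assumed to be at least 1.

```python
from collections import OrderedDict


def simulate_lru_set(trace, set_size):
    """Replay `trace` against one cache set holding `set_size` lines.

    Returns a list of booleans, one per reference: True for a hit,
    False for a miss. Hypothetical helper written for illustration only.
    """
    lines = OrderedDict()  # keys ordered from least to most recently used
    hits = []
    for ref in trace:
        if ref in lines:
            lines.move_to_end(ref)         # mark as most recently used
            hits.append(True)
        else:
            if len(lines) >= set_size:
                lines.popitem(last=False)  # evict the least recently used line
            lines[ref] = None
            hits.append(False)
    return hits
```

Each reference costs O(1) expected time here, so a trace of length N takes O(N) sequential time; the parallel bounds in the abstract are measured against this kind of baseline.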

9.

Efficient EREW PRAM algorithms for parentheses-matching

Prasad S.K. Das S.K. Chen C.C.-Y. 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(9):995-1008

We present four polylog-time parallel algorithms for matching parentheses on an exclusive-read and exclusive-write (EREW) parallel random-access machine (PRAM) model. These algorithms provide new insights into the parentheses-matching problem. The first algorithm has a time complexity of O(log² n) employing O(n/(log n)) processors for an input string containing n parentheses. Although this algorithm is not cost-optimal, it is extremely simple to implement. The remaining three algorithms, which are based on a different approach, achieve O(log n) time complexity in each case, and represent successive improvements. The second algorithm requires O(n) processors and working space, and it is comparable to the first algorithm in its ease of implementation. The third algorithm uses O(n/(log n)) processors and O(n log n) space. Thus, it is cost-optimal, but uses extra space compared to the standard stack-based sequential algorithm. The last algorithm reduces the space complexity to O(n) while maintaining the same processor and time complexities. Compared to other existing time-optimal algorithms for the parentheses-matching problem that either employ extensive pipelining or use linked lists and comparable data structures, and employ sorting or a linked list ranking algorithm as subroutines, the last two algorithms have two distinct advantages. First, these algorithms employ arrays as their basic data structures, and second, they do not use any pipelining, sorting, or linked list ranking algorithms.
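The "standard stack-based sequential algorithm" mentioned in the abstract can be sketched roughly as below. This is only the textbook O(n)-time, O(n)-space baseline, not any of the four parallel algorithms; the function name `match_parentheses` and the choice to raise `ValueError` on malformed input are assumptions of this sketch.

```python
def match_parentheses(s):
    """Return a list `mate` where mate[i] is the index of the parenthesis
    matching s[i]. Raises ValueError if `s` is not a balanced string of
    '(' and ')' characters. Illustrative helper, not from the paper.
    """
    mate = [-1] * len(s)
    stack = []  # indices of '(' that are still unmatched
    for i, ch in enumerate(s):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at index {i}")
            j = stack.pop()
            mate[i], mate[j] = j, i
        else:
            raise ValueError(f"unexpected character {ch!r} at index {i}")
    if stack:
        raise ValueError(f"unmatched '(' at index {stack[-1]}")
    return mate
```

A parallel algorithm is cost-optimal when its processor-time product matches the O(n) work of this sequential scan, which is the sense in which the abstract uses the term.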

10.

Parallel routing algorithms for nonblocking electronic and photonic switching networks

Lu E. Zheng S.Q. 《Parallel and Distributed Systems, IEEE Transactions on》2005,16(8):702-713

We study the connection capacity of a class of rearrangeable nonblocking (RNB) and strictly nonblocking (SNB) networks with/without crosstalk-free constraint, model their routing problems as weak or strong edge-colorings of bipartite graphs, and propose efficient routing algorithms for these networks using parallel processing techniques. This class of networks includes networks constructed from banyan networks by horizontal concatenation of extra stages and/or vertical stacking of multiple planes. We present a parallel algorithm that runs in O(lg² N) time for the RNB networks of complexities ranging from O(N lg N) to O(N^1.5 lg N) crosspoints and parallel algorithms that run in O(min{d* lg N, √N}) time for the SNB networks of O(N^1.5 lg N) crosspoints, using a completely connected multiprocessor system of N processing elements. Our algorithms can be translated into algorithms with an O(lg N lg lg N) slowdown factor for the class of N-processor hypercubic networks, whose structures are no more complex than a single plane in the RNB and SNB networks considered.

11.

Randomized routing, selection, and sorting on the OTIS-mesh

Rajasekaran S. Sahni S. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(9):833-840

The Optical Transpose Interconnection System (OTIS) is a recently proposed model of computing that exploits the special features of both electronic and optical technologies. In this paper we present efficient algorithms for packet routing, sorting, and selection on the OTIS-Mesh. The diameter of an N²-processor OTIS-Mesh is 4√N-3. We present an algorithm for routing any partial permutation in 4√N+o(√N) time. Our selection algorithm runs in time 6√N+o(√N) and our sorting algorithm runs in 8√N+o(√N) time. All these algorithms are randomized and the stated time bounds hold with high probability. Also, the queue size needed for these algorithms is O(1) with high probability.

12.

Parallel algorithms for arbitrary dimensional Euclidean distance transforms with applications on arrays with reconfigurable optical buses.

Yuh-Rau Wang Shi-Jinn Horng 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2004,34(1):517-532

In this paper, we present algorithms for computing the Euclidean distance transform (EDT) of a binary image on the array with reconfigurable optical buses (AROB). First, we develop a parallel algorithm termed Algorithm Expander, which can be implemented in O(1) time on an AROB with N × N^δ processors, where δ = 1/k and k is a constant positive integer. Algorithm Expander is designed to compute a higher dimensional EDT based on the computed lower dimensional EDT. It functions as a general EDT expander for us to expand EDT from a lower dimension to a higher dimension. We then develop parallel algorithms for the two-dimensional EDT (2D_EDT) of a binary image array of size N × N in O(1) time on an AROB with N × N × N^δ processors and for the three-dimensional EDT (3D_EDT) of a binary image of size N × N × N in O(1) time on an AROB with N × N × N × N^δ processors. To the best of our knowledge, all results derived above are the best O(1) time algorithms known. We then extend it to compute the nD_EDT of a binary image of size N^n in O(n) time on an AROB with N^(n+δ) processors. We also apply our parallel EDT algorithms to build Voronoi diagrams and Voronoi polyhedra (polygons), to find all maximal empty spheres and the largest empty sphere, and to compute the medial axis transform. All of these applications can be solved in the same time complexity on an AROB with the same number of processors as needed for solving the EDT problems in the same dimensions.

13.

A delay-optimal quorum-based mutual exclusion algorithm for distributed systems

Guohong Cao Singhal M. 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(12):1256-1268

The performance of a mutual exclusion algorithm is measured by the number of messages exchanged per critical section execution and the delay between successive executions of the critical section. There is a message complexity and synchronization delay trade-off in mutual exclusion algorithms. The Lamport algorithm (1978) and the Ricart-Agrawala algorithm (1981) both have a synchronization delay of T (T is the average message delay), but their message complexity is O(N). Maekawa's algorithm reduces the message complexity to O(√N); however, it increases the synchronization delay to 2T. After Maekawa's algorithm (1985), many quorum-based mutual exclusion algorithms have been proposed to reduce the message complexity or to increase the resiliency to site and communication link failures. Since these algorithms are Maekawa-type algorithms, they also suffer from the long synchronization delay. We propose a delay-optimal quorum-based mutual exclusion algorithm which reduces the synchronization delay to T and still has a low message complexity of O(K) (K is the size of the quorum, which can be as low as log N). A correctness proof and a detailed performance analysis are provided.

14.

A solution approach based on beam search algorithm for disassembly line balancing problem

《Journal of Manufacturing Systems》2016

The disassembly line balancing (DLB) problem is the process of allocating a set of disassembly tasks to an ordered sequence of workstations in such a way that optimizes some performance measures (e.g., cycle time, number of stations). Since DLB problems are NP-hard, many heuristic and meta-heuristic algorithms are applied to cope with the complexity of the DLB problems in order to obtain acceptable solutions in a reasonable amount of time. In this study, a beam search (BS) based approach for the DLB problem is proposed. Minimization of the number of workstations is used as the performance measure. The proposed algorithm is compared with the optimal solutions of well-known real cases and generated test problems. The results indicate that the proposed approach based on BS is a very competitive and promising tool for further research.

15.

Listing all the minimum spanning trees in an undirected graph

《International Journal of Computer Mathematics》2012,89(14):3175-3185

Efficient polynomial time algorithms are well known for the minimum spanning tree problem. However, given an undirected graph with integer edge weights, minimum spanning trees may not be unique. In this article, we present an algorithm that lists all the minimum spanning trees included in the graph. The computational complexity of the algorithm is O(N(mn + n² log n)) in time and O(m) in space, where n, m and N stand for the number of nodes, edges and minimum spanning trees, respectively. Next, we explore some properties of cut-sets, and based on these we construct an improved algorithm, which runs in O(N m log n) time and O(m) space. These algorithms are implemented in C language, and some numerical experiments are conducted for planar as well as complete graphs with random edge weights.

16.

Dynamic Programming on the Word RAM

Pisinger 《Algorithmica》2003,35(2):128-145

Dynamic programming is one of the fundamental techniques for solving optimization problems. In this paper we propose a general framework which can be used to decrease the time and space complexity of these algorithms by a logarithmic factor. The framework is based on word encoding, i.e. representing subsolutions as bits in an integer. In this way word parallelism can be used in the evaluation of the dynamic programming recursion. Using this encoding the subset-sum problem can be solved in O(nb/log b) time and O(b/log b) space, where n is the number of integers given and b is the target sum. The knapsack problem can be solved in O(nm/log m) time and O(m/log m) space, where n is the number of items and m = max{b,z} is the maximum of the capacity b and the optimal solution value z. The problem of finding a path of a given length b in a directed acyclic graph G=(V,E) can be solved in O(|E|b/log b) time and O(|V|b/log b) space. Several other examples are given showing the generality of the achieved technique. Extensive computational experiments are provided to demonstrate that the achieved results are not only of theoretical interest but actually lead to algorithms which are up to two orders of magnitude faster than their predecessors. This is a surprising observation as the increase in speed is larger than the word size of the processor.
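The word-encoding idea for subset-sum can be illustrated with a small sketch. It is not the framework from the paper: it simply uses a Python arbitrary-precision integer as the bit vector, so that one shift-and-OR updates all partial sums at once. The function name `subset_sum_reachable` is an assumption of this example, and the input values are assumed to be non-negative integers.

```python
def subset_sum_reachable(values, target):
    """Return True if some subset of `values` sums exactly to `target`.

    Bit s of `reachable` is 1 exactly when sum s can be formed from the
    values processed so far. Illustrative helper, assuming non-negative
    integer inputs.
    """
    mask = (1 << (target + 1)) - 1  # keep only bits 0..target
    reachable = 1                   # the empty subset gives sum 0
    for v in values:
        # Shifting left by v adds v to every reachable sum in one operation.
        reachable |= (reachable << v) & mask
    return bool((reachable >> target) & 1)
```

On a machine with w-bit words each shift-and-OR over b + 1 bits costs about b/w word operations, which is where a saving of a factor of the word size over the bit-by-bit recursion comes from.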

17.

A comparative study of Multi-Objective Ant Colony Optimization algorithms for the Time and Space Assembly Line Balancing Problem

Juan Rada-Vilela Manuel Chica Óscar Cordón Sergio Damas 《Applied Soft Computing》2013,13(11):4370-4382

Assembly lines for mass manufacturing incrementally build production items by performing tasks on them while flowing between workstations. The configuration of an assembly line consists of assigning tasks to different workstations in order to optimize its operation subject to certain constraints such as the precedence relationships between the tasks. The operation of an assembly line can be optimized by minimizing two conflicting objectives, namely the number of workstations and the physical area these require. This configuration problem is an instance of the TSALBP, which is commonly found in the automotive industry. It is a hard combinatorial optimization problem for which finding the optimum solution might be infeasible or even impossible, but finding a good solution is still of great value to managers configuring the line. We adapt eight different Multi-Objective Ant Colony Optimization (MOACO) algorithms and compare their performance on ten well-known problem instances to solve such a complex problem. Experiments under different modalities show that the commonly used heuristic functions deteriorate the performance of the algorithms in time-limited scenarios due to the added computational cost. Moreover, even neglecting such a cost, the algorithms achieve a better performance without such heuristic functions. The algorithms are ranked according to three multi-objective indicators and the differences between the top-4 are further reviewed using statistical significance tests. Additionally, these four best performing MOACO algorithms are favourably compared with the Infeasibility Driven Evolutionary Algorithm (IDEA) designed specifically for industrial optimization problems.

18.

Reducing the run-time complexity of multiobjective EAs: The NSGA-II and other algorithms

Jensen M.T. 《Evolutionary Computation, IEEE Transactions on》2003,7(5):503-515

The last decade has seen a surge of research activity on multiobjective optimization using evolutionary computation and a number of well performing algorithms have been published. The majority of these algorithms use fitness assignment based on Pareto-domination: Nondominated sorting, dominance counting, or identification of the nondominated solutions. The success of these algorithms indicates that this type of fitness is suitable for multiobjective problems, but so far the use of Pareto-based fitness has led to program run times in O(GMN²), where G is the number of generations, M is the number of objectives, and N is the population size. The N² factor should be reduced if possible, since it leads to long processing times for large population sizes. This paper presents a new and efficient algorithm for nondominated sorting, which can speed up the processing time of some multiobjective evolutionary algorithms (MOEAs) substantially. The new algorithm is incorporated into the nondominated sorting genetic algorithm II (NSGA-II) and reduces the overall run-time complexity of this algorithm to O(GN log^(M-1) N), much faster than the O(GMN²) complexity published by Deb et al. (2002). Experiments demonstrate that the improved version of the algorithm is indeed much faster than the previous one. The paper also points out that multiobjective EAs using fitness based on dominance counting and identification of nondominated solutions can be improved significantly in terms of running time by using efficient algorithms known from computer science instead of inefficient O(MN²) algorithms.
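To make the O(MN²) baseline concrete, the following sketch performs nondominated sorting by comparing every pair of solutions, in the style of the quadratic approach the abstract attributes to earlier work. It is not the faster algorithm proposed in the paper. The names `dominates` and `nondominated_fronts` are assumptions of this example, and all objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (minimization assumed)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated_fronts(points):
    """Partition point indices into successive nondominated fronts using
    all pairwise comparisons: O(M N^2) time for N points, M objectives."""
    n = len(points)
    dominated = [[] for _ in range(n)]  # dominated[i]: indices that i dominates
    count = [0] * n                     # count[i]: how many points dominate i
    for i in range(n):
        for j in range(n):
            if i != j and dominates(points[i], points[j]):
                dominated[i].append(j)
                count[j] += 1
    fronts = []
    current = [i for i in range(n) if count[i] == 0]
    while current:
        fronts.append(current)
        following = []
        for i in current:
            for j in dominated[i]:
                count[j] -= 1           # i has been assigned to a front
                if count[j] == 0:
                    following.append(j)
        current = sorted(following)
    return fronts
```

The double loop over all pairs is the source of the N² factor; each comparison inspects M objectives.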

19.

All-to-all communication with minimum start-up costs in 2D/3D tori and meshes

Young-Joo Suh Yalamanchili S. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(5):442-458

All-to-all communication patterns occur in many important parallel algorithms. This paper presents new algorithms for all-to-all communication patterns (all-to-all broadcast and all-to-all personalized exchange) for wormhole switched 2D/3D torus- and mesh-connected multiprocessors. The algorithms use message combining to minimize message start-ups at the expense of larger message sizes. The unique feature of these algorithms is that they are the first algorithms that we know of that operate in a bottom-up fashion rather than a recursive, top-down manner. For a 2^d×2^d torus or mesh, the algorithms for all-to-all personalized exchange have time complexity of O(2^(3d)). An important property of the algorithms is the O(d) time due to message start-ups, compared with O(2^d) for current algorithms. This is particularly important for modern parallel architectures where the start-up cost of message transmissions still dominates, except for very large block sizes. Finally, the 2D algorithms for all-to-all personalized exchange are extended to O(2^(4d)) algorithms in a 2^d×2^d×2^d 3D torus or mesh. These algorithms also retain the important property of O(d) time due to message start-ups.

20.

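New algorithms for the CMST problem

Wang Kongxun Pan Qijing 《Chinese Journal of Computers》1991,14(9):651-659

This paper proposes two new algorithms for the problem of finding a minimum-cost spanning tree under constraints (CMST for short): given the number of nodes N, the load of each node, the cost of each link, and the capacity of each link, find the minimum-cost tree structure that satisfies certain constraints. Both new algorithms have computational complexity O(N²). Computational results show that the new algorithms produce lower-cost solutions than several existing algorithms, with much lower computational complexity.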