首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 296 毫秒
1.
We consider the problem of dynamically allocating and deallocating local memory resources among multiple users in a parallel or distributed system. Given a group of independent users and a collection of interconnected local memory devices, we want to render the fragmentation of the memory resources irrelevant by allowing any user to allocate space for his or her purposes as long as there is space available anywhere in the system. In effect, we would like it to appear to the users as though they are allocating memory from a single central pool of memory, even though the space is distributed throughout the system. Our goal is to devise an on-line allocation algorithm that minimizes two cost measures: first, the fraction of unused space , which arises due to fragmentation of the memory; second, the slowdown needed by the system to service user requests, which arises due to the contention for access to the memory devices. We solve this distributed dynamic allocation problem in near-optimal fashion by devising an algorithm that allows the memory to be used to 100% of capacity despite the fragmentation and guarantees that service delays will always be within a constant factor of optimal. The algorithm is completely on-line (no foreknowledge of user activity is assumed) and can accommodate any sequence of allocations and deallocations by the users that does not violate global memory bounds. We also consider the distributed dynamic allocation problem in the more restrictive setting where the local memory devices are connected by a low-degree fixed-connection network, rather than being fully interconnected. In this case, communication costs must be more explicitly considered in our allocation algorithms. We give allocation algorithms for butterfly and hypercube networks, and prove necessary and sufficient conditions on the total amount of memory space needed for near-optimal algorithms to exist. Received November 5, 1996; revised December 10, 1997.  相似文献   

2.
An adaptive algorithm, whose step complexity adjusts to the number of active processes, is attractive for distributed systems with a highly-variable number of processes. The cornerstone of many adaptive algorithms is an adaptive mechanism to collect up-to-date information from all participating processes. To date, all known collect algorithms either have non-linear step complexity or they are impractical because of unrealistic memory overhead. This paper presents new randomized collect algorithms with asymptotically optimal O(k) step complexity and linear memory overhead only. In addition we present a new deterministic collect algorithm that beats the best step complexity for previous polynomial-memory algorithms. Partially supported by NSF Grants CCR–0310970 and ANI–0326001. A preliminary version of this paper appeared in the Proceedings of the 18th Annual Conference on Distributed Computing (DISC) 2004 [10].  相似文献   

3.
The 3-domatic number problem asks whether a given graph can be partitioned into three dominating sets. We prove that this problem can be solved by a deterministic algorithm in time n2.695 (up to polynomial factors) and in polynomial space. This result improves the previous bound of n2.8805, which is due to Björklund and Husfeldt. To prove our result, we combine an algorithm by Fomin et al. with Yamamoto's algorithm for the satisfiability problem. In addition, we show that the 3-domatic number problem can be solved for graphs G with bounded maximum degree Δ(G) by a randomized polynomial-space algorithm, whose running time is better than the previous bound due to Riege and Rothe whenever Δ(G)?5. Our new randomized algorithm employs Schöning's approach to constraint satisfaction problems.  相似文献   

4.
We present an algorithm that takes I/Os (sort(N)=Θ((N/(DB))log  M/B (N/B)) is the number of I/Os it takes to sort N data items) to compute a tree decomposition of width at most k, for any graph G of treewidth at most k and size N, where k is a constant. Given such a tree decomposition, we use a dynamic programming framework to solve a wide variety of problems on G in I/Os, including the single-source shortest path problem and a number of problems that are NP-hard on general graphs. The tree decomposition can also be used to obtain an optimal separator decomposition of G. We use such a decomposition to perform depth-first search in G in  I/Os. As important tools that are used in the tree decomposition algorithm, we introduce flippable DAGs and present an algorithm that computes a perfect elimination ordering of a k-tree in I/Os. The second contribution of our paper, which is of independent interest, is a general and simple framework for obtaining I/O-efficient algorithms for a number of graph problems that can be solved using greedy algorithms in internal memory. We apply this framework in order to obtain an improved algorithm for finding a maximal matching and the first deterministic I/O-efficient algorithm for finding a maximal independent set of an arbitrary graph. Both algorithms take I/Os. The maximal matching algorithm is used in the tree decomposition algorithm. An abstract of this paper was presented at the 12th Annual ACM-SIAM Symposium on Discrete Algorithms, Proceedings, pp. 89–90, 2001. Research of A. Maheshwari supported by NSERC. Part of this work was done while the second author was a Ph.D. student at the School of Computer Science of Carleton University.  相似文献   

5.
The block-cyclic data distribution is commonly used to organize array elements over the processors of a coarse-grained distributed memory parallel computer. In many scientific applications, the data layout must be reorganized at run-time in order to enhance locality and reduce remote memory access overheads. In this paper we present a general framework for developing array redistribution algorithms. Using this framework, we have developed efficient algorithms that redistribute an array from one block-cyclic layout to another. Block-cyclic redistribution consists of index set computation , wherein the destination locations for individual data blocks are calculated, and data communication , wherein these blocks are exchanged between processors. The framework treats both these operations in a uniform and integrated way. We have developed efficient and distributed algorithms for index set computation that do not require any interprocessor communication. To perform data communication in a conflict-free manner, we have developed direct indirect and hybrid algorithms. In the direct algorithm, a data block is transferred directly to its destination processor. In an indirect algorithm, data blocks are moved from source to destination processors through intermediate relay processors. The hybrid algorithm is a combination of the direct and indirect algorithms. Our framework is based on a generalized circulant matrix formalism of the redistribution problem and a general purpose distributed memory model of the parallel machine. Our algorithms sustain excellent performance over a wide range of problem and machine parameters. We have implemented our algorithms using MPI, to allow for easy portability across different HPC platforms. Experimental results on the IBM SP-2 and the Cray T3D show superior performance over previous approaches. When the block size of the cyclic data layout changes by a factor of K , the redistribution can be performed in O( log K) communication steps. This is true even when K is a prime number. In contrast, previous approaches take O(K) communication steps for redistribution. Our framework can be used for developing scalable redistribution libraries, for efficiently implementing parallelizing compiler directives, and for developing parallel algorithms for various applications. Redistribution algorithms are especially useful in signal processing applications, where the data access patterns change significantly between computational phases. They are also necessary in linear algebra programs, to perform matrix transpose operations. Received June 1, 1997; revised March 10, 1998.  相似文献   

6.
An important optimization problem in the design of cellular networks is to assign sets of frequencies to transmitters to avoid unacceptable interference. A cellular network is generally modeled as a subgraph of the infinite triangular lattice. The distributed frequency assignment problem can be abstracted as a multicoloring problem on a weighted hexagonal graph, where the weight vector represents the number of calls to be assigned at vertices. In this paper we present a 2-local distributed algorithm for multicoloring triangle-free hexagonal graphs using only the local clique numbers ω1(v) and ω2(v) at each vertex v of the given hexagonal graph, which can be computed from local information available at the vertex. We prove that the algorithm uses no more than colors for any triangle-free hexagonal graph G, without explicitly computing the global clique number ω(G). Hence the competitive ratio of the algorithm is 5/4.  相似文献   

7.
We present a new parallel algorithm for computing a maximum cardinality matching in a bipartite graph suitable for distributed memory computers.The presented algorithm is based on the Push-Relabel algorithm which is known to be one of the fastest algorithms for the bipartite matching problem. Previous attempts at developing parallel implementations of it have focused on shared memory computers using only a limited number of processors.We first present a straightforward adaptation of these shared memory algorithms to distributed memory computers. However, this is not a viable approach as it requires too much communication. We then develop our new algorithm by modifying the previous approach through a sequence of steps with the main goal being to reduce the amount of communication and to increase load balance. The first goal is achieved by changing the algorithm so that many push and relabel operations can be performed locally between communication rounds and also by selecting augmenting paths that cross processor boundaries infrequently. To achieve good load balance, we limit the speed at which global relabelings traverse the graph. In several experiments on a large number of instances, we study weak and strong scalability of our algorithm using up to 128 processors.The algorithm can also be used to find ?-approximate matchings quickly.  相似文献   

8.
Let G=(V, E) be a graph with vertex set V of size n and edge set E of size m. A vertex vV is called a hinge vertex if there exist two vertices in V\{v} such that their distance becomes longer when v is removed. In this paper, we present a distributed algorithm that finds all hinge vertices on an arbitrary graph. The proposed algorithm works for named static asynchronous networks and achieves O(n 2) time complexity and O(m) message complexity. In particular, the total messages exchanged during the algorithm are at most 2m(log n+nlog n+1) bits.  相似文献   

9.
In this paper we propose dynamic algorithms for maintaining a breadth-first search tree from a given source vertex of a directed graph G in either an incremental or a decremental setting. During a sequence of q edge insertions or a sequence of q edge deletions the total time required is O(m·min{q,n}), where n is the number of vertices of G, and m is the final number of edges of G in the case of insertions or the initial number of edges of G in the case of deletions. This gives O(n) amortized time for each operation if the sequence has length Ω(m). Our algorithms require O(n+m) space. These are the first results in the literature concerning the dynamic maintenance of a breadth-first search tree for directed graphs. As a straightforward application of such algorithms we can maintain a shortest path tree for a directed graph in the case of unit edge weights within the same time bounds. In this case distance queries can be answered in constant time, while shortest path queries can be answered in time linear in the length of the retrieved path.  相似文献   

10.
We show that the exact computation of a minimum or a maximum cut of a given graph G is out of reach for any one-pass streaming algorithm, that is, for any algorithm that runs over the input stream of G's edges only once and has a working memory of o(n2) bits. This holds even if randomization is allowed.  相似文献   

11.
An acyclic coloring of a graph G is a coloring of its vertices such that: (i) no two neighbors in G are assigned the same color and (ii) no bicolored cycle can exist in G. The acyclic chromatic number of G is the least number of colors necessary to acyclically color G. In this paper, we show that any graph of maximum degree 5 has acyclic chromatic number at most 9, and we give a linear time algorithm that achieves this bound.  相似文献   

12.
In this paper we consider the problem of on-line graph coloring. In an instance of on-line graph coloring, the nodes are presented one at a time. As each node is presented, its edges to previously presented nodes are also given. Each node must be assigned a color, different from the colors of its neighbors, before the next node is given. LetA(G) be the number of colors used by algorithmA on a graphG and letx(G) be the chromatic number ofG. The performance ratio of an on-line graph coloring algorithm for a class of graphsC is maxG C(A(G)/(G)). We consider the class ofd-inductive graphs. A graphG isd-inductive if the nodes ofG can be numbered so that each node has at mostd edges to higher-numbered nodes. In particular, planar graphs are 5-inductive, and chordal graphs arex(G)-inductive. First Fit is the algorithm that assigns each node the lowest-numbered color possible. We show that ifG isd-inductive, then First Fit usesO(d logn) colors onG. This yields an upper bound ofo(logn) on the performance ratio of First Fit on chordal and planar graphs. First Fit does as well as any on-line algorithm ford-inductive graphs: we show that, for anyd and any on-line graph coloring algorithmA, there is ad-inductive graph that forcesA to use (d logn) colors to colorG. We also examine on-line graph coloring with lookahead. An algorithm is on-line with lookaheadl, if it must color nodei after examining only the firstl+i nodes. We show that, forl/logn, the lower bound ofd logn colors still holds.This research was supported by an IBM Graduate Fellowship.  相似文献   

13.
Let (G) denote the independence number of a graphG, that is the maximum number of pairwise independent vertices inG. We present a parallel algorithm that computes in a planar graphG = (V, E), an independent set such that ¦I¦ (G)/2. The algorithm runs in timeOlog2 n) and requires a linear number of processors. This is achieved by denning a new set of reductions that can be executed locally and simultaneously; furthermore, it is shown that a constant fraction of the vertices in the graph are reducible. This is the best known approximation scheme when the number of processors available is linear; parallel implementation of known sequential algorithms requires many more processors.Joseph Naor was supported by Contract ONR N00014-88-K-0166. Most of this work was done while he was a post-doctoral fellow at the Department of Computer Science, University of Southern California, Los Angeles, CA 90089-0782, USA.  相似文献   

14.
《国际计算机数学杂志》2012,89(8):1680-1691
Let G be a graph with vertex set V(G). Let n, k, d be non-negative integers such that n+2k+d≤|V(G)|?2 and |V(G)|?n?d are even. A matching which saturates exactly |V(G)|?d vertices is called a defect-d matching of G. If when deleting any n vertices the remaining subgraph contains a matching of k edges and every k-matching can be extended to a defect-d matching, then G is said to be an (n, k, d)-graph. We present an algorithm to determine (0, 1, d)-graphs with d constraints. Moreover, we solve the problem of augmenting a bipartite graph G=(B, W) to be a (0, 1, d)-graph by adding fewest edges, where d=∥B|?|W∥. The latter problem is applicable to the job assignment problem, where the number of jobs does not equal the number of persons.  相似文献   

15.
Distributed averaging deals with a network of n > 1 agents and the constraint that each agent is able to communicate only with its neighbors. The purpose of the distributed averaging problem is to devise a protocol which will enable all n agents to asymptotically determine in a decentralized manner, the average of the initial values of their scalar agreement variables. Most distributed averaging protocols involve a linear iteration which depends only on the current estimates of the average. Building on the idea proposed in Muthukrishnan, Ghosh, and Schultz (1998), this paper investigates an augmented linear iteration for fast distributed averaging in which local memory is exploited. A thorough characterization of the behavior of the augmented system is obtained under appropriate assumptions. It is shown that the augmented linear iteration can solve the distributed averaging problem faster than the original linear iteration, but the adjustable parameter must be chosen carefully. The optimal choice of the parameter and the corresponding fastest rate of convergence are also provided in closed form.  相似文献   

16.
Andrews  Bender  Zhang 《Algorithmica》2008,32(2):277-301
Abstract. Processor speed and memory capacity are increasing several times faster than disk speed. This disparity suggests that disk I/ O performance could become an important bottleneck. Methods are needed for using disks more efficiently. Past analysis of disk scheduling algorithms has largely been experimental and little attempt has been made to develop algorithms with provable performance guarantees. We consider the following disk scheduling problem. Given a set of requests on a computer disk and a convex reachability function that determines how fast the disk head travels between tracks, our goal is to schedule the disk head so that it services all the requests in the shortest time possible. We present a 3/2 -approximation algorithm (with a constant additive term). For the special case in which the reachability function is linear we present an optimal polynomial-time solution. The disk scheduling problem is related to the special case of the Asymmetric Traveling Salesman Problem with the triangle inequality (ATSP-Δ ) in which all distances are either 0 or some constant α . We show how to find the optimal tour in polynomial time and describe how this gives another approximation algorithm for the disk scheduling problem. Finally we consider the on-line version of the problem in which uniformly distributed requests arrive over time. We present an algorithm related to the above ATSP-Δ .  相似文献   

17.
We present a distributed algorithm that constructs an O(log n)-approximate minimum spanning tree (MST) in any arbitrary network. This algorithm runs in time Õ(D(G) + L(G, w)) where L(G, w) is a parameter called the local shortest path diameter and D(G) is the (unweighted) diameter of the graph. Our algorithm is existentially optimal (up to polylogarithmic factors), i.e., there exist graphs which need Ω(D(G) + L(G, w)) time to compute an H-approximation to the MST for any $H\,\in\,[1, \Theta({\rm log} n)]We present a distributed algorithm that constructs an O(log n)-approximate minimum spanning tree (MST) in any arbitrary network. This algorithm runs in time ?(D(G) + L(G, w)) where L(G, w) is a parameter called the local shortest path diameter and D(G) is the (unweighted) diameter of the graph. Our algorithm is existentially optimal (up to polylogarithmic factors), i.e., there exist graphs which need Ω(D(G) + L(G, w)) time to compute an H-approximation to the MST for any . Our result also shows that there can be a significant time gap between exact and approximate MST computation: there exists graphs in which the running time of our approximation algorithm is exponentially faster than the time-optimal distributed algorithm that computes the MST. Finally, we show that our algorithm can be used to find an approximate MST in wireless networks and in random weighted networks in almost optimal ?(D(G)) time.  相似文献   

18.
Galluccio  Proietti 《Algorithmica》2008,36(4):361-374
Abstract. Given a 2-edge-connected, real weighted graph G with n vertices and m edges, the 2-edge-connectivity augmentation problem is that of finding a minimum weight set of edges of G to be added to a spanning subgraph H of G to make it 2-edge-connected. While the general problem is NP-hard and 2 -approximable, in this paper we prove that it becomes polynomial time solvable if H is a depth-first search tree of G . More precisely, we provide an efficient algorithm for solving this special case which runs in O (M · α(M,n)) time, where α is the classic inverse of Ackermann's function and M=m · α(m,n) . This algorithm has two main consequences: first, it provides a faster 2 -approximation algorithm for the general 2 -edge-connectivity augmentation problem; second, it solves in O (m · α(m,n)) time the problem of restoring, by means of a minimum weight set of replacement edges, the 2 -edge-connectivity of a 2-edge-connected communication network undergoing a link failure.  相似文献   

19.
Z. -Z. Chen  X. He 《Algorithmica》1997,19(3):354-368
Given a graph G=(V,E), the well-known spanning forest problem of G can be viewed as the problem of finding a maximal subset F of edges in G such that the subgraph induced by F is acyclic. Although this problem has well-known efficient NC algorithms, its vertex counterpart, the problem of finding a maximal subset U of vertices in G such that the subgraph induced by U is acyclic, has not been shown to be in NC (or even in RNC) and is not believed to be parallelizable in general. In this paper we present NC algorithms for solving the latter problem for two special cases. First, we show that, for a planar graph with n vertices, the problem can be solved in time with O(n) processors on an EREW PRAM. Second, we show that the problem is solvable in NC if the input graph G has only vertex-induced paths of length polylogarithmic in the number of vertices of G. As a consequence of this result, we show that certain natural extensions of the well-studied maximal independent set problem remain solvable in NC. Moreover, we show that, for a constant-degree graph with n vertices, the problem can be solved in time with O(n 2 ) processors on an EREW PRAM. Received July 3, 1995; revised April 1, 1996.  相似文献   

20.
We study the on-line caching problem in a restricted cache where each memory item can be placed in only a restricted subset of cache locations. Examples of restricted caches in practice include victim caches, assist caches, and skew caches. To the best of our knowledge, all previous on-line caching studies have considered on-line caching in identical or fully-associative caches where every memory item can be placed in any cache location.In this paper, we focus on companion caches, a simple restricted cache that includes victim caches and assist caches as special cases. Our results show that restricted caches are significantly more complex than identical caches. For example, we show that the commonly studied Least Recently Used algorithm is not competitive unless cache reorganization is allowed while the performance of the First In First Out algorithm is competitive but not optimal. We also present two near optimal algorithms for this problem as well as lower bound arguments.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号